NC Bench
Run: 2026-02-16-01KHM5WH17HYCWJ0R3PWR2MMZ6
Run Details
Model:
DeepSeek V3.2
Test:
Dialogue tags
Scenario:
Write 200 words with 50% dialogue
Duration:
20.2s
Cost:
$0.000132
Provider:
Novita
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The warehouse loft smelled of dust and desperation. Molly traced a finger over the blueprints spread on the crate between them. "The target is a seven-year-old's birthday party at 'Jumping Jupiter,' a soft-play arena. Security is a single, easily distracted teenager named Kevin." Bill polished a butter knife he’d fashioned into a shim. "A soft heist. I like it. In and out during the chaos of the cake ceremony." "Precisely. The presents are stored in a locked supply closet adjacent to the 'Toddler Tornado' zone. Your job is to create a diversion." Bill grinned. "I was thinking a sudden, tragic piñata malfunction. A cascade of candy, children screaming with joy. It’s beautiful chaos." Molly nodded, tapping the closet on the plan. "While Kevin is buried in four-year-olds, I pop the lock. We swap the gift bags for these identical ones filled with weighted newspaper." "And the loot?" "Gym socks, action figures, and at least three copies of the same educational DVD. We fence it all online. The parents get the insurance, the kids get a lesson in commercial disappointment, and we get a modest profit." She looked at him. "Any moral qualms?" Bill sheathed his butter knife. "Molly, stealing from children isn't a crime. It's advanced parenting."
created_at
"2026-02-16T21:32:35.147746+00:00"
2/16/2026, 9:32:35 PM
model
"deepseek/deepseek-v3.2-20251201"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1162
moderation_latency
(null)
generation_time
10122
tokens_prompt
77
tokens_completion
328
native_tokens_prompt
67
native_tokens_completion
284
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.000131623
router
(null)
provider_responses
0
status
(null)
is_byok
false
latency
10077
endpoint_id
"b39a588a-b338-43ba-b046-226b9e24afa8"
provider_name
"SiliconFlow"
model_permaslug
"deepseek/deepseek-v3.2-20251201"
1
id
"1a5749e8ef1f859bdea4695c1e0f7a4f"
status
200
is_byok
false
latency
1162
endpoint_id
"8a5fb6d3-ef71-4958-af54-269ffa3a3c6e"
provider_name
"Novita"
model_permaslug
"deepseek/deepseek-v3.2-20251201"
api_type
"completions"
id
"gen-1771277534-8A4t0z5ZpTQp0WmeP4NM"
upstream_id
"1a5749e8ef1f859bdea4695c1e0f7a4f"
total_cost
0.000131623
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Novita"
Evaluation details
Evaluator: Matches word count
Result: 81.2727%
Details: n/a
Meta Data: neededClean = false, words = 212

Evaluator: Dialogue to Total Word Ratio
Result: 0.0000%
Details: Ratio: 76.42%, Deviation: 26.42%
Meta Data: neededClean = false, wordsTotal = 212, wordsDialogue = 162
40.6364%
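The ratio and deviation reported by the "Dialogue to Total Word Ratio" evaluator can be reproduced from the word counts above. The sketch below is an illustration only: the evaluator's real implementation is not shown on this page, so the function names (`dialogue_ratio`, `ratio_report`), the whitespace-based word splitting, and the rule that dialogue is whatever sits between straight double quotes are all assumptions.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Count words inside straight double quotes versus all words.

    Assumption: a "word" is any whitespace-separated token, and dialogue
    is any text between a pair of straight double quotes.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def ratio_report(words_dialogue: int, words_total: int,
                 target: float = 0.5) -> tuple[float, float]:
    """Return (ratio %, absolute deviation from target %), rounded to 2 places."""
    ratio = words_dialogue / words_total
    return round(ratio * 100, 2), round(abs(ratio - target) * 100, 2)


# Figures taken from the evaluation metadata: wordsDialogue=162, wordsTotal=212.
print(ratio_report(162, 212))  # (76.42, 26.42)

# A small hand-checkable sample: 6 quoted words out of 8 total.
print(dialogue_ratio('Bill grinned. "A soft heist. I like it."'))  # (6, 8, 0.75)
```

With the reported counts, 162 / 212 gives 76.42%, which is 26.42 percentage points above the 50% target in the scenario. The final figure of 40.6364% appears to be consistent with the mean of the two evaluator results (81.2727% and 0.0000%), although the page does not state how the overall score is combined.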