NC Bench
Run: 2026-02-16-01KHM5WW5DCHEYWVVWFG0W78SQ
Run Details
Model: DeepSeek V3 (2024-12-26)
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 12.9s
Cost: $0.000321
Provider: DeepInfra
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly leaned against the hood of their beat-up van, arms crossed. "Okay, so we hit the party at 3 PM sharp. Everyone will be distracted by the clown." Bill squinted at the crumpled map in his hands. "A clown? You didn’t mention a clown, Moll. I hate clowns." "Relax, it’s just a guy in makeup. He’ll be juggling or something. We grab the presents, toss ’em in the van, and vanish. Easy." "Easy? What if the kids see us? You know how kids are—they scream. And parents *notice* screams." "So we dress up. Wear a mouse costume or something. Blend in." Bill stared at her. "A mouse costume? I’m not wearing a mouse costume. What’s next, you gonna ask me to wear ears and squeak?" Molly smirked. "Actually, squeaking might help. Always good to commit to the role." Bill groaned. "Why’d I agree to this? Stealing from a kids’ party feels... wrong." "Wrong? Bill, those presents are wrapped in shiny paper just screaming to be stolen. It’s not wrong—it’s efficient." "Yeah, but what if we get caught? They’ll throw us in jail, and I’ll have to share a cell with someone who *likes* clowns." Molly patted his shoulder. "Stay focused, Bill. Think of the loot. Think of the glory. Think of... not getting caught." Bill sighed. "Fine. But I’m not squeaking."
created_at
"2026-02-16T21:32:39.573659+00:00"
2/16/2026, 9:32:39 PM
model
"deepseek/deepseek-chat-v3"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
269
moderation_latency
(null)
generation_time
12847
tokens_prompt
77
tokens_completion
343
native_tokens_prompt
66
native_tokens_completion
337
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00032105
router
(null)
provider_responses
0
id
"chatcmpl-RuiWmgxAA81feuo2ZJhbBjUo"
status
200
is_byok
false
latency
39
endpoint_id
"5294d55f-9012-496b-8f22-8cc919432dcd"
provider_name
"DeepInfra"
model_permaslug
"deepseek/deepseek-chat-v3"
api_type
"completions"
id
"gen-1771277545-ObOaGtvHLmKCd4pc5sAY"
upstream_id
"chatcmpl-RuiWmgxAA81feuo2ZJhbBjUo"
total_cost
0.00032105
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepInfra"
Evaluation details
Result | Evaluator | Details | Meta Data
3.6234% | Matches word count | n/a | neededClean: false; words: 224
90.4735% | Dialogue to Total Word Ratio | Ratio: 84.38%, Deviation: 5.63% | neededClean: false; wordsTotal: 224; wordsDialogue: 189
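The "Dialogue to Total Word Ratio" details can be reproduced from the reported counts. The following is a minimal sketch, not the benchmark's own evaluator: the function names `ratio_from_counts` and `count_dialogue_words` are hypothetical, whitespace word splitting and straight double quotes as dialogue delimiters are assumptions, and the 0.90 target is taken from the scenario text ("90% dialogue"). It does not attempt to reproduce the percentages in the Result column, since the page does not show how those are derived.

```python
import re


def ratio_from_counts(words_dialogue: int, words_total: int, target: float = 0.90):
    """Return (ratio, deviation) as fractions of 1.

    Assumption: deviation is the absolute gap between the target share
    of dialogue and the observed share.
    """
    if words_total == 0:
        return 0.0, target
    ratio = words_dialogue / words_total
    return ratio, abs(target - ratio)


def count_dialogue_words(text: str):
    """Return (words_dialogue, words_total) for a scene.

    Assumptions: words are whitespace-separated tokens, and dialogue is
    any text between a pair of straight double quotes.
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    return words_dialogue, words_total


if __name__ == "__main__":
    # Counts reported in the evaluation meta data for this run.
    ratio, deviation = ratio_from_counts(189, 224)
    print(f"ratio={ratio:.4%} deviation={deviation:.4%}")
```

With the reported counts, 189 / 224 = 0.84375, which is consistent with the displayed ratio of 84.38%, and the gap to the 90% target is 5.625 percentage points, consistent with the displayed deviation of 5.63%. Running `count_dialogue_words` on the model response may not return exactly 189 and 224, because the evaluator's tokenization and quote handling are not documented on this page.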
47.0485%