NC Bench
Run: 2026-04-25-01KQ2E6TDBWCDE91SKTQB0R441
Run Details
Model:
DeepSeek V4 Flash (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 30% dialogue
Duration:
11.8s
Cost:
$0.000231
Provider:
DeepSeek
Evaluation summary
Score | Evaluator
81.27% | Matches word count
0.00% | Dialogue to Total Word Ratio
40.6364% | Overall
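The overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so the sketch below is an assumption; `evaluator_scores` and `overall` are illustrative names, not part of NC Bench.

```python
from statistics import mean

# Evaluator scores as displayed in the summary (rounded to two decimals).
evaluator_scores = {
    "Matches word count": 81.27,
    "Dialogue to Total Word Ratio": 0.00,
}

# Assumption: the overall score is the plain average of the evaluator scores.
overall = mean(list(evaluator_scores.values()))

print(f"Overall (assumed mean): {overall:.3f}%")
```

The result differs from the reported 40.6364% only in the third decimal place, which would be expected if the benchmark averages unrounded scores.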
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The air in the back of the van smelled like stale cheese puffs and failure. Bill slumped against the stack of inflatable pirate swords, his face illuminated by the glow of a tablet showing a Pinterest board titled “Rainbow Unicorn Extravaganza.” “Look at this,” he whispered, voice crackling with awe. “They have a bouncy castle shaped like a birthday cake. A cake.” Molly didn’t look up from the blueprints. “Focus. The party starts in three hours. We have one window of opportunity during the ‘Silly Sing-Along.’ The parents will be doing hand motions. They won’t see a thing.” “But the bouncy cake,” Bill insisted. “Is not our target. Our target is the goody bags. The gift table. The piñata once they crack it open. We take the loot, we vanish. Clean. No glitter on our shoes.” Bill sighed, dropping the tablet. “You’re no fun. I wanted to at least eat one of the cupcakes. They’re themed as unicorn poop, you know. Rainbow sprinkles.” “You will eat nothing. You will wear the janitor disguise. You’ll push a mop bucket. If anyone asks, you’re there to deal with a ‘pudding incident’ we stage.” “Pretend I’m scrubbing up a pudding disaster. While you—what? Waddle in as the birthday clown?” Molly smiled, slow and dangerous. “I’m the face painter. Certified. I have a license and everything.” “You forged a face painting license?” “I earned it. Took a weekend course in Akron. It’s a legitimate career path, Bill. You’d be surprised how much trust a butterfly on a cheek buys you.” Bill scratched his stubble. “So I mop, you paint, and we both grab presents when the ‘Silly Sing-Along’ hits the chorus of ‘I’m a Little Teapot’?” “Exactly. The song is short. We have forty-five seconds. I’ll snatch the gift bags. You roll the piñata into the janitor closet. We meet at the back exit.” “What about the gift table? The big wrapped ones? The parents spent real money on those.” Molly tapped the blueprints. “Too heavy. Too traceable. Unless it’s an iPad. 
If you see an iPad-shaped box, you pocket it. Otherwise, stick to the goody bags. They’re filled with plastic crap and candy, but fifty bags at twenty dollars a pop? That’s a grand.” Bill nodded slowly. “And the unicorn poop cupcakes?” “We’ll buy a box at the grocery store after. I promise.” “Deal.” He reached out a hand. Molly shook it, her grip firm. “Now remember,” she said, “the party is at the community center. Room 4B. The theme is unicorns. There will be tears. There will be sugar highs. We will use both to our advantage.” Bill grinned. “And if we get caught?” “We don’t.” “But if we do?” Molly picked up a tube of blue face paint. “Then I’ll say I’m teaching you a lesson about stranger danger. Very convincing. I even have a pamphlet.” Bill laughed, and the van smelled a little less like cheese puffs.
created_at
"2026-04-25T13:45:31.311Z"
4/25/2026, 1:45:31 PM
model
"deepseek/deepseek-v4-flash-20260423"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
593
moderation_latency
(null)
generation_time
11758
tokens_prompt
77
tokens_completion
874
native_tokens_prompt
67
native_tokens_completion
793
native_tokens_completion_images
(null)
native_tokens_reasoning
77
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00023142
router
(null)
provider_responses
0
endpoint_id
"722e0746-2034-4d6e-a08f-e6271849ee9e"
id
"7a10c877-3713-4218-bb7c-553fb1557307"
is_byok
false
latency
270
model_permaslug
"deepseek/deepseek-v4-flash-20260423"
provider_name
"DeepSeek"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777124731-7XGhl6Do9WeU4oh58cKN"
session_id
(null)
api_type
"completions"
id
"gen-1777124731-uw2MRMj0t1x1stliUiqB"
upstream_id
"7a10c877-3713-4218-bb7c-553fb1557307"
total_cost
0.00023142
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepSeek"
response_cache_source_id
(null)
Evaluation details
Result | Evaluator | Details | Meta Data
81.27% | Matches word count | n/a | neededClean: false, words: 488
0.00% | Dialogue to Total Word Ratio | Ratio: 77.64%, Deviation: 47.64% | neededClean: false, wordsTotal: 492, wordsDialogue: 382
40.6364% | Overall
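The ratio and deviation reported for the dialogue evaluator can be reproduced from the metadata counts. The sketch below assumes the deviation is the absolute difference from the 30% target in the scenario, and that dialogue is counted as words inside curly double quotes. The helper names (`dialogue_ratio`, `deviation`, `count_dialogue_words`) are hypothetical, and the benchmark's actual tokenization may differ.

```python
import re

def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Percentage of all words that are dialogue."""
    return 100.0 * words_dialogue / words_total

def deviation(ratio_pct: float, target_pct: float = 30.0) -> float:
    """Absolute distance from the target ratio, in percentage points (assumed rule)."""
    return abs(ratio_pct - target_pct)

def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words), splitting on whitespace.

    Assumption: dialogue is any span enclosed in curly double quotes.
    """
    quoted_spans = re.findall(r"“([^”]*)”", text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    return dialogue_words, len(text.split())

# Counts taken from the evaluator metadata for this run.
ratio = dialogue_ratio(words_dialogue=382, words_total=492)
dev = deviation(ratio)
print(f"Ratio: {ratio:.2f}%, Deviation: {dev:.2f}%")

# Small usage example on a fragment of the response.
sample = "“Look at this,” he whispered. “A cake.”"
print(count_dialogue_words(sample))
```

With the metadata counts this prints a ratio of 77.64% and a deviation of 47.64%, matching the Details column. The response is far more dialogue-heavy than the 30% requested, which explains the 0.00% score for this evaluator.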