NC Bench
Run: 2026-04-25-01KQ2HTY9BVZ8WXPEMG3W0PD1F
Run Details
Model: DeepSeek V4 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 12.6s
Cost: $0.000242
Provider: DeepSeek
Evaluation summary
Score    Evaluator
0.49%    Matches word count
0.00%    Dialogue to Total Word Ratio
0.2460%  Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the binoculars, peering through the rain-streaked window of the van. “Okay, Bill, let’s run it again. The party starts at two. The birthday boy is a seven-year-old named Kevin. His mother, a woman who once caught a raccoon with her bare hands, will be distracted by the magician at three-fifteen.” Bill, wedged between a broken fog machine and a bag of rubber chickens, sighed. “Molly, we’ve run it twelve times. I sneak in via the inflatable slide, grab the gift table, exit through the bouncy castle’s emergency flap. It’s a solid plan.” “It’s a solid plan for a bank, Bill. This is a birthday party. There are tiny chairs. And juice boxes. And a piñata that we are absolutely not hitting because I don’t need another sugar-induced tantrum from you.” “That was one time. And the candy was delicious.” “You cried for an hour because you got a Dum Dum and I got a Blow Pop.” Molly lowered the binoculars and turned to face him. “The key variable is the clown. I’ve done recon. He’s got a balloon-animal station right next to the presents. He’s also got a squeaky nose, which means he’s always audible. If you can time your entrance with one of his really loud squeaks, you’ll be golden.” Bill nodded gravely. “I’ll need a diversion. Something that draws every child’s attention away from the table for at least forty-five seconds.” Molly smiled slowly. “I’ve already procured a remote-control robot dressed as a giant slice of pizza. It will roll into the party, play the ‘Happy Birthday’ song on a kazoo, and then self-destruct in a cloud of glitter. That’s your window.” “Glitter?” Bill’s eyes widened. “That stuff never comes out. We’ll be leaving a trail of evidence all over the neighborhood.” “We’ll be leaving a trail of joy, Bill. It’s a children’s party. They’ll think it’s part of the show.” She reached into her coat and pulled out a yellowed map. “The present table is here, wedged between a tower of juice boxes and a life-sized cardboard cutout of a cartoon llama. 
The llama is suspicious. I think it has a camera in its eye.” Bill squinted. “That’s just a drawing of a llama, Molly. It’s two-dimensional.” “You don’t know that. The parents these days are savage. They’ve got nanny cams in teddy bears, GPS in party favors. I’m telling you, we need to go in wearing these.” She handed him a pair of oversized sunglasses with a fake mustache attached. “We’re stealing presents from children, not robbing a bank in a 1980s movie.” “Exactly. Children are the hardest marks. They see through everything. Except mustaches. They love mustaches.” Molly checked her watch. “We go in at two-forty-seven. That’s when the magician starts his dove trick. All eyes will be on a pigeon that he’s going to accidentally release into the ceiling fan. It’s chaos, Bill. Pure, beautiful chaos.” Bill looked at the pizza-robot, the map, the mustache glasses. He took a deep breath. “Alright. But if I get glitter in my hair again, you’re the one shaving it off.” “Deal,” Molly said, and cracked open a juice box. “To Kevin’s presents.”
created_at: "2026-04-25T14:48:56.367Z" (4/25/2026, 2:48:56 PM)
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 514
moderation_latency: (null)
generation_time: 12445
tokens_prompt: 77
tokens_completion: 903
native_tokens_prompt: 67
native_tokens_completion: 832
native_tokens_completion_images: (null)
native_tokens_reasoning: 60
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00024234
router: (null)
provider_responses:
  0:
    endpoint_id: "722e0746-2034-4d6e-a08f-e6271849ee9e"
    id: "7398fc67-b105-43b5-a390-ddf80e887f01"
    is_byok: false
    latency: 269
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777128536-pUQn67Wcwd1QtvZTv78X"
session_id: (null)
api_type: "completions"
id: "gen-1777128536-EHqt21j4D7YT4mv5D3bb"
upstream_id: "7398fc67-b105-43b5-a390-ddf80e887f01"
total_cost: 0.00024234
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
Evaluation details
Result   Evaluator                     Details                           Meta Data
0.49%    Matches word count            n/a                               neededClean: false, words: 527
0.00%    Dialogue to Total Word Ratio  Ratio: 81.70%, Deviation: 51.70%  neededClean: false, wordsTotal: 541, wordsDialogue: 442
0.2460%  Overall
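
The reported ratio and deviation follow directly from the word counts above: 442 dialogue words out of 541 total gives 81.70%, which is 51.70 points above the 30% target. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. The function names, the whitespace tokenization, and the rule that only text inside curly quotes counts as dialogue are assumptions for illustration; the benchmark's actual counting method is not shown on this page (the two evaluators report different totals, 527 and 541), and how the deviation maps to the 0.00% score is also not shown.

```python
import re

TARGET_RATIO = 0.30  # the scenario asks for 30% dialogue


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumption: dialogue is any text enclosed in curly quotes, and words
    are whitespace-separated tokens. The real evaluator may differ.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r"“(.*?)”", text, flags=re.DOTALL)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio: float, target: float = TARGET_RATIO) -> float:
    """Absolute distance between the observed ratio and the target."""
    return abs(ratio - target)


# Reproduce the figures reported in the evaluation details.
reported_ratio = 442 / 541
reported_deviation = deviation_from_target(reported_ratio)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {reported_deviation:.2%}")
```

Calling `dialogue_ratio` on the model response would give an independent estimate to compare against the evaluator's counts, though the totals may not match exactly given the unknown tokenization.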