NC Bench
Run: 2026-04-25-01KQ2G77SHHMZGPDZGYMSCZ1G2
Run Details
Model: DeepSeek V4 Pro (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 9m 58s
Cost: $0.020053
Provider: DeepSeek
Evaluation summary
Matches word count: 20.19%
Dialogue to Total Word Ratio: 99.78%
Overall: 59.9860%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The van was a tetanus-riddled time capsule, parked beneath a sycamore whose roots had cracked the sidewalk into a jigsaw puzzle. The engine ticked like a cooling gun barrel. The air inside was thick with the ghosts of fast food and bad decisions. Molly peered through binoculars at the Henderson residence, where a children’s party unfolded with the tactical chaos of a beach invasion. A bouncy castle shaped like a unicorn swayed in the breeze, its horn drooping like a disappointed parent. A clown with the haunted eyes of a war veteran twisted a balloon into a question mark. The gift table, draped in a pirate-themed cloth, held a treasure trove of boxes shimmering with the kind of cellophane that screamed guilt-purchasing. Bill sat in the passenger seat, dissecting a string cheese with surgical precision. His therapy cat, Professor Snuggles, was back at the apartment, but her imagined voice in his head was delivering a stern lecture on moral relativism. He glanced at the pile of presents. “I’ve never stolen from people who still believe in Santa. It feels like a new low.” Molly didn’t lower the binoculars. “Santa is a gateway myth to consumerism. We’re doing them a pedagogical favor.” She had already mapped the party’s rhythms: the magician’s final trick at three, the piñata at three-fifteen, and Uncle Gary’s inevitable bathroom break at three-ten, his walnut-sized bladder a clockwork marvel. The pony ride was a rented disaster, one miniature horse already attempting to eat a balloon. “Now, here’s the plan. We go in as entertainers. I’m a fortune teller. You’re a human statue painted gold.” Bill flinched. “A statue? I have the balance of a newborn giraffe and a fear of sticky fingers. Last time I stood still, my leg fell asleep and I screamed.” Molly ticked points on her fingers. “You’ll stand by the gift table. Kids will be mesmerized. I’ll read palms. During the piñata, you fill the pockets. Puddles is harmless.” Molly snorted. 
“He’s been crying into his balloon animals for an hour.” Bill chewed his cheese. “What if I sneeze? Gold paint and mucus—won’t that compromise the statue illusion?” “You won’t sneeze. I’ve dosed you. Your tear ducts will be statuesque.” Molly tossed a crumpled golden bundle into his lap. The fabric shimmered like a disco ball at an execution. “Put on the leotard. The piñata is a donkey, and in twenty minutes, a blindfolded tyrant will bludgeon it. That’s our cue.” Bill held the leotard, its spandex glinting like a second skin of pure absurdity. A profound weariness settled in his bones, the kind that came from a decade of following Molly’s hypnotic, lunatic plans. He thought of Professor Snuggles’s imagined disapproval, then of the remote-controlled dinosaur he could buy with his share. Resolve, or perhaps just resignation, hardened his features into a mask of tragic acceptance. He began to unbutton his shirt with the solemnity of a man preparing for ritual sacrifice, the van’s cracked window framing the alpaca in the petting zoo as it stared at them with a knowing, judgmental gaze, a single tear rolling down its fuzzy snout.
created_at: "2026-04-25T14:20:42.169Z" (4/25/2026, 2:20:42 PM)
model: "deepseek/deepseek-v4-pro-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1403
moderation_latency: (null)
generation_time: 597638
tokens_prompt: 77
tokens_completion: 21004
native_tokens_prompt: 67
native_tokens_completion: 23016
native_tokens_completion_images: (null)
native_tokens_reasoning: 22282
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.020053065
router: (null)
provider_responses:
  0:
    endpoint_id: "d084170b-dd99-4e53-9ae1-b63881f3cfbf"
    id: "efaeed46-9766-4ef9-8244-1a1817072f8f"
    is_byok: false
    latency: 268
    model_permaslug: "deepseek/deepseek-v4-pro-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777126842-ljdBjA04yLg9OrEjalvd"
session_id: (null)
api_type: "completions"
id: "gen-1777126842-A540fc1XWr3PY32szzG0"
upstream_id: "efaeed46-9766-4ef9-8244-1a1817072f8f"
total_cost: 0.020053065
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
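The timing and token fields above can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not part of NC Bench. It assumes that generation_time is reported in milliseconds and that native_tokens_completion includes the reasoning tokens; neither is stated on the page. The helper name format_duration is hypothetical.

```python
# Values copied from the run metadata above.
generation_time_ms = 597_638
native_tokens_completion = 23_016
native_tokens_reasoning = 22_282


def format_duration(ms: int) -> str:
    """Render milliseconds as 'Xm Ys', rounding to the nearest second."""
    total_seconds = round(ms / 1000)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


# Assumption: reasoning tokens are counted inside native_tokens_completion,
# so the remainder is the visible answer.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share_pct = 100.0 * native_tokens_reasoning / native_tokens_completion
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(format_duration(generation_time_ms))
print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share_pct:.1f}%")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumptions the formatted duration is consistent with the 9m 58s shown in Run Details, and nearly all of the completion budget went to reasoning rather than to the scene itself.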
Evaluation details
Matches word count
Result: 20.19%
Details: n/a
Meta Data: neededClean: false, words: 520

Dialogue to Total Word Ratio
Result: 99.78%
Details: Ratio: 29.79%, Deviation: 0.21%
Meta Data: neededClean: false, wordsTotal: 527, wordsDialogue: 157

Overall: 59.9860%
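The reported ratio and deviation can be reproduced from the wordsDialogue and wordsTotal counts. The following is a minimal sketch, not the evaluator's actual code: the function names are hypothetical, the 30% target is taken from the scenario description, and treating the overall score as an unweighted mean of the two evaluator scores is an assumption that happens to agree with the displayed figure to within rounding. How each evaluator maps its measurement to a percentage score is not shown on the page and is not modeled here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float = 30.0) -> float:
    """Absolute distance, in percentage points, from the target ratio."""
    return abs(ratio_pct - target_pct)


# Counts copied from the evaluator meta data.
ratio = dialogue_ratio(words_dialogue=157, words_total=527)
deviation = deviation_from_target(ratio)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# Assumption: overall = unweighted mean of the two displayed evaluator scores.
overall = (20.19 + 99.78) / 2
print(f"Overall (from rounded scores): {overall:.3f}%")
```

The mean of the rounded scores comes to 59.985%, slightly below the displayed 59.9860%; the gap is plausibly due to the page rounding the individual scores before display.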