NC Bench
Run: 2026-02-18-01KHQ91EGK1ZJ9G0T3F6G0CS17
Run Details
Model:
Claude Sonnet 4.6
Test:
Dialogue tags
Scenario:
Write 500 words with 30% dialogue
Duration:
19.3s
Cost:
$0.010710
Provider:
Anthropic
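The reported cost can be cross-checked against the native token counts recorded in the response metadata (75 prompt tokens, 699 completion tokens). The sketch below assumes prices of $3 per million input tokens and $15 per million output tokens; those rates are an assumption, not something stated on this page, and `estimate_cost` is a hypothetical helper rather than part of any provider API.

```python
# Assumed prices in USD per million tokens (not stated on this page).
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate request cost in USD from token counts and the assumed prices."""
    micro_usd = (prompt_tokens * INPUT_USD_PER_MTOK
                 + completion_tokens * OUTPUT_USD_PER_MTOK)
    return micro_usd / 1_000_000


# Native token counts taken from the run metadata.
cost = estimate_cost(prompt_tokens=75, completion_tokens=699)
print(f"${cost:.6f}")  # $0.010710
```

Under these assumed rates the estimate agrees with the reported $0.010710, which suggests billing was based on the native token counts rather than the normalized ones (77 and 798).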
Evaluation summary
Matches word count:
99.92%
Dialogue to Total Word Ratio:
10.32%
Overall:
55.1195%
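The Overall figure looks consistent with an unweighted mean of the two evaluator scores. The sketch below checks that reading; the averaging rule is an assumption inferred from the numbers shown here, not documented behavior, and the small gap from 55.1195% presumably comes from the per-evaluator scores being displayed rounded.

```python
# Evaluator scores as displayed on this page (rounded to two decimals).
scores = {
    "Matches word count": 99.92,
    "Dialogue to Total Word Ratio": 10.32,
}

# Assumption: the overall score is the plain average of the evaluator scores.
overall = sum(scores.values()) / len(scores)
print(f"{overall:.2f}%")  # 55.12%
```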
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The rain came down in sheets across the parking lot of the Funzone Family Entertainment Center, which was arguably the most aggressively cheerful building Molly had ever planned a crime against. She sat hunched over a hand-drawn map in the passenger seat of Bill's rusting van, squinting at her own chicken-scratch handwriting while Bill ate a gas station sandwich with the enthusiasm of a man who had given up caring about things. "The bouncy castle is the problem," Molly said, tapping her pen against the paper. "It's blocking the side entrance." Bill chewed thoughtfully. "Go through the bouncy castle." "I'm not going through the bouncy castle, Bill." "Why not." "Because I'm thirty-seven years old and I have a bad knee and some shred of dignity left." Bill finished his sandwich and crumpled the wrapper into a ball, tossing it somewhere into the black hole of fast food debris that had colonized the backseat over what appeared to be several geological epochs. He leaned over to study the map, leaving a small mayonnaise thumbprint on the corner that Molly chose not to address. The target was a birthday party for one Sebastian Hornsworth-Finch, who was turning eight years old and whose parents had apparently invited forty-three children, each of whom had arrived carrying a gift bag or wrapped box of considerable size and probable expense. Molly had done reconnaissance. She had stood outside in sensible shoes pretending to look for a lost dog, counting presents as they came through the front door like some kind of depressing department store Santa. "There's at least sixty presents in that side room," she said. "Maybe more." "Sixty," Bill repeated, with the reverent tone of a man hearing poetry. "The party ends at four. The parents will be half-delirious from cake and screaming children. We go in at three-fifty, through the kitchen entrance, load the bags, and walk out the fire door." 
Bill nodded slowly, which was his way of conveying either deep agreement or that he had momentarily stopped paying attention. With Bill, the outcomes were often identical. The issue, which Molly had been carefully not thinking about, was that neither of them had any particular experience stealing from children's parties. Their previous jobs had involved warehouse electronics, a jewelry store that turned out to be mostly costume pieces, and one deeply confusing incident with a taxidermy collection that they never discussed anymore. Children's parties were new territory, which meant unpredictable variables. Children were themselves unpredictable variables. Small, shrieking, cake-smeared variables who moved in sudden directions. "What if a kid sees us," Bill said. "Tell them you're the magician's assistant." "I don't look like a magician's assistant." "You're wearing a cape." "It's a poncho," Bill said, with great dignity. Molly looked at him. She looked at the poncho. She decided that this was not a battle worth fighting, tucked her map into her jacket, and watched the rain blur the neon lights of the Funzone into something almost beautiful.
created_at
"2026-02-18T02:25:24.296558+00:00"
2/18/2026, 2:25:24 AM
model
"anthropic/claude-4.6-sonnet-20260217"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
659
moderation_latency
(null)
generation_time
19250
tokens_prompt
77
tokens_completion
798
native_tokens_prompt
75
native_tokens_completion
699
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.01071
router
(null)
provider_responses
0
id
"msg_01LhEw36m2w92uuihNDcRUQE"
status
200
is_byok
false
latency
659
endpoint_id
"6d408764-7dd7-4626-bb87-a6cc1589bc86"
provider_name
"Anthropic"
model_permaslug
"anthropic/claude-4.6-sonnet-20260217"
api_type
"completions"
id
"gen-1771381504-uu0q3GcsvpPsGGF9kxbx"
upstream_id
"msg_01LhEw36m2w92uuihNDcRUQE"
total_cost
0.01071
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
Evaluation details
Matches word count
Result:
99.92%
Details:
n/a
neededClean:
false
words:
503
Dialogue to Total Word Ratio
Result:
10.32%
Details:
Ratio: 23.26%, Deviation: 6.74%
neededClean:
false
wordsTotal:
503
wordsDialogue:
117
Overall
Result:
55.1195%
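The Ratio and Deviation values follow directly from the word counts: 117 dialogue words out of 503 total, measured against the 30% target in the scenario. A minimal sketch follows, assuming dialogue is counted as whitespace-separated words between straight double quotes. The evaluator's actual tokenization is not shown on this page, and `dialogue_stats` is a hypothetical helper. How the 6.74% deviation maps to the 10.32% score is also not shown here, so that step is not modelled.

```python
import re


def dialogue_stats(text: str, target_ratio: float = 0.30):
    """Return (total_words, dialogue_words, ratio, deviation_from_target)."""
    total = len(text.split())
    # Assumption: dialogue is whatever sits between straight double quotes.
    spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in spans)
    ratio = dialogue / total if total else 0.0
    return total, dialogue, ratio, abs(ratio - target_ratio)


# Small illustrative sample, not the full model response.
sample = 'Bill chewed thoughtfully. "Go through the bouncy castle." Molly sighed.'
print(dialogue_stats(sample))

# Reproduce the reported figures from the counts in the metadata.
reported_ratio = 117 / 503
reported_deviation = abs(reported_ratio - 0.30)
print(f"{reported_ratio:.2%} {reported_deviation:.2%}")  # 23.26% 6.74%
```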