NC Bench
Run: 2026-02-24-01KJ8W3HZXJE2KMQJ3S1WPCDA6
Run Details
Model: Claude Sonnet 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 3m 40s
Cost: $0.305631
Provider: Anthropic
Evaluation summary
Evaluator: Score
Matches word count: 90.48%
Dialogue to Total Word Ratio: 27.69%
Overall: 59.0890%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of the rusting van and smoothed its creases with the reverence of a general preparing for battle. The map had been sketched on the back of a grocery receipt, meaning the Hendersons' backyard intelligence was sharing space with a reminder to buy oat milk. Still, it was the most detailed reconnaissance she had ever conducted on a six-year-old's birthday party, and she felt rightly proud of it.
Bill leaned over her shoulder, squinting at the diagram. He was a large man who had spent twenty years in the criminal underworld, surviving bank vaults, alarmed jewellery stores, and one particularly harrowing incident involving a Doberman named Gerald. None of it had prepared him for this.
"Is that a bouncy castle?" he asked.
"Obstacle, not a feature. We go around it."
"What about the clown?"
"What clown?"
"The one on the left, next to the paddling pool."
Molly squinted. "That's a tree, Bill."
"Are you absolutely certain?"
"Yes."
He pointed at a cluster of small circles near the back fence. "Are those the children?"
"Those are the children."
"How many?"
"Seven confirmed. Nine if the cousins came."
Bill straightened and pushed his woolly hat up his forehead, the universal signal that he was thinking hard and finding it deeply disagreeable.
Beyond the van, the street was suburban and unremarkable, lined with oak trees and a pervasive atmosphere of people who owned bread makers. Number forty-two, the Hendersons' house, sat at the end of the cul-de-sac with pink balloons at the gate and the distant sound of children screaming in a way that could have been tremendous joy or genuine medical concern.
"There'll be parents watching," Bill said.
"There are always parents."
"They'll be all over us."
"They will be standing by the snack table discussing kitchen renovations and school performance. They will not notice a thing. They never do."
Bill considered this and found it troubling chiefly because it was entirely true. He had grown up at parties exactly like this one, invisible to every adult in the room while committing small but significant acts of furniture vandalism.
The plan was simple enough. Molly would slip through the back gate during pass-the-parcel, counting on the resulting chaos as cover. Bill would follow with the duffel bag on her signal.
"What's the signal again?" Bill asked.
"Thumbs up."
"Right." He paused. "And if the signal goes wrong?"
"It's a thumbs up, Bill. Nothing about it can go wrong."
He seemed to find this only partially reassuring. A slow whistle escaped him as Molly reviewed her notes. Fourteen presents confirmed from social media, sixteen possible if grandparents had been individually generous. Somewhere in that gleaming pile was a first-edition collectible action figure worth four thousand pounds, purchased by the Hendersons quite accidentally at a car boot sale from a man who had absolutely no idea what he was selling.
Bill pulled his hat back down. "Right. Let's go steal from children."
Molly folded the map. "We call it redistribution of assets."
"Do we?"
"We do now."
created_at: "2026-02-24T22:25:21.925Z" (2/24/2026, 10:25:21 PM)
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 534
moderation_latency: (null)
generation_time: 220121
tokens_prompt: 77
tokens_completion: 8110
native_tokens_prompt: 92
native_tokens_completion: 20357
native_tokens_completion_images: (null)
native_tokens_reasoning: 7288
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.305631
router: (null)
provider_responses
0
endpoint_id: "6d408764-7dd7-4626-bb87-a6cc1589bc86"
id: "msg_01NN44vv4t3kQzn2PfgoY4cX"
is_byok: false
latency: 506
model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
provider_name: "Anthropic"
status: 200
api_type: "completions"
id: "gen-1771971921-VGSbCKrGuU46yhepTNMH"
upstream_id: "msg_01NN44vv4t3kQzn2PfgoY4cX"
total_cost: 0.305631
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
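The run metadata can be cross-checked against the Duration and Cost shown in the run details. The sketch below is illustrative only: it assumes generation_time is in milliseconds and assumes prices of $3 and $15 per million input and output tokens. Neither the unit nor the prices are stated on this page, and the helper names are hypothetical.

```python
# Values copied from the run metadata above.
generation_time_ms = 220121      # ASSUMPTION: the unit is milliseconds
native_tokens_prompt = 92
native_tokens_completion = 20357
reported_cost = 0.305631

# ASSUMPTION: prices in USD per million tokens; not stated in the run record.
ASSUMED_INPUT_PRICE = 3.00
ASSUMED_OUTPUT_PRICE = 15.00


def format_duration(ms: int) -> str:
    """Render a millisecond count as 'Xm Ys', dropping the fractional second."""
    minutes, seconds = divmod(ms // 1000, 60)
    return f"{minutes}m {seconds}s"


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost in USD from token counts and the assumed prices."""
    weighted = (prompt_tokens * ASSUMED_INPUT_PRICE
                + completion_tokens * ASSUMED_OUTPUT_PRICE)
    return weighted / 1_000_000


duration = format_duration(generation_time_ms)
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)
estimated_cost = estimate_cost(native_tokens_prompt, native_tokens_completion)

print(duration)                   # 3m 40s
print(f"{estimated_cost:.6f}")    # 0.305631
print(f"{tokens_per_second:.1f} completion tokens per second")
```

Under these assumptions the estimate equals the reported usage value, which suggests billing follows the native token counts rather than tokens_prompt and tokens_completion. That is an inference from the arithmetic, not something the page states.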
Evaluation details
Evaluator: Matches word count
Result: 90.48%
Details: n/a
Meta Data: neededClean: false, words: 510
Evaluator: Dialogue to Total Word Ratio
Result: 27.69%
Details: Ratio: 35.07%, Deviation: 5.07%
Meta Data: neededClean: false, wordsTotal: 519, wordsDialogue: 182
Overall: 59.0890%
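The ratio and deviation reported by the dialogue evaluator follow from the two word counts in its metadata. The sketch below reproduces that arithmetic, assuming the 30% target from the scenario and assuming the deviation is the absolute difference in percentage points. The function name is hypothetical, and the mapping from deviation to the 27.69% score is not shown on this page, so it is not reproduced here.

```python
# Values copied from the evaluator metadata above.
words_total = 519
words_dialogue = 182

# Target taken from the scenario: "Write 500 words with 30% dialogue".
target_percent = 30.0


def dialogue_ratio_percent(dialogue_words: int, total_words: int) -> float:
    """Return dialogue words as a percentage of all words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return 100.0 * dialogue_words / total_words


ratio = dialogue_ratio_percent(words_dialogue, words_total)
# ASSUMPTION: deviation is the absolute gap to the target, in percentage points.
deviation = abs(ratio - target_percent)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# Ratio: 35.07%, Deviation: 5.07%
```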