NC Bench
Run: 2026-02-24-01KJ8VHAQTSQD1Q8E0BMAKN5CW
Run Details
Model: Claude Sonnet 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 20.9s
Cost: $0.012741
Provider: Anthropic
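The reported cost can be reproduced from the native token counts recorded in the response metadata (92 prompt tokens, 831 completion tokens). The sketch below assumes rates of $3 per million input tokens and $15 per million output tokens. Those rates do not appear anywhere on this page; they are an assumption that happens to match the reported figure, and `estimate_cost` is a hypothetical helper, not part of the benchmark.

```python
# Assumed rates in USD per million tokens. These are NOT stated in the run
# record; they are an assumption chosen because they reproduce the cost.
ASSUMED_INPUT_RATE = 3.00
ASSUMED_OUTPUT_RATE = 15.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost in USD from token counts under the assumed rates."""
    weighted = (prompt_tokens * ASSUMED_INPUT_RATE
                + completion_tokens * ASSUMED_OUTPUT_RATE)
    return weighted / 1_000_000


# native_tokens_prompt = 92, native_tokens_completion = 831
cost = estimate_cost(92, 831)
print(f"{cost:.6f}")
```

Under these assumed rates the estimate agrees with the Cost field to six decimal places.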
Evaluation summary
Score     Evaluator
75.16%    Matches word count
2.13%     Dialogue to Total Word Ratio
38.6413%  Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crayon-drawn map across the hood of the rusty van and smoothed out its corners with both palms. The map had been sourced from the birthday party's own invitation, which she had intercepted from a neighbor's recycling bin. It featured a drawing of a smiling sun and the words "Timmy's BIG Day" written in purple glitter glue.
"The presents will be in the living room," Molly said, tapping the map with one finger. "Parents always pile them by the couch."
Bill squinted at the illustration. "Is that a unicorn or the floor plan?"
"It's both. The unicorn marks the back entrance."
"Why would a floor plan have a unicorn."
"Because Timmy drew it, Bill."
Bill pulled his black beanie down over his ears even though it was seventy degrees outside. He considered himself a professional, and professionals wore beanies. He studied the invitation more carefully, noting the start time of two o'clock and the promise of a bouncy castle.
"There's going to be a bouncy castle," he said gravely.
"I know."
"Molly, bouncy castles make noise. The kids will be on the bouncy castle, which is outside, which means the presents will be unguarded inside."
Molly pointed at him. "Now you're thinking."
"I've always been thinking. You just never notice because you're always talking."
She ignored this and pulled out a second document, which was a hand-drawn diagram of the neighborhood showing the van's parking position, the optimal extraction route, and a small doodle of a cupcake she had added absent-mindedly during the planning phase. "We park on Elm, go through the side gate, in through the back door, grab the presents, and we're gone in four minutes."
"What if the back door is locked?"
"It won't be locked. Parents never lock anything during a party. They're too busy stopping children from eating decorative candles."
Bill considered this wisdom and found it sound. He had attended enough birthday parties in his youth to understand the particular chaos that descended upon households when twenty small people arrived simultaneously demanding cake and validation.
"What about the clown?" he asked.
Molly froze. "What clown?"
"The invitation says there's a clown. Rascal the Amazing, apparently."
"You never mentioned a clown."
"You had the invitation."
"I was focused on the unicorn."
Bill leaned against the van and crossed his arms with the satisfaction of a man who had finally said something useful in a planning session. Clowns were wildcards. Clowns moved unpredictably between indoor and outdoor spaces. Clowns noticed things because their whole existence depended on reading a room.
"We'll have to neutralize Rascal," Molly said quietly.
"I am not neutralizing a clown, Molly."
"I mean distract him."
"How."
Molly looked across at the balloon animals visible through the neighbor's fence and slowly smiled the smile of someone whose plan had just become considerably more complicated and therefore considerably more interesting. "Do you still know how to make a poodle?"
created_at: "2026-02-24T22:15:24.676Z" (2/24/2026, 10:15:24 PM)
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 730
moderation_latency: (null)
generation_time: 20782
tokens_prompt: 77
tokens_completion: 897
native_tokens_prompt: 92
native_tokens_completion: 831
native_tokens_completion_images: (null)
native_tokens_reasoning: 99
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.012741
router: (null)
provider_responses:
  0:
    endpoint_id: "6d408764-7dd7-4626-bb87-a6cc1589bc86"
    id: "msg_01NnDcwkuNoEvfN5kMmKSqsB"
    is_byok: false
    latency: 730
    model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
    provider_name: "Anthropic"
    status: 200
api_type: "completions"
id: "gen-1771971324-MfhI1c0plLWObFhf7drj"
upstream_id: "msg_01NnDcwkuNoEvfN5kMmKSqsB"
total_cost: 0.012741
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
Evaluation details
Result    Evaluator                     Details                          Meta Data
75.16%    Matches word count            n/a                              neededClean: false, words: 487
2.13%     Dialogue to Total Word Ratio  Ratio: 41.22%, Deviation: 8.78%  neededClean: false, wordsTotal: 490, wordsDialogue: 202
38.6413%  Overall
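The Ratio and Deviation figures follow directly from the reported word counts (202 dialogue words out of 490 total, against the 50% target in the scenario). The sketch below reproduces that arithmetic. It also compares the Overall figure with the plain mean of the two displayed scores; that the benchmark averages its evaluators this way is an assumption, and the mapping from deviation to the 2.13% score is not shown on this page, so it is not reconstructed here.

```python
# Values taken from the evaluation details.
words_dialogue = 202
words_total = 490
target_percent = 50.0  # "50% dialogue" from the scenario

# Share of words that are dialogue, and its distance from the target.
ratio = 100 * words_dialogue / words_total
deviation = abs(target_percent - ratio)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# Assumption: Overall is an unweighted mean of the evaluator scores.
# Using the rounded, displayed scores gives a value close to, but not
# exactly, the reported 38.6413%.
displayed_scores = [75.16, 2.13]
mean_score = sum(displayed_scores) / len(displayed_scores)
print(f"Mean of displayed scores: {mean_score:.3f}%")
```

The small gap between the mean of the displayed scores and the reported Overall value would be consistent with the page rounding each evaluator score for display, though the page does not confirm this.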