NC Bench
Run: 2026-02-24-01KJ8WZ0C2CMFZBH7QJN53M39T
Run Details
Model: Claude Sonnet 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 20.9s
Cost: $0.012771
Provider: Anthropic
Evaluation summary
Score      Evaluator
90.48%     Matches word count
0.73%      Dialogue to Total Word Ratio
45.6071%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of the rusting van and smoothed out the corners with her palms. The map depicted a cul-de-sac, a house with a bounce castle drawn in the backyard, and seventeen X marks where she suspected the presents would be located. "I counted at least forty kids going in," she said, tapping the paper. "That's forty presents minimum. Could be more if the parents overcompensate, which, based on the size of that bouncy castle, they absolutely do." Bill squinted at the map and then at the actual house twenty yards away. Balloons bobbed above the fence. Somewhere beyond it, a child was screaming with what could have been either delight or genuine distress. "You drew the bounce castle way too big on this map," Bill said. "It's not to scale, Bill." "Nothing on here is to scale. You've drawn yourself taller than the house." "That's irrelevant." Molly snatched the map back and folded it badly. "Here's the plan. The party runs until four. At three forty-five, the cake comes out. Every single person in that backyard will be staring at the cake. Parents, kids, the clown, everyone." "There's a clown?" "There's a clown." Bill went quiet for a moment, processing this with visible discomfort. "I don't like this job anymore." "You didn't like it before either, so nothing has changed. While the cake has everyone hypnotized, we go in through the side gate, load the presents into the van, and we're gone before anyone sings the second verse of Happy Birthday." "People sing the second verse?" "Some families do. Weird families. Families exactly like this one." Molly pointed firmly at the house. "Focus. The presents are piled on the porch. I saw them when little Emma arrived with what appeared to be a gift bag containing an entire toy kitchen." Bill rubbed his chin thoughtfully. "Taking toys from children on their birthday is genuinely quite low, Molly. I want to say that out loud before we proceed." "We are returning them tomorrow." "Are we?" 
"Probably. The point is we need the insurance photographs for the claim, and we cannot submit a claim for presents we no longer have without first having had them." Bill stared at her. "That is the most complicated logic you have ever used." "Thank you." "It wasn't a compliment." A horn sounded from inside the property and the screaming intensified. Both thieves ducked instinctively behind the van, hearts hammering. Then the screaming dissolved into laughter, and a pop song about friendship began playing at a volume that suggested the parents had already started drinking. Molly checked her watch. "Forty minutes. We synchronize now." Bill looked at his bare wrist. "I forgot my watch." "Bill." "I overslept." "Bill." "The alarm didn't go off because I forgot to set it because I was nervous about the clown." Molly breathed slowly through her nose and began counting to ten. She only needed seven.
created_at
"2026-02-24T22:40:21.387Z"
2/24/2026, 10:40:21 PM
model
"anthropic/claude-4.6-sonnet-20260217"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
495
moderation_latency
(null)
generation_time
20885
tokens_prompt
77
tokens_completion
875
native_tokens_prompt
92
native_tokens_completion
833
native_tokens_completion_images
(null)
native_tokens_reasoning
104
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.012771
router
(null)
provider_responses
0
endpoint_id
"6d408764-7dd7-4626-bb87-a6cc1589bc86"
id
"msg_01PFxH8hHDDa3Lqp4jTJoQp1"
is_byok
false
latency
475
model_permaslug
"anthropic/claude-4.6-sonnet-20260217"
provider_name
"Anthropic"
status
200
api_type
"completions"
id
"gen-1771972821-oHMF7kfR7v1O6G5dJMj2"
upstream_id
"msg_01PFxH8hHDDa3Lqp4jTJoQp1"
total_cost
0.012771
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
Evaluation details
Result   Evaluator                      Details                            Meta Data
90.48%   Matches word count             n/a                                neededClean: false, words: 490
0.73%    Dialogue to Total Word Ratio   Ratio: 59.92%, Deviation: 9.92%    neededClean: false, wordsTotal: 494, wordsDialogue: 296
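The reported ratio and deviation can be reproduced from the wordsDialogue and wordsTotal counts above. The following is a minimal sketch, not the benchmark's actual evaluator, whose counting method is not shown on this page. It assumes that dialogue is any text between straight double quotes, that a word is a whitespace-separated token, and that the deviation is the absolute difference from the 50% target named in the scenario. The function names `count_dialogue_words` and `ratio_and_deviation` are hypothetical.

```python
import re


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words) for a scene.

    Assumption: dialogue is text enclosed in straight double quotes,
    and a word is any whitespace-separated token.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    total_words = len(text.split())
    return dialogue_words, total_words


def ratio_and_deviation(dialogue_words: int, total_words: int,
                        target: float = 0.5) -> tuple[float, float]:
    """Return the dialogue share and its absolute distance from the target."""
    ratio = dialogue_words / total_words
    return ratio, abs(ratio - target)


# Counts taken from the evaluator metadata on this page.
ratio, deviation = ratio_and_deviation(296, 494)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 59.92%, Deviation: 9.92%"
```

How the 9.92% deviation is converted into the 0.73% score is not stated on this page, so the sketch stops at the deviation.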
45.6071%   Overall
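The overall figure appears consistent with an unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so this is an assumption. The check below uses the rounded scores as displayed, which give about 45.605%; the small gap to the reported 45.6071% would be explained if the site averages unrounded values.

```python
from statistics import mean

# Evaluator scores as displayed on this page (already rounded to two decimals).
scores = {
    "Matches word count": 90.48,
    "Dialogue to Total Word Ratio": 0.73,
}
reported_overall = 45.6071

# Assumption: the overall score is a plain, unweighted average.
overall = mean(scores.values())
print(f"Mean of displayed scores: {overall:.3f}%")
print(f"Gap to reported overall:  {abs(overall - reported_overall):.4f} points")
```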