NC Bench
Run: 2026-02-18-01KHQ91D69X67CVP3FH6BXX98A
Run Details
Model: Claude Sonnet 4.6
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 17.6s
Cost: $0.010830
Provider: Anthropic
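The reported cost and duration can be cross-checked against the native token counts and generation time recorded in the run metadata further down (75 prompt tokens, 707 completion tokens, 17574 ms). The sketch below assumes prices of $3 per million input tokens and $15 per million output tokens. Those rates are not stated anywhere in this run record; they are an assumption that happens to reproduce the logged cost. The function names are illustrative only.

```python
# Assumed prices in USD per million tokens. These are NOT stated in the
# run record; they are chosen because they reproduce the logged cost.
ASSUMED_INPUT_PRICE_PER_MTOK = 3.00
ASSUMED_OUTPUT_PRICE_PER_MTOK = 15.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the request cost in USD from native token counts."""
    micro_dollars = (
        prompt_tokens * ASSUMED_INPUT_PRICE_PER_MTOK
        + completion_tokens * ASSUMED_OUTPUT_PRICE_PER_MTOK
    )
    return micro_dollars / 1_000_000


def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Average output throughput over the whole generation."""
    return completion_tokens / (generation_time_ms / 1000)


# Values taken from the run metadata: native_tokens_prompt,
# native_tokens_completion, and generation_time (milliseconds).
cost = estimate_cost(75, 707)
rate = tokens_per_second(707, 17574)

print(f"estimated cost: ${cost:.6f}")
print(f"throughput: {rate:.1f} tokens/s")
```

Under these assumed rates the estimate matches the logged cost of $0.010830, and the 17574 ms generation time rounds to the 17.6s duration shown above.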
Evaluation summary
Score | Evaluator
81.27% | Matches word count
0.00% | Dialogue to Total Word Ratio
40.6371% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The van smelled like cheese crackers and regret. Molly spread the hand-drawn map across the dashboard while Bill ate the last of his sandwich and got mustard on the gearshift again. "Okay, so the party starts at two," Molly said, tapping the map with her pen. "We go in through the back gate at three, grab the presents, and we're out before the cake cutting." "Why before the cake cutting?" Bill asked. "Because if there's cake, you'll eat it. You ate the cake at the Henderson job." "That was a retirement party and they had fondant roses, Molly. Fondant roses." "Focus. The presents are going to be stacked near the bouncy castle according to my source." "Your source is a seven year old." "My source is highly motivated and got twelve dollars for the intel." Bill crumpled his sandwich wrapper and stuffed it under the seat where he kept his collection of crumpled sandwich wrappers. "How many kids are we talking about?" "Fourteen, maybe fifteen." "And how old?" "Six and under." "Good," Bill said, nodding seriously. "Six and under can't give a reliable description to police. Their sense of stranger danger is high but their attention spans are low." "That is either very smart or very terrible." "With me it's usually both." Molly folded the map and tucked it into her vest pocket, the one next to the pocket where she kept her lock picks and a single wrapped butterscotch candy for emergencies. "The birthday girl is turning five. Her name is Sophie. The theme is unicorns." "So there's going to be purple everywhere." "So much purple. We wear the purple." Bill looked at her. "We wear the purple." "We blend in. I got us both purple shirts." "Molly, I am a forty-three year old man with a neck tattoo that says 'BILL' in gothic font." "Which is why you also wear this." She pulled a purple paper party hat from the back seat and handed it to him with tremendous dignity. Bill held it at arm's length like it had personally insulted him. "Absolutely not." "Absolutely yes. 
You look like a dad. Dads wear hats at kids' parties. It's science." "I look like a dad who has made catastrophically poor life choices." "Bill. You are currently planning to rob a child's birthday party." He put the hat on. "Fair point." Molly checked her watch and looked toward the house at the end of the cul-de-sac, already loud with the sound of small people having large amounts of fun. A banner over the front door read HAPPY BIRTHDAY SOPHIE in glittery letters. "You feel good about this?" Bill asked. "I feel genuinely terrible about this," Molly said cheerfully, opening the van door. "Let's go." Bill adjusted his party hat in the side mirror and followed her out into the afternoon sunshine, already hearing the bouncy castle. He was definitely going to eat the cake.
created_at: "2026-02-18T02:25:21.280648+00:00" (2/18/2026, 2:25:21 AM)
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 464
moderation_latency: (null)
generation_time: 17574
tokens_prompt: 77
tokens_completion: 745
native_tokens_prompt: 75
native_tokens_completion: 707
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.01083
router: (null)
provider_responses: 0
id: "msg_01DqQfckEe5vD5NBZ6NipM6d"
status: 200
is_byok: false
latency: 464
endpoint_id: "6d408764-7dd7-4626-bb87-a6cc1589bc86"
provider_name: "Anthropic"
model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
api_type: "completions"
id: "gen-1771381503-hCdGVl0Cje6eGmii32fq"
upstream_id: "msg_01DqQfckEe5vD5NBZ6NipM6d"
total_cost: 0.01083
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
Evaluation details
Result | Evaluator | Details | Meta Data
81.27% | Matches word count | n/a | neededClean: false, words: 488
0.00% | Dialogue to Total Word Ratio | Ratio: 55.12%, Deviation: 14.88% | neededClean: false, wordsTotal: 488, wordsDialogue: 269
40.6371% | Overall
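The ratio, deviation, and overall figures above follow from simple arithmetic on the logged word counts. The sketch below makes three assumptions: the 70% target is taken from the scenario text, the deviation is the absolute difference in percentage points, and the overall score is the unweighted mean of the two evaluator scores. That last assumption is consistent with 40.6371% being about half of 81.27%, but the page does not confirm it. The word-count scoring formula, and the rule that maps a 14.88-point deviation to a 0.00% score, are not given in this record and are not reproduced here.

```python
TARGET_DIALOGUE_PCT = 70.0  # from the scenario: "500 words with 70% dialogue"


def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Share of words that are dialogue, as a percentage."""
    return 100.0 * words_dialogue / words_total


def deviation_points(ratio_pct: float, target_pct: float = TARGET_DIALOGUE_PCT) -> float:
    """Absolute distance from the target, in percentage points."""
    return abs(target_pct - ratio_pct)


def overall_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores (an assumption, see lead-in)."""
    return sum(scores) / len(scores)


# wordsDialogue and wordsTotal as logged by the evaluator.
ratio = dialogue_ratio_pct(269, 488)
deviation = deviation_points(ratio)

# Evaluator scores as displayed; 81.27 is already rounded on the page, so
# the mean only approximates the reported 40.6371.
overall = overall_score([81.27, 0.00])

print(f"ratio: {ratio:.2f}%")
print(f"deviation: {deviation:.2f} points")
print(f"overall (approx.): {overall:.3f}%")
```

The small gap between the computed mean and the reported 40.6371% is what you would expect if the page averages unrounded scores and only rounds the word-count score for display.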