NC Bench
Run: 2026-02-24-01KJ8V83X3WTJ5N70TCVB0NQ6G
Run Details
Model: Claude Sonnet 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 21.0s
Cost: $0.012786
Provider: Anthropic
Evaluation summary
Matches word count: 0.08%
Dialogue to Total Word Ratio: 1.30%
Overall: 0.6901%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crayon drawing across the hood of the van and smoothed out the wrinkles with her palm. It was a floor plan, technically. If you were being generous.
"Where did you get this?" Bill asked.
"I drew it from memory. I was at the birthday party last year, remember? The one with the moon bounce."
"You drew it in crayon."
"I didn't have a pen."
Bill leaned closer and squinted. "What's this purple blob?"
"That's the bouncy castle."
"And this orange squiggle?"
"That's Derek. The birthday dad. He wears an orange vest."
"You drew a person on the floor plan."
"He barely moves, Bill. He's basically furniture."
Bill straightened up and crossed his arms. He looked at the van, then at the house down the street, then at Molly. "Walk me through the plan."
"Simple. The party starts at two. By two-thirty, every adult in that house is going to be distracted watching eleven children destroy a piñata shaped like a dinosaur."
"How do you know it's dinosaur-shaped?"
"I saw the invitation. It said bring your biggest whacking arm, colon, dinosaur destruction time." Molly paused. "There was a lot going on with that invitation."
"And during the piñata chaos we go in through the back."
"Through the back," Molly confirmed, pointing at a green blob on the drawing. "The presents are piled in the living room. Stack by the window. We've got maybe four minutes."
"What if someone sees us?"
"We wear the hats."
Bill stared at her. "What hats."
Molly reached into the back seat and produced two pointed party hats, the cone kind with elastic chin straps and little foil stars on them. She held one out to him.
"These are party hats," Bill said.
"Exactly. We'll blend right in."
"Molly, we are two adults in a van. Party hats are not going to make us blend in."
"They'll make us blend in more than no party hats."
Bill considered this for a long moment. A dog barked somewhere down the street. A child's laugh floated over the neighborhood like a tiny warning.
"I want the record to show," Bill said, taking the hat, "that I voiced a concern."
"Noted." Molly stretched the elastic under her chin and snapped it. "The record also shows you're wearing the hat."
"What's in it for us anyway? These are kids' presents. We're going to end up with a Lego set and a gift card to somewhere that sells fruit snacks."
"The Hendersons' kid is turning seven, Bill. You know who the Hendersons are?"
Bill slowly turned to look at her.
"The Hendersons," Molly said, letting the name hang in the air like a piñata waiting to be struck, "are absolutely obscene."
Bill looked back at the house. He adjusted his party hat.
"Four minutes," he said.
"Four minutes," she agreed.
created_at: "2026-02-24T22:10:22.764Z" (2/24/2026, 10:10:22 PM)
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 901
moderation_latency: (null)
generation_time: 20902
tokens_prompt: 77
tokens_completion: 833
native_tokens_prompt: 92
native_tokens_completion: 834
native_tokens_completion_images: (null)
native_tokens_reasoning: 107
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.012786
router: (null)
provider_responses:
  0:
    endpoint_id: "6d408764-7dd7-4626-bb87-a6cc1589bc86"
    id: "msg_0136n95ZRAmedLUhdTQynmZ8"
    is_byok: false
    latency: 899
    model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
    provider_name: "Anthropic"
    status: 200
api_type: "completions"
id: "gen-1771971022-wrL02HzXj2CLIJJtftKk"
upstream_id: "msg_0136n95ZRAmedLUhdTQynmZ8"
total_cost: 0.012786
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
Evaluation details
Matches word count
Result: 0.08%
Details: n/a
Meta Data: neededClean: false, words: 471
Dialogue to Total Word Ratio
Result: 1.30%
Details: Ratio: 60.68%, Deviation: 9.32%
Meta Data: neededClean: false, wordsTotal: 473, wordsDialogue: 287
Overall: 0.6901%
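The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator can be checked from the word counts in its meta data. The sketch below is only an illustration: it assumes the ratio is wordsDialogue divided by wordsTotal and that the deviation is the absolute gap, in percentage points, to the 70% target named in the scenario. The function names are hypothetical, and `count_words_naive` is a guess at one way dialogue words might be counted (text between straight double quotes), not the evaluator's actual method. How the percentage scores are derived from these figures is not shown on this page, so the sketch does not attempt it.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: "Write 500 words with 70% dialogue"


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of all words that are dialogue (assumed formula)."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation_points(ratio: float, target: float = TARGET_RATIO) -> float:
    """Absolute gap between ratio and target, in percentage points."""
    return abs(target - ratio) * 100


def count_words_naive(text: str) -> tuple[int, int]:
    """Hypothetical counter: whitespace-separated words, with dialogue
    taken as anything between straight double quotes."""
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return dialogue, total


# Counts reported in the evaluation meta data.
ratio = dialogue_ratio(words_dialogue=287, words_total=473)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation_points(ratio):.2f}%")
```

With the reported counts this prints the same ratio and deviation shown in the Details field. The naive counter is not expected to reproduce 287 and 473 exactly, since tokenization rules for hyphens, punctuation, and quote styles are unknown.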