NC Bench
Run: 2026-02-24-01KJ8TMRSW3MK9YGFYNB0Q42QS
Run Details
Model: Claude Sonnet 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 4m 23s
Cost: $0.427521
Provider: Anthropic
Evaluation summary
Matches word count: 100.00%
Dialogue to Total Word Ratio: 96.11%
Overall: 98.0538%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of their rusted van, weighing the corners down with a juice box and a novelty eraser shaped like a dinosaur that she had absolutely not stolen from a child. The map detailed the full layout of 14 Birchwood Lane, home of Gerald Pemberton, age seven, host of what the neighborhood was calling the most aggressively themed dinosaur birthday party the cul-de-sac had ever witnessed. "They've got a bouncy castle on the east side," Molly said, tapping the map. "That's our primary distraction." Bill squinted at the drawing. He had the fixed, hopeful expression of a man who had once gotten lost in a parking garage for forty-five minutes and never quite recovered from the experience. "Why would a bouncy castle be a distraction? Children bounce constantly. That is fundamentally what they do." "All the kids bounce outside, Bill. We slip through the side gate, grab the presents from the living room, and we're gone before anybody cuts the cake." "That's the whole plan?" "That's the beautiful simplicity of the plan, yes." "It sounds like you made the plan this morning." "I made the plan with great care and you'd know that if you'd come to the planning meeting." "You held it in the van while I was asleep." Molly chose not to address this. She had recruited Bill six months ago from a bingo hall because he owned a van and had what she generously described as a flexible moral compass. The van smelled of old French fries and something adjacent to optimism. The reconnaissance had taken three days. Molly had gone undercover as a balloon delivery person, an extremely convincing cover until twelve toddlers ambushed her and commandeered all sixteen balloons before she reached the front door. She retreated in quiet defeat, trailing one balloon that read Happy Retirement, grabbed from entirely the wrong box. Bill's contribution to the operation had been eating a sandwich in the van. "What about the dog?" 
Bill asked, frowning at the corner of the map where Molly had written DANGER DOG in capital letters, ringed with jagged lines to indicate menace. "Cockapoo. Eight pounds. Named Sprinkles." "Sprinkles bit three people according to the Nextdoor app." "I am aware of Sprinkles' record." "Are you worried about Sprinkles?" "I have decided not to worry about Sprinkles." "You can't decide that." "Watch me." Molly rolled up the map with crisp efficiency, zipped her jacket, and adjusted the small backpack she had selected because it looked tactical and not at all because it was the only one on clearance. She turned to face Bill with the steady confidence of someone who had fully thought this through and was not under any circumstances going to revisit that conclusion. "Sprinkles doesn't scare me," she said. Bill looked at her for a quiet moment. Down the street, a child shrieked with delight, and an off-key Happy Birthday drifted through the afternoon air like a small, cheerful warning. "She really should," he said.
created_at
"2026-02-24T21:59:48.803Z"
2/24/2026, 9:59:48 PM
model
"anthropic/claude-4.6-sonnet-20260217"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
878
moderation_latency
(null)
generation_time
263109
tokens_prompt
77
tokens_completion
9521
native_tokens_prompt
92
native_tokens_completion
28483
native_tokens_completion_images
(null)
native_tokens_reasoning
8734
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.427521
router
(null)
provider_responses
0
endpoint_id
"6d408764-7dd7-4626-bb87-a6cc1589bc86"
id
"msg_01LxZ3YmjLDHZr1U4Gc5QPJm"
is_byok
false
latency
840
model_permaslug
"anthropic/claude-4.6-sonnet-20260217"
provider_name
"Anthropic"
status
200
api_type
"completions"
id
"gen-1771970388-pPQUj582abThSHmtzeIr"
upstream_id
"msg_01LxZ3YmjLDHZr1U4Gc5QPJm"
total_cost
0.427521
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
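The token counts, cost, and timing fields above can be cross-checked against each other. The following is a minimal sketch, not the benchmark's own accounting: the per-million-token rates are an assumption (they are not stated anywhere on this page), and the helper names `estimate_cost` and `format_duration` are hypothetical. It also assumes that `generation_time` is in milliseconds and that the native token counts are the ones billed.

```python
# Values copied from the run metadata above.
native_tokens_prompt = 92
native_tokens_completion = 28483
generation_time_ms = 263109      # assumed to be milliseconds
reported_cost_usd = 0.427521

# ASSUMPTION: per-million-token rates. These do not appear on the page.
ASSUMED_INPUT_USD_PER_MTOK = 3.00
ASSUMED_OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost(prompt_tokens, completion_tokens, in_rate, out_rate):
    """Estimate cost in USD from token counts and per-million-token rates."""
    return (prompt_tokens * in_rate + completion_tokens * out_rate) / 1_000_000


def format_duration(ms):
    """Render a millisecond duration as 'Xm Ys', dropping the fraction."""
    minutes, seconds = divmod(ms // 1000, 60)
    return f"{minutes}m {seconds}s"


estimated = estimate_cost(
    native_tokens_prompt,
    native_tokens_completion,
    ASSUMED_INPUT_USD_PER_MTOK,
    ASSUMED_OUTPUT_USD_PER_MTOK,
)
print(f"estimated cost: ${estimated:.6f} (reported: ${reported_cost_usd:.6f})")
print(f"duration: {format_duration(generation_time_ms)}")
```

Under these assumed rates the estimate agrees with the reported cost, and the formatted duration agrees with the 4m 23s shown in the run details. If the actual rates differ, only the two constants need to change.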
Evaluation details
Matches word count: 100.00%
Details: n/a
Meta Data: neededClean false, words 500
Dialogue to Total Word Ratio: 96.11%
Details: Ratio: 30.89%, Deviation: 0.89%
Meta Data: neededClean false, wordsTotal 505, wordsDialogue 156
Overall: 98.0538%
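The ratio and deviation figures reported above follow directly from the two word counts. The sketch below reproduces that arithmetic and adds one hypothetical way to count dialogue words. The quote-matching rule in `count_dialogue_words` is an assumption for illustration and is not confirmed to be the evaluator's method. The mapping from deviation to the 96.11% score is not stated on this page, so it is not modeled here. The overall score appears consistent with an unweighted mean of the two evaluator scores, but only to within the rounding of the displayed values.

```python
import re

# Values copied from the evaluation details above.
words_total = 505
words_dialogue = 156
target_ratio = 0.30  # from the scenario: "30% dialogue"

ratio = words_dialogue / words_total
deviation = abs(ratio - target_ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# ASSUMPTION: overall is the plain mean of the displayed evaluator scores.
displayed_scores = [100.00, 96.11]
overall_estimate = sum(displayed_scores) / len(displayed_scores)
print(f"Overall (from rounded scores): {overall_estimate:.4f}%")


def count_dialogue_words(text):
    """Count whitespace-separated words inside straight double quotes.

    Hypothetical rule: anything between a pair of '"' characters is dialogue.
    Curly quotes and nested quotes are not handled.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in quoted_spans)


sample = '"Watch me." Molly rolled up the map.'
print(count_dialogue_words(sample), "of", len(sample.split()), "words are dialogue")
```

The word count evaluator reports 500 words while the ratio evaluator reports 505, so the two evaluators evidently tokenize the response differently. A counter like the one above would need to follow whichever rule the ratio evaluator uses to reproduce its numbers exactly.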