NC Bench
Run: 2026-02-24-01KJ8VFPK9P8ZHJFW8A1VAHJ2T
Run Details
Model: Claude Sonnet 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 33.3s
Cost: $0.033996
Provider: Anthropic
Evaluation summary
Score | Evaluator | Details Link
99.98% | Matches word count | (details)
9.69% | Dialogue to Total Word Ratio | (details)
54.8393% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of their rusted van, smoothing out the crayon marks with her palm. The illustration was detailed, thorough, and had been completed entirely in purple crayon borrowed from a gas station parking lot. "The target," she said, tapping the paper with a gloved finger, "is the Henderson boy's seventh birthday party. The intel suggests a bouncy castle on the east flank and a clown working the perimeter." Bill squinted at the map. "Is that a drawing of a unicorn cake?" "It's a floor plan, Bill." "Right, right." He nodded slowly. "Why is the floor plan pink?" "Because purple ran out." She straightened up and crossed her arms. "Now focus. The presents are piled in the living room, unguarded, while the adults congregate by the punch bowl on the back patio." Bill scratched his chin thoughtfully. He had the kind of face that suggested deep thinking was expensive and he was watching the budget carefully. "What's our window?" "Pin the Tail on the Donkey. Four minutes, maybe five if they put it on a kid with good spatial awareness." "And the clown?" "Neutralized. We bring him a balloon animal request so complicated it buys us time. Something with seventeen segments." "A giraffe riding a motorcycle?" "That's only twelve. Think bigger." Bill's eyes widened with sudden inspiration. "A giraffe riding a motorcycle while juggling." "Now you're thinking like a criminal." Molly rolled up the map and tucked it under her arm. She surveyed the quiet suburban street with the practiced eye of someone who had cased at least four birthday parties this spring alone. The neighborhood smelled like cut grass and somebody's charcoal grill, which was either comforting or suspicious. She had not yet decided. "What about party bags?" Bill asked. "Do we take those too?" "Absolutely not. Party bags are for the guests. We have standards." "We're stealing gifts from a seven-year-old, Molly." "We're redistributing joy, Bill. There's a difference." 
She pulled two fake mustaches from her jacket pocket and handed him one. "Put this on. We're going in as relatives." Bill held up the mustache. "Whose relatives?" "Doesn't matter. Every party has an uncle nobody recognizes." He pressed the mustache above his lip and looked at her expectantly. She put on her own and they regarded each other with the solemn gravity of seasoned professionals. "How do I look?" he asked. "Suspicious." "Right, good." "No, Bill, that's bad." The bouncy castle inflated with a series of mechanical groans from somewhere behind the fence, and children's laughter erupted like a small joyful riot. Both thieves flinched instinctively. A dog three houses down began barking at nothing, which Molly took as either a bad omen or simply a dog being a dog. "You sure about this?" Bill asked quietly. Molly listened to the chaos beyond the garden wall and nodded with the calm certainty of someone who had absolutely no business being this confident. "I've never been more sure of anything." She was, of course, completely wrong.
created_at: "2026-02-24T22:14:31.28Z" (2/24/2026, 10:14:31 PM)
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 608
moderation_latency: 188
generation_time: 33020
tokens_prompt: 77
tokens_completion: 1676
native_tokens_prompt: 92
native_tokens_completion: 2248
native_tokens_completion_images: (null)
native_tokens_reasoning: 866
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.033996
router: (null)
provider_responses
0
endpoint_id: "6d408764-7dd7-4626-bb87-a6cc1589bc86"
id: "msg_01BmmpVr32dtBesxxGE59ZsP"
is_byok: false
latency: 559
model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
provider_name: "Anthropic"
status: 200
api_type: "completions"
id: "gen-1771971271-1SksHMqBw9CFtsMq9cKs"
upstream_id: "msg_01BmmpVr32dtBesxxGE59ZsP"
total_cost: 0.033996
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
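The reported cost can be sanity-checked against the native token counts in the metadata above. The sketch below is illustrative only: the per-token rates (3 USD per million input tokens, 15 USD per million output tokens) are an assumption and are not stated anywhere on this page, and `estimate_cost` is a hypothetical helper. Under those assumed rates, the native prompt and completion counts reproduce the listed total_cost.

```python
# Assumed rates in USD per million tokens. These are NOT stated on the page;
# they are an assumption chosen to check the arithmetic.
ASSUMED_INPUT_RATE = 3.00
ASSUMED_OUTPUT_RATE = 15.00


def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  input_rate: float = ASSUMED_INPUT_RATE,
                  output_rate: float = ASSUMED_OUTPUT_RATE) -> float:
    """Hypothetical helper: cost in USD for a single request."""
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


# Values copied from the metadata: native_tokens_prompt and native_tokens_completion.
cost = estimate_cost(prompt_tokens=92, completion_tokens=2248)
print(f"{cost:.6f}")  # 0.033996

# Rough output throughput from generation_time (milliseconds).
tokens_per_second = 2248 / (33020 / 1000)
print(f"{tokens_per_second:.1f} tokens/s")
```

The match with total_cost suggests, but does not prove, that billing is based on the native token counts rather than the normalized tokens_prompt and tokens_completion values.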
Evaluation details
Result | Evaluator | Details | Meta Data
99.98% | Matches word count | n/a | neededClean: false, words: 502
9.69% | Dialogue to Total Word Ratio | Ratio: 43.17%, Deviation: 6.83% | neededClean: false, wordsTotal: 505, wordsDialogue: 218
54.8393% | Overall | |
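The ratio and deviation in the details follow directly from the wordsDialogue and wordsTotal values in the metadata. The sketch below reproduces that arithmetic; `dialogue_ratio` is a hypothetical helper, and the 50% target is taken from the scenario description. How the 9.69% score is derived from the deviation is not shown on the page, so it is not modeled here. The final lines check only that the overall figure is consistent with an unweighted mean of the two evaluator scores, which is an assumption.

```python
TARGET_RATIO = 0.50  # from the scenario: "Write 500 words with 50% dialogue"


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Hypothetical helper: fraction of all words that are dialogue."""
    return words_dialogue / words_total


# Values copied from the evaluator metadata.
ratio = dialogue_ratio(words_dialogue=218, words_total=505)
deviation = abs(TARGET_RATIO - ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 43.17%, Deviation: 6.83%

# Assumption: the overall score is the unweighted mean of the evaluator scores.
# The displayed scores are rounded, so this only approximates the reported 54.8393%.
approx_overall = (99.98 + 9.69) / 2
print(f"Approximate overall: {approx_overall:.3f}%")
```

The small gap between the approximate mean and the reported 54.8393% is within what rounding of the two displayed scores would explain.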