NC Bench
Run: 2026-02-24-01KJ8V0CRSC0K11TA6YKQNZS1X
Run Details
Model: Claude Sonnet 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 3m 29s
Cost: $0.304791
Provider: Anthropic
Evaluation summary
Score | Evaluator
99.98% | Matches word count
95.37% | Dialogue to Total Word Ratio
97.6750% | Overall
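The overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state how the overall score is aggregated, so the sketch below is an assumption that happens to reproduce the reported value; `overall_score` is a hypothetical helper name.

```python
from statistics import fmean


def overall_score(scores):
    """Unweighted mean of per-evaluator scores, in percent.

    ASSUMPTION: the page does not document the aggregation; a plain
    mean is used here because it matches the reported overall score.
    """
    return fmean(scores)


# Scores copied from the evaluation summary.
evaluator_scores = {
    "Matches word count": 99.98,
    "Dialogue to Total Word Ratio": 95.37,
}

overall = overall_score(evaluator_scores.values())
print(f"{overall:.4f}%")  # 97.6750%
```

If the benchmark weights evaluators differently for other tests, this helper would need a weights argument; nothing on this page indicates that it does.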
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of the rusted van, anchoring its corners with a juice box, a half-eaten granola bar, and a promotional stress ball shaped like a giraffe. The map had been rendered entirely in crayon, which was the only writing implement available in the glove compartment, a fact that spoke volumes about the current standing of their operation. Bill leaned over her shoulder and squinted at the diagram. He was wearing a dark turtleneck despite it being June, because he held a steadfast philosophy about turtlenecks and professionalism that no accumulation of contrary evidence had ever managed to shake. "The bounce house is our biggest obstacle," Molly said, tapping the pink squiggle on the map. "Kids cycle through it constantly. There is no pattern and no predictability whatsoever." "I have a solution for the bounce house," Bill said. "We cut around the left side through the hedges." "There are no hedges on that side." "What if we planted some beforehand? They make fast-growing varieties." Molly chose not to engage with this and moved on. The party at 14 Clover Lane was, by all available intelligence, a considerable production. Forty-seven children between the ages of four and nine, two professional entertainers, a well-reviewed face painter named Derek who was apparently devoted to his craft, and a gift table that their inside source had described as an absolute mountain of presents. Their inside source was Molly's nephew Gerald, who had supplied this intelligence in exchange for two dollars and the permanent retirement of a nickname he considered deeply unjust. "We go in during pin the tail on the donkey," Bill said, jabbing the map. "Every kid in that yard will be distracted." "They take turns with the blindfold, Bill. It is not a simultaneous group experience." "I know. I was speaking in broad terms." "Please do not speak in broad terms during an operational briefing." 
The presents would be stacked on a folding table against the back fence, which opened conveniently onto a quiet alley. This was the elegant heart of Molly's plan, a design she had refined over three consecutive evenings while ignoring her landlord's increasingly baroque threats. They would park in the alley, scale the fence using the collapsible step stool that Bill had formally rechristened the tactical ladder, load the presents into matching duffel bags, and vanish before anyone finished singing happy birthday. It was clean. It was clever. It was also being executed by two people whose last successful heist had involved correctly guessing a neighbor's wifi password. "What do we tell a child who spots us?" Bill asked. "We're from the North Pole. Present audit." "That falls apart instantly. It's June." "Children will believe almost anything delivered with enough confidence." She folded up the map and tucked it away. "That is essentially why they make such appealing targets." Bill absorbed this, found it both logical and slightly troubling, squeezed the giraffe stress ball twice for luck, and declared himself ready.
created_at: "2026-02-24T22:06:09.701Z" (2/24/2026, 10:06:09 PM)
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1192
moderation_latency: (null)
generation_time: 208664
tokens_prompt: 77
tokens_completion: 8392
native_tokens_prompt: 92
native_tokens_completion: 20301
native_tokens_completion_images: (null)
native_tokens_reasoning: 7593
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.304791
router: (null)
provider_responses:
  0:
    endpoint_id: "6d408764-7dd7-4626-bb87-a6cc1589bc86"
    id: "msg_01MQo8vj14aw8yqV6P8h44fT"
    is_byok: false
    latency: 1187
    model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
    provider_name: "Anthropic"
    status: 200
api_type: "completions"
id: "gen-1771970769-huOyyurgrqPDrKVcvxpT"
upstream_id: "msg_01MQo8vj14aw8yqV6P8h44fT"
total_cost: 0.304791
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
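The reported cost and duration can be cross-checked against the token counts and generation time in the metadata. The sketch below is illustrative only: the per-million-token prices are assumptions (the page states no pricing) chosen because they reproduce `total_cost`, and treating `generation_time` as milliseconds rounded to the nearest second is likewise an assumption that matches the displayed duration. `estimate_cost` and `format_duration` are hypothetical helper names.

```python
# Values copied from the run metadata.
native_tokens_prompt = 92
native_tokens_completion = 20301
generation_time_ms = 208664

# ASSUMPTION: prices in USD per million tokens; not stated on the page.
ASSUMED_USD_PER_M_INPUT = 3.00
ASSUMED_USD_PER_M_OUTPUT = 15.00


def estimate_cost(prompt_tokens, completion_tokens):
    """Estimated cost in USD from native token counts and assumed prices."""
    micro_usd = (prompt_tokens * ASSUMED_USD_PER_M_INPUT
                 + completion_tokens * ASSUMED_USD_PER_M_OUTPUT)
    return micro_usd / 1_000_000


def format_duration(ms):
    """Render milliseconds as 'Xm Ys', rounded to the nearest second."""
    minutes, seconds = divmod(round(ms / 1000), 60)
    return f"{minutes}m {seconds}s"


cost = estimate_cost(native_tokens_prompt, native_tokens_completion)
print(f"${cost:.6f}")                       # $0.304791
print(format_duration(generation_time_ms))  # 3m 29s
```

Under these assumed prices almost all of the cost comes from completion tokens; the 92 prompt tokens contribute well under a tenth of a cent.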
Evaluation details
Result | Evaluator | Details | Meta Data
99.98% | Matches word count | n/a | neededClean: false, words: 498
95.37% | Dialogue to Total Word Ratio | Ratio: 29.03%, Deviation: 0.97% | neededClean: false, wordsTotal: 503, wordsDialogue: 146
97.6750% | Overall
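The ratio and deviation reported by the dialogue evaluator follow directly from its word counts and the 30% target in the scenario. The sketch below reproduces those two figures. It does not reproduce the 95.37% score, because the page does not say how deviation maps to a score. The quote-based counter `count_dialogue_words` is a hypothetical illustration of one way to count dialogue words (text inside straight double quotes), not the benchmark's documented method.

```python
import re

# Counts copied from the evaluator's Meta Data; target from the scenario.
words_dialogue = 146
words_total = 503
TARGET_RATIO = 0.30


def count_dialogue_words(text):
    """Count whitespace-separated words inside straight double quotes.

    ASSUMPTION: a simple stand-in for the evaluator's counting rule,
    which the page does not describe.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in quoted_spans)


ratio = words_dialogue / words_total
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 29.03%, Deviation: 0.97%

# Small usage example for the hypothetical counter.
sample = 'He nodded. "Go now," she said. "Fine."'
print(count_dialogue_words(sample), "of", len(sample.split()))  # 3 of 7
```

Note that the two evaluators report different totals (498 words versus 503), so a counter like this one would need to match the evaluator's tokenization rules before its ratio could be compared with the reported figure.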