NC Bench
Run: 2026-02-18-01KHQ92YEH0AWTHNAMCTKA34HP
Run Details
Model: Claude Sonnet 4.6
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 18.6s
Cost: $0.010965
Provider: Anthropic
Evaluation summary
| Score | Evaluator |
| --- | --- |
| 90.48% | Matches word count |
| 3.02% | Dialogue to Total Word Ratio |
| 46.7508% | Overall |
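The overall figure appears consistent with an unweighted mean of the two evaluator scores. The sketch below checks that arithmetic; the averaging rule is an assumption inferred from the reported numbers, not something the page documents, and the small gap from 46.7508% would be explained by the per-evaluator scores being rounded for display.

```python
from statistics import mean

# Scores as reported in the evaluation summary, in percent.
scores = {
    "Matches word count": 90.48,
    "Dialogue to Total Word Ratio": 3.02,
}

# Assumption: the overall score is the plain (unweighted) mean of the
# evaluator scores. The benchmark's actual aggregation is not documented here.
overall = mean(list(scores.values()))

print(f"Overall (assumed mean): {overall:.2f}%")
```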
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of the rusted van and smoothed out the creases with her palm. The crayon drawings of balloons and a lopsided birthday cake in the corner were not her finest cartographic work, but they got the point across. "So the presents are stacked against the back wall of the garage," she said, tapping a red X with her finger. Bill squinted at the map. "Why is there a drawing of a clown on here?" "For atmosphere." "It's unsettling." "Bill, focus." He leaned closer, pushing his oversized novelty sunglasses up his nose. They were part of his disguise, which he had assembled entirely from a gas station gift rack. "Okay, okay. So we go in through the side gate, cross the yard, and grab the presents before anyone notices. That's the whole plan?" "That's the elegant simplicity of the plan, yes." Bill straightened up and crossed his arms. "And the kids?" "Distracted by the bouncy castle." "And the parents?" "Distracted by the free wine." "And the dog?" Molly paused. "What dog?" "The enormous golden retriever I saw through the fence when I did the scouting run this morning." "You did a scouting run?" "I walked past slowly and looked over the fence." "That's just looking over a fence, Bill." "In context it was reconnaissance." She folded her arms and stared at him for a long moment. A balloon drifted over the fence from the neighboring yard and bobbed between them before floating away into the sky. They both watched it go. "The dog will be fine," Molly said eventually. "It was enormous. I'm talking basically a small horse with opinions." "Dogs love me." "You said that about the Hendersons' terrier and it destroyed your left shoe." "That was a personal disagreement. This will be different." She rolled up the map and tucked it into her jacket. "Now, the timing window is tight. The magician performs at two o'clock and he runs exactly twenty-two minutes. I called his booking agency to confirm." Bill stared at her. 
"You called his booking agency." "Professionalism costs nothing." "It costs the dignity of whoever has to make that phone call." "During those twenty-two minutes," she continued, ignoring him entirely, "every adult eye in that garden will be on a man pulling scarves out of his own ear. We load the presents into the van, we drive away, and by three o'clock we're splitting the haul at my kitchen table." Bill looked at the fence. He looked at the van. He looked at Molly with the particular expression of a man who had made poor decisions before and recognized the early signs of another one approaching at speed. "These are children's birthday presents," he said. "Valuable children's birthday presents." "That does not improve the sentence." Molly smiled and pulled on her gloves. "It will when we see what's inside them. Now stop talking and open the gate."
created_at: "2026-02-18T02:26:12.740419+00:00" (2/18/2026, 2:26:12 AM)
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 774
moderation_latency: (null)
generation_time: 18532
tokens_prompt: 77
tokens_completion: 766
native_tokens_prompt: 75
native_tokens_completion: 716
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.010965
router: (null)
provider_responses[0]:
id: "msg_01BNWcpCNB3n42i5Vqq11T9d"
status: 200
is_byok: false
latency: 774
endpoint_id: "6d408764-7dd7-4626-bb87-a6cc1589bc86"
provider_name: "Anthropic"
model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
api_type: "completions"
id: "gen-1771381553-VvnPwIB2ofPeJV6UYcdX"
upstream_id: "msg_01BNWcpCNB3n42i5Vqq11T9d"
total_cost: 0.010965
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 90.48% | Matches word count | n/a | neededClean: false; words: 490 |
| 3.02% | Dialogue to Total Word Ratio | Ratio: 58.37%, Deviation: 8.37% | neededClean: false; wordsTotal: 490; wordsDialogue: 286 |
| 46.7508% | Overall | | |
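The ratio and deviation reported for the dialogue evaluator follow from the two word counts: 286 dialogue words out of 490 total, measured against the 50% target in the scenario. The sketch below reproduces that arithmetic and adds a rough way to count quoted words. Both helper names are hypothetical, the quote-matching rule (straight double quotes, whitespace-delimited words) is an assumption rather than the benchmark's actual evaluator, and how a deviation is mapped to the 3.02% score is not shown on the page, so it is not modeled.

```python
import re


def count_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words) for a passage.

    Assumption: dialogue is any text between straight double quotes and
    words are whitespace-delimited tokens.
    """
    total = len(text.split())
    quoted = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted)
    return dialogue, total


def dialogue_stats(words_dialogue: int, words_total: int,
                   target_pct: float = 50.0) -> tuple[float, float]:
    """Return (ratio, deviation) in percent; deviation is taken as absolute."""
    ratio_pct = 100.0 * words_dialogue / words_total
    return ratio_pct, abs(ratio_pct - target_pct)


# Counts taken from the evaluator metadata (wordsDialogue, wordsTotal).
ratio, deviation = dialogue_stats(286, 490)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# Tiny illustration of the counting helper on one line from the response.
sample_dialogue, sample_total = count_words('Molly paused. "What dog?"')
```

With the reported counts this prints `Ratio: 58.37%, Deviation: 8.37%`, matching the Details column.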