NC Bench
Run: 2026-02-24-01KJ8WYXK1QB4RWS5RKR5CZMBH
Run Details
Model: Claude Sonnet 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 20.1s
Cost: $0.012666
Provider: Anthropic
Evaluation summary
Score      Evaluator
0.00%      Matches word count
5.52%      Dialogue to Total Word Ratio
2.7596%    Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of their rusted van and tapped it twice with her finger. "The bouncy castle goes up at noon," she said. "That gives us a thirty-minute window before the kids arrive." Bill squinted at the map. "Is that a drawing of a clown?" "That's the security." "The security is a clown?" "His name is Pickles and he does not mess around." Molly pulled her collar up. "I've seen him make a balloon animal in under four seconds. The man has fast hands, Bill." "We're being guarded by a birthday clown named Pickles." "Do you want the presents or not?" Bill looked back at the map and scratched his chin. "Walk me through the entry point again." "We go over the garden fence here, cut across the lawn, and enter through the patio doors. The gifts will be stacked on the table in the living room, unsupervised, while everyone is outside watching the magic show." "There's a magic show too?" "A magician named Gerald. He's apparently terrible, which works in our favor. He'll hold the crowd's attention for at least twenty minutes because everyone will be too baffled to leave." Bill nodded slowly. "What's in these presents anyway? We don't even know." "Lego, probably. Maybe some kind of remote control car. The birthday kid is turning seven, Bill. Use your imagination." "I'm just saying, last time you promised big returns and we ended up with a labrador and a bread maker." "That was one job." "Molly, we still have the labrador." "Kevin is a good dog and this is not about Kevin." She rolled up the map firmly. "The point is, premium children's gifts move fast at the car boot sale. We're talking forty, maybe fifty items. Clean haul, easy exit." Bill looked genuinely troubled. "Aren't you even a little bit bothered that we're stealing from children?" Molly stared at him for a long moment. A pigeon landed nearby, looked at them both, and left. "They'll get vouchers from their nan," she said finally. "They'll be fine." "My heart is not fully in this one." 
"Your heart was fully in stealing that cheese wheel from the farmers market and look how that ended." "We ate very well for two weeks." "We smelled like a cave, Bill." He conceded this with a tilt of his head. Molly checked her watch and pulled on her black beanie hat. "We go in at twelve fifteen. You carry the duffel bag, I handle Pickles if it comes to that. We're out in under ten minutes." "And if a child sees us?" "Tell them you're a helper elf." "It's June." "Elves work year round. That's the whole point of elves." She climbed into the van and started the engine. "Now get in. Pickles won't distract himself."
created_at: "2026-02-24T22:40:18.538Z" (2/24/2026, 10:40:18 PM)
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 818
moderation_latency: (null)
generation_time: 20069
tokens_prompt: 77
tokens_completion: 825
native_tokens_prompt: 92
native_tokens_completion: 826
native_tokens_completion_images: (null)
native_tokens_reasoning: 118
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.012666
router: (null)
provider_responses
0
endpoint_id: "6d408764-7dd7-4626-bb87-a6cc1589bc86"
id: "msg_01TFbxdyY2zX4AXLv8CrdMmL"
is_byok: false
latency: 812
model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
provider_name: "Anthropic"
status: 200
api_type: "completions"
id: "gen-1771972818-etLdRz2tWV5lU6AqPTX2"
upstream_id: "msg_01TFbxdyY2zX4AXLv8CrdMmL"
total_cost: 0.012666
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
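The reported cost can be sanity-checked against the native token counts. The sketch below is only an illustration: the per-token rates (3.00 USD per million input tokens, 15.00 USD per million output tokens) are assumptions that do not appear on this page, and treating generation_time as milliseconds is likewise an assumption, based on the 20.1s duration shown in the run details. The function name estimate_cost_usd is hypothetical.

```python
# Assumed rates, NOT stated anywhere in this report.
INPUT_USD_PER_MTOK = 3.00    # assumption: USD per million prompt tokens
OUTPUT_USD_PER_MTOK = 15.00  # assumption: USD per million completion tokens


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the request cost from token counts under the assumed rates."""
    total_micro_usd = (
        prompt_tokens * INPUT_USD_PER_MTOK
        + completion_tokens * OUTPUT_USD_PER_MTOK
    )
    return total_micro_usd / 1_000_000


# Values copied from the metadata above.
native_tokens_prompt = 92
native_tokens_completion = 826
generation_time_ms = 20069  # assumption: the unit is milliseconds

cost = estimate_cost_usd(native_tokens_prompt, native_tokens_completion)
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"estimated cost: ${cost:.6f}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumed rates the estimate agrees with the total_cost value of 0.012666 reported above, which suggests that billing follows the native token counts rather than the normalized tokens_prompt and tokens_completion figures.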
Evaluation details
Evaluator: Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false, words: 467
Evaluator: Dialogue to Total Word Ratio
Result: 5.52%
Details: Ratio: 77.61%, Deviation: 7.61%
Meta Data: neededClean: false, wordsTotal: 469, wordsDialogue: 364
Overall: 2.7596%
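The ratio and deviation figures can be reproduced from the word counts in the meta data. This is a minimal sketch, not the benchmark's actual evaluator: the function names are hypothetical, the 70% target is taken from the scenario description, and the assumption that the overall score is an unweighted mean of the two evaluator scores is inferred from the numbers rather than stated on the page. How the 7.61% deviation maps to the 5.52% score is not shown in this report, so the sketch does not attempt it.

```python
def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the total, as a percentage."""
    return 100.0 * words_dialogue / words_total


def deviation_pct(ratio_pct: float, target_pct: float) -> float:
    """Absolute distance between the measured ratio and the target."""
    return abs(ratio_pct - target_pct)


# Values copied from the evaluation details above.
words_dialogue = 364
words_total = 469
target_pct = 70.0  # from the scenario: "Write 500 words with 70% dialogue"

ratio = dialogue_ratio_pct(words_dialogue, words_total)
deviation = deviation_pct(ratio, target_pct)

# Assumption: overall = unweighted mean of the evaluator scores.
# The scores are used as displayed (rounded), so the result only
# approximates the 2.7596% shown on the page.
evaluator_scores = [0.00, 5.52]
overall = sum(evaluator_scores) / len(evaluator_scores)

print(f"ratio: {ratio:.2f}%")
print(f"deviation: {deviation:.2f}%")
print(f"overall (approx.): {overall:.2f}%")
```

Note that the two evaluators report slightly different totals (467 words versus 469 wordsTotal), so the ratio here uses the count from the ratio evaluator's own meta data.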