NC Bench
Run: 2026-02-18-01KHQ948K08ARVQEKH0Q68ARE4
Run Details
Model: Claude Sonnet 4.6
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 17.5s
Cost: $0.011205
Provider: Anthropic
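The Cost and Duration above can be cross-checked against the native token counts recorded in the Meta section further down (native_tokens_prompt 75, native_tokens_completion 732, generation_time 17466). The sketch below is only a plausibility check. It assumes prices of $3 per million input tokens and $15 per million output tokens, which are not stated on this page, and it assumes generation_time is in milliseconds, which is consistent with the 17.5s duration.

```python
# Assumed prices (NOT stated on the page), in USD per million tokens.
INPUT_USD_PER_MTOK = 3
OUTPUT_USD_PER_MTOK = 15

# Values copied from the run metadata.
native_tokens_prompt = 75
native_tokens_completion = 732
generation_time_ms = 17466  # assumed to be milliseconds

# tokens * (USD per million tokens) gives whole micro-dollars,
# so the sum stays exact until the final division.
cost_micro_usd = (native_tokens_prompt * INPUT_USD_PER_MTOK
                  + native_tokens_completion * OUTPUT_USD_PER_MTOK)
cost_usd = cost_micro_usd / 1_000_000
print(f"estimated cost: ${cost_usd:.6f}")  # estimated cost: $0.011205

# Rough output throughput over the whole generation.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under those assumed prices the estimate matches the reported cost, which suggests billing was based on the native token counts rather than tokens_prompt and tokens_completion.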
Evaluation summary
Score      Evaluator
90.48%     Matches word count
4.21%      Dialogue to Total Word Ratio
47.3475%   Overall
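The Overall figure is close to the unweighted mean of the two evaluator scores. The page does not say how scores are aggregated, so the sketch below treats the mean as an assumption and only checks how near it lands. The small remaining gap would be consistent with the displayed scores being rounded.

```python
# Assumption: "Overall" is the unweighted mean of the evaluator scores.
# The page does not document the aggregation; this is a plausibility check.
scores = {
    "Matches word count": 90.48,
    "Dialogue to Total Word Ratio": 4.21,
}
reported_overall = 47.3475

mean_score = sum(scores.values()) / len(scores)
gap = abs(mean_score - reported_overall)

print(f"mean of displayed scores: {mean_score:.3f}%")
print(f"gap to reported overall:  {gap:.4f} points")
```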
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The rain hammered against the windshield of Bill's rusty van as he spread the hand-drawn map across the steering wheel. Molly sat in the passenger seat eating a bag of cheese puffs, leaving radioactive orange fingerprints on everything she touched. "Okay, so the party starts at two," Bill said, tapping the map with a highlighter he'd already accidentally uncapped with his mouth, leaving a yellow streak across his lower lip. "We go in through the back garden, grab the goods, and we're out before they finish singing happy birthday." "Have you considered," Molly said slowly, crunching loudly, "that we are adults." "Obviously I've considered that, Molly." "Adults who are stealing from a six year old's birthday party." "She's turning seven, actually. I did my research." Molly stared at him for a long moment. A cheese puff fell from her fingers onto her lap. "That makes it worse, Bill. That genuinely makes it considerably worse." Bill waved his hand dismissively and returned to the map. He had drawn the neighboring house slightly too large and accidentally made it look like a medieval castle, which he felt added a certain dramatic flair. The birthday girl's house was represented by a small rectangle with a flag on top, which he also felt was appropriate. "The presents will be stacked near the gift table. Standard formation," he said. "Standard formation," Molly repeated flatly. "You think six year olds have a standard formation." "Seven year olds." "Bill." "The intelligence reports suggest the presents will be considerable this year," he continued, completely ignoring her expression. "Her grandmother is flying in from Portland. Portland grandmothers are notoriously generous." Molly finished the bag of cheese puffs and crumpled it loudly. She looked out at the residential street, watching a squirrel make objectively better life choices than she currently was. "What exactly are we planning to do with a bunch of children's toys?" she asked. "Sell them." "To whom?" 
"Other children." "Other children don't have money, Bill." He paused. This was, he had to admit, a flaw in the business model he had not fully addressed during the planning phase. He smoothed the map unnecessarily. "Their parents then," he said. "So your plan is to steal presents from one child and sell them to the parents of other children." "When you say it like that it sounds chaotic." "That's because it is chaotic!" Molly said, finally raising her voice enough to startle a pigeon from the telephone wire above them. "It's completely, spectacularly chaotic!" Bill rolled up his map with considerable dignity. He had spent four hours on that map and he wasn't going to let Molly's reasonable objections diminish his pride in it. "Are you in or not?" he asked. Molly looked at the house. She looked at Bill. She looked at her orange-stained fingers. "I want thirty percent," she said. "Done." "And you're buying me more cheese puffs." "Absolutely not." "Twenty-five percent then." "Deal."
created_at: "2026-02-18T02:26:54.587471+00:00"
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 400
moderation_latency: (null)
generation_time: 17466
tokens_prompt: 77
tokens_completion: 802
native_tokens_prompt: 75
native_tokens_completion: 732
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.011205
router: (null)
provider_responses[0]:
id: "msg_011UX8tmhivnVzZaEmHK8xBn"
status: 200
is_byok: false
latency: 400
endpoint_id: "6d408764-7dd7-4626-bb87-a6cc1589bc86"
provider_name: "Anthropic"
model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
api_type: "completions"
id: "gen-1771381596-46TSuChUgWpG8NmRQWJ0"
upstream_id: "msg_011UX8tmhivnVzZaEmHK8xBn"
total_cost: 0.011205
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
Evaluation details
Result     Evaluator                      Details                           Meta Data
90.48%     Matches word count             n/a                               neededClean: false, words: 490
4.21%      Dialogue to Total Word Ratio   Ratio: 42.04%, Deviation: 7.96%   neededClean: false, wordsTotal: 490, wordsDialogue: 206
47.3475%   Overall
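The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator follow directly from its word counts (wordsDialogue 206, wordsTotal 490) and the 50% target in the scenario. The sketch below reproduces that arithmetic. The counting helpers are hypothetical: they assume dialogue is any text between straight double quotes and that words are whitespace-separated, which may differ from what the evaluator actually does. How the deviation is turned into the 4.21% score is not shown on the page, so it is not modeled here.

```python
import re

def count_words(text: str) -> int:
    """Count whitespace-separated words (assumed tokenization)."""
    return len(text.split())

def count_dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue rule)."""
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in quoted_spans)

def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = 0.5) -> tuple[float, float]:
    """Return the dialogue share and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(target - ratio)

# Counts taken from the evaluator's reported word counts.
ratio, deviation = ratio_and_deviation(206, 490)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 42.04%, Deviation: 7.96%

# Tiny illustration of the hypothetical counting helpers.
sample = 'He said, "Hello there," and left. "Bye."'
print(count_dialogue_words(sample), count_words(sample))  # 3 7
```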