NC Bench
Run: 2026-02-18-01KHQ9153CV41552HM4TV1MS5W
Run Details
Model: Claude Sonnet 4.6
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 16.0s
Cost: $0.011730
Provider: Amazon Bedrock
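The cost and duration figures can be sanity-checked against the native token counts and timings reported in this run's metadata (75 prompt tokens, 767 completion tokens, `generation_time` of 15585). The sketch below is an illustration, not the benchmark's own accounting: the per-million-token prices are assumed values chosen because they reproduce the reported total, the reading of `generation_time` as milliseconds is also an assumption, and the function names are hypothetical.

```python
# Assumed prices in USD per million tokens. These are NOT stated on the run
# page; they are illustrative values that happen to reproduce the reported cost.
ASSUMED_INPUT_PRICE_PER_MTOK = 3.00
ASSUMED_OUTPUT_PRICE_PER_MTOK = 15.00


def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  input_price: float, output_price: float) -> float:
    """Linear token pricing: tokens times price per million tokens."""
    return (prompt_tokens * input_price
            + completion_tokens * output_price) / 1_000_000


def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Throughput, assuming generation_time is expressed in milliseconds."""
    return completion_tokens / (generation_time_ms / 1000)


# Values copied from the run metadata (native token counts and generation_time).
cost = estimate_cost(75, 767,
                     ASSUMED_INPUT_PRICE_PER_MTOK,
                     ASSUMED_OUTPUT_PRICE_PER_MTOK)
throughput = tokens_per_second(767, 15585)

print(f"estimated cost: ${cost:.6f}")
print(f"throughput: {throughput:.1f} tokens/s")
```

Under these assumptions the estimate equals the reported $0.011730, and the throughput comes out at roughly 49 completion tokens per second.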
Evaluation summary
Score | Evaluator | Details Link
20.19% | Matches word count | (details)
83.88% | Dialogue to Total Word Ratio | (details)
52.0349% | Overall |
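The overall score is consistent with an unweighted mean of the two evaluator scores. This is an inference from the numbers shown, not a documented formula, and the sketch below uses the rounded percentages as displayed.

```python
from statistics import fmean

# Evaluator scores as displayed on the page (rounded to two decimals).
scores = {
    "Matches word count": 20.19,
    "Dialogue to Total Word Ratio": 83.88,
}

# Assumption: the overall score is the plain arithmetic mean of the evaluators.
overall = fmean(scores.values())
print(f"overall: {overall:.3f}%")
```

The mean of the rounded values is about 52.035%, against a reported 52.0349%. The small gap is what one would expect if the page averages unrounded evaluator scores.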
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The alley behind Sunshine Party Rentals smelled like old balloons and poor decisions. Bill crouched behind a dumpster, holding a hand-drawn map covered in crayon because that was the only paper he could find. Molly stood over him, arms crossed, deeply regretting every choice that had led to this moment. "Okay, walk me through it again," Molly said. "Right. So the birthday party starts at two. We go in at three, when the cake comes out. Everyone looks at the cake. Nobody looks at the presents." "And you think seven-year-olds are sophisticated enough to be distracted by cake?" "Molly, adults are distracted by cake. Seven-year-olds basically worship cake. Cake is their religion." "That is the smartest thing you have ever said, and I find that genuinely upsetting." Bill smoothed out the crayon map with considerable pride. "I drew the bouncy castle here." "Why is the bouncy castle relevant?" "It's not. I just thought it looked cool." Molly snatched the map and studied it. A purple scribble dominated the center of what appeared to be a rectangle. She turned it sideways. Then upside down. "Bill, which way is north?" "It's a birthday party, not a military operation." "Every operation needs a north." "Fine, the bouncy castle is north. Happy?" "Ecstatic." She handed it back. "What about the dog? You said there was a dog." "Biscuit. Golden retriever. Extremely friendly, total liability." "Meaning what?" "Meaning if Biscuit sees us he will follow us everywhere and also probably lick the presents." "Can we use Biscuit as a distraction?" Bill's eyes went wide. "That is genuinely brilliant." "I have my moments." "We throw Biscuit a tennis ball toward the bouncy castle, kids chase Biscuit, Biscuit causes chaos, we grab the presents and walk calmly to the van." "The van that has 'Totally Normal Florist' written on the side." "People don't question flowers, Molly." "People absolutely question flowers arriving at a child's birthday party with no flowers inside." 
"We could put one flower in there." "One flower." "A nice one. Something cheerful. A sunflower, maybe." Molly stared at him for a long moment. A pigeon landed nearby, looked at them both, and flew away as if it had better options. "Let me just confirm," Molly said slowly, "that we are stealing presents from a seven-year-old named, according to your crayon map, Princess Butterscotch." "That's what it says on the banner." "That may not be her legal name, Bill." "Legally she is Emma Henderson of 14 Primrose Close, but spiritually she is Princess Butterscotch and you have to respect the branding." "I don't think I have to respect anything about this situation." "Are you in or not?" Molly looked at the map, at the van, at the single sunflower Bill had produced from somewhere inside his jacket like a magician with terrible priorities. "Get the tennis ball," she said.
created_at: "2026-02-18T02:25:11.614576+00:00" (2/18/2026, 2:25:11 AM)
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 718
moderation_latency: 140
generation_time: 15585
tokens_prompt: 77
tokens_completion: 776
native_tokens_prompt: 75
native_tokens_completion: 767
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.01173
router: (null)
provider_responses
0
id: "f28aea1a-4967-4f34-87d6-5190b3abafc9"
status: 200
is_byok: false
latency: 718
endpoint_id: "83d802cb-44f2-45a0-a9d5-cb9bf79777c8"
provider_name: "Amazon Bedrock"
model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
api_type: "completions"
id: "gen-1771381495-QsUNC89rQJQSPv5ZmJum"
upstream_id: "f28aea1a-4967-4f34-87d6-5190b3abafc9"
total_cost: 0.01173
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
20.19% | Matches word count | n/a | neededClean: false; words: 480
83.88% | Dialogue to Total Word Ratio | Ratio: 68.13%, Deviation: 1.87% | neededClean: false; wordsTotal: 480; wordsDialogue: 327
52.0349% | Overall | |
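The ratio and deviation reported for the dialogue evaluator follow directly from the word counts in its metadata. The sketch below shows one plausible way to obtain such counts and the arithmetic that follows. The quote-matching rule and the function names are assumptions for illustration, not the evaluator's actual implementation, and the mapping from deviation to the 83.88% score is not shown on the page, so it is not reproduced here.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: "70% dialogue"


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside straight or curly double quotes (assumed rule)."""
    spans = re.findall(r'["\u201c]([^"\u201d]*)["\u201d]', text)
    return sum(len(span.split()) for span in spans)


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of all words that are dialogue words."""
    return words_dialogue / words_total


# Tiny illustration of the counting rule on one line from the response.
sample = '"Get the tennis ball," she said.'
sample_total = count_words(sample)
sample_dialogue = count_dialogue_words(sample)

# Counts copied from the evaluator metadata: wordsDialogue and wordsTotal.
ratio = dialogue_ratio(327, 480)
deviation = abs(TARGET_RATIO - ratio)

print(f"sample: {sample_dialogue} of {sample_total} words are dialogue")
print(f"ratio: {ratio:.5f}, deviation from target: {deviation:.5f}")
```

The exact ratio is 327 / 480 = 0.68125, displayed as 68.13%. The exact deviation from the 70% target is 0.01875; the displayed 1.87% matches subtracting the rounded 68.13% from 70%.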