NC Bench
Run: 2026-02-24-01KJ8WZGQH617VHGDCRJDVF6DT
Run Details
Model: Claude Sonnet 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 18.9s
Cost: $0.013176
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator
14.30% | Matches word count
93.16% | Dialogue to Total Word Ratio
53.7330% | Overall
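The page does not say how the "Dialogue to Total Word Ratio" evaluator separates dialogue from narration. The following is a minimal sketch of one plausible approach, assuming whitespace-delimited words and dialogue marked by straight double quotes. The function name `dialogue_stats` and the sample sentence selection are illustrative and not part of NC Bench.

```python
import re


def dialogue_stats(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words).

    Assumption: a word is any whitespace-delimited token, and dialogue is
    whatever sits between a pair of straight double quotes.
    """
    total_words = len(text.split())
    # Collect every span enclosed by a pair of straight double quotes.
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    return dialogue_words, total_words


# One narration sentence followed by one line of dialogue from the response.
sample = 'Bill squinted at the map. "Is that a bouncy castle or a lumpy rectangle?"'
dialogue, total = dialogue_stats(sample)
print(f"{dialogue} of {total} words are dialogue ({dialogue / total:.2%})")
```

A real evaluator would also need to decide how to treat curly quotes, nested quotes, and hyphenated words, none of which this sketch handles.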
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crudely drawn map across the hood of their rusting van and stabbed her finger at a pink scrawled rectangle. "The presents will be here. In the living room. Right next to the bouncy castle." Bill squinted at the map. "Is that a bouncy castle or a lumpy rectangle?" "It's a bouncy castle, Bill." "It looks like a lumpy rectangle." "Everything I draw looks like a lumpy rectangle. Focus." Molly pulled out a laminated sheet covered in bullet points. Bill often wondered if her laminator was the most expensive thing she owned. "We go in at three fifteen. That's peak musical chairs. Every kid will be distracted." "What about adults?" "There are four parents. Two are helping with the pass the parcel. One is in the kitchen crying. Classic party stress. The fourth one is Dave." "Who's Dave?" "Doesn't matter. Dave is never a problem." Bill scratched his chin. "What if the presents are heavy? My back has been acting up since the warehouse job." "That was a bouncy castle too, Bill." "That was a trampoline park." "Same energy." Molly tapped the map again. "We use the wheelie bin from next door. We load up, roll to the van, we're gone before anyone sings happy birthday." "What if they sing it early?" "Children's parties don't sing it early. There's an order to these things. There's a structure. It's one of the last remaining structures in civilized society." Bill looked genuinely moved. "That's beautiful." "Thank you." Molly cleared her throat. "Now. Disguises. I'm going as a face painter." "You can't paint faces." "I can paint one face." "Which face?" "A cat." "You can only paint a cat face?" "Everyone loves a cat face, Bill. Nobody's going to complain about a cat face." Bill pointed at himself. "And me?" Molly hesitated just long enough for him to feel insulted. "You're the balloon animal guy." "I can't make balloon animals either." "You can make one." Bill closed his eyes. "A sword." "A sword. Exactly. Kids love swords." 
Molly rolled up the map with the practiced efficiency of someone who had done this far too many times. "In and out in twelve minutes. We take the presents, we fence them at Gary's, we split the profit, and we disappear into the long afternoon." "This feels wrong," Bill said. "Heists always feel wrong at the beginning. That's how you know it's a heist." "We're stealing from a seven year old's birthday party." "We're redistributing gifts from a demographic with no income and zero appreciation for the resale market." Molly opened the van door and climbed in. "The kid has parents. The kid has a bouncy castle. The kid has Dave." "What IS Dave?" "Get in the van, Bill." Bill took one last look at the suburban street ahead, a place of cheerful balloons and distant shrieking laughter, and got in the van.
created_at: "2026-02-24T22:40:38.137Z" (2/24/2026, 10:40:38 PM)
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1088
moderation_latency: 203
generation_time: 18621
tokens_prompt: 77
tokens_completion: 867
native_tokens_prompt: 92
native_tokens_completion: 860
native_tokens_completion_images: (null)
native_tokens_reasoning: 113
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.013176
router: (null)
provider_responses: 0
endpoint_id: "83d802cb-44f2-45a0-a9d5-cb9bf79777c8"
id: "9df3c46f-5c22-44b2-94eb-c1104dd9b485"
is_byok: false
latency: 1088
model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
provider_name: "Amazon Bedrock"
status: 200
api_type: "completions"
id: "gen-1771972838-LU61uNkCJpfz2rqiPJrn"
upstream_id: "9df3c46f-5c22-44b2-94eb-c1104dd9b485"
total_cost: 0.013176
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
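The reported cost can be cross-checked against the native token counts. The sketch below assumes per-token rates of $3.00 per million input tokens and $15.00 per million output tokens. These rates are not stated anywhere on this page and are an assumption, though they reproduce the reported total_cost exactly. It also assumes reasoning tokens are billed as part of native_tokens_completion.

```python
# Assumed rates in USD per million tokens (not given on this page).
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00

# Token counts taken from the run metadata.
native_tokens_prompt = 92
native_tokens_completion = 860

cost = (
    native_tokens_prompt * INPUT_USD_PER_MTOK
    + native_tokens_completion * OUTPUT_USD_PER_MTOK
) / 1_000_000

print(f"${cost:.6f}")  # compare with the reported total_cost of 0.013176
```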
Evaluation details
Result | Evaluator | Details | Meta Data
14.30% | Matches word count | n/a | neededClean: false, words: 479
93.16% | Dialogue to Total Word Ratio | Ratio: 71.19%, Deviation: 1.19% | neededClean: false, wordsTotal: 479, wordsDialogue: 341
53.7330%
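The ratio and deviation figures follow directly from the word counts in the metadata, given the 70% target from the scenario. The sketch below also shows that the overall figure is consistent with an unweighted mean of the two evaluator scores. That aggregation rule is an assumption inferred from the numbers, and the formulas that turn a deviation or a word count into a per-evaluator score are not given on this page, so they are not reproduced here.

```python
TARGET_RATIO = 0.70  # from the scenario: 70% dialogue

# Word counts from the evaluator metadata.
words_total = 479
words_dialogue = 341

ratio = words_dialogue / words_total
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Per-evaluator scores as displayed (rounded to two decimals on the page).
word_count_score = 14.30
dialogue_ratio_score = 93.16

# Assumption: the overall score is the unweighted mean of the two scores.
overall = (word_count_score + dialogue_ratio_score) / 2
print(f"Overall: {overall:.2f}%")
```

Because the displayed scores are rounded, the mean matches the reported overall value only to two decimal places.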