NC Bench
Run: 2026-02-18-01KHQ91ECA9QPDNZGN36V1X779
Run Details
Model: Claude Sonnet 4.6
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 14.7s
Cost: $0.010860
Provider: Amazon Bedrock
Evaluation summary
Score     Evaluator
6.09%     Matches word count
2.92%     Dialogue to Total Word Ratio
4.5053%   Overall
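The Overall figure is consistent with an unweighted mean of the two evaluator scores. The sketch below checks that arithmetic. It assumes equal weights (the page does not state how the benchmark aggregates scores) and that the displayed per-evaluator scores are rounded to two decimals. `overall_score` is a hypothetical helper, not part of the benchmark.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores, in percent (assumed aggregation)."""
    if not scores:
        raise ValueError("at least one evaluator score is required")
    return sum(scores) / len(scores)


# Per-evaluator scores as displayed in the summary (percent).
evaluator_scores = [6.09, 2.92]
mean_score = overall_score(evaluator_scores)
print(f"mean of evaluator scores: {mean_score:.3f}%")
```

The result lands within rounding distance of the reported 4.5053%, which suggests the displayed 6.09% and 2.92% are truncated views of more precise values.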
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The rain hammered against the window of the van as Molly spread a hand-drawn map across the dashboard. Bill leaned in, squinting at what appeared to be a drawing done by someone who had never held a pencil before. "Okay, talk me through this," Bill said, pointing at a wobbly circle near the center of the page. "Is that the bouncy castle or a potato?" "That's the bouncy castle, obviously. See the little flag on top?" "That could be an antenna. This could be a satellite dish." "Bill, why would a six year old have a satellite dish at his birthday party?" "I don't know, maybe his parents are loaded." "His parents ARE loaded, that's the entire point of this operation." Bill sat back and folded his arms. "Right. Right. Sorry. Continue." Molly tapped the map with her finger. "So we go in through the garden gate at approximately two fifteen, which is when the magician starts his act." "Why the magician?" "Because every single child and adult in that garden will be watching a man in a cape try to pull a rabbit out of a hat. Nobody watches a magician and also looks at the gift table. It's physically impossible." "What if the magician is bad and people start wandering off?" "Then we hope the rabbit takes long enough to find." Bill rubbed his chin thoughtfully. "Okay. What are we wearing?" "I've got two sets of party entertainer uniforms. Face paint, big shoes, the works." "Absolutely not." "Bill." "I am not wearing clown shoes to a robbery." "They're not clown shoes, they're entertainer shoes, and they complete the disguise." "They're the same thing and you know it." "Would you rather wear a prison uniform in about four months?" Bill was quiet for a moment. "How big are the shoes?" "Enormous." "Fantastic." Molly rolled up the map and stuffed it into the glove box. "Now, the presents are piled on a table near the back fence. I counted at least forty boxes when I did the reconnaissance." "You went to the kid's garden?" 
"I told his mother I was from the council checking drainage." "That's brilliant, actually." "I know. Anyway, forty boxes, mostly unwrapped because the party hasn't started yet. We bag them up, we're over the fence, we're in the van, we're gone before anyone finishes singing happy birthday." "And then we what, sell them?" "We donate them." Bill stared at her. "We steal them and then donate them?" "To the children's hospital on Merchant Street. They've got nothing over there." "Molly." "What?" "We are stealing presents from a rich kid to give to sick kids." "Correct." "That is either the most noble or the most insane thing we have ever done." "Why can't it be both?" Bill looked out at the rain and smiled slowly. "You know what, hand me the clown shoes."
created_at: "2026-02-18T02:25:19.591458+00:00" (2/18/2026, 2:25:19 AM)
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 544
moderation_latency: 129
generation_time: 14470
tokens_prompt: 77
tokens_completion: 744
native_tokens_prompt: 75
native_tokens_completion: 709
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.01086
router: (null)
provider_responses
0
id: "6cc13e7f-1716-45a7-864a-4d6b018f1293"
status: 200
is_byok: false
latency: 543
endpoint_id: "83d802cb-44f2-45a0-a9d5-cb9bf79777c8"
provider_name: "Amazon Bedrock"
model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
api_type: "completions"
id: "gen-1771381504-1XB9kWJOUgIih82gI4yX"
upstream_id: "6cc13e7f-1716-45a7-864a-4d6b018f1293"
total_cost: 0.01086
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
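The token counts and timing in this metadata can be cross-checked against the reported cost. The sketch below rests on two assumptions that the page does not confirm: that `generation_time` is in milliseconds, and that billing uses per-million-token rates of $3.00 for input and $15.00 for output. Those rates are illustrative guesses. The helper names are hypothetical.

```python
# Values copied from the run metadata.
NATIVE_TOKENS_PROMPT = 75
NATIVE_TOKENS_COMPLETION = 709
GENERATION_TIME_MS = 14470  # ASSUMPTION: unit is milliseconds

# ASSUMED rates in USD per million tokens; not stated on this page.
ASSUMED_INPUT_USD_PER_MTOK = 3.00
ASSUMED_OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost_usd(prompt_tokens, completion_tokens,
                      input_rate=ASSUMED_INPUT_USD_PER_MTOK,
                      output_rate=ASSUMED_OUTPUT_USD_PER_MTOK):
    """Token-based cost estimate under the assumed per-million-token rates."""
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


def tokens_per_second(completion_tokens, generation_time_ms):
    """Average output throughput over the generation window."""
    return completion_tokens / (generation_time_ms / 1000)


cost = estimate_cost_usd(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION)
rate = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
print(f"estimated cost: ${cost:.6f}")
print(f"throughput: {rate:.1f} tokens/s")
```

Under these assumed rates the estimate matches the reported `total_cost` of 0.01086 when computed from the native token counts (75 and 709) rather than the normalized ones (77 and 744). The throughput works out to roughly 49 output tokens per second.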
Evaluation details
Result    Evaluator                      Details                           Meta Data
6.09%     Matches word count             n/a                               neededClean: false, words: 477
2.92%     Dialogue to Total Word Ratio   Ratio: 78.41%, Deviation: 8.41%   neededClean: false, wordsTotal: 477, wordsDialogue: 374
4.5053%   Overall
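The reported ratio and deviation follow directly from the word counts: 374 dialogue words out of 477 total, measured against the 70% target in the scenario. The sketch below reproduces that arithmetic and shows one simple way dialogue words could be counted. The counting rule (whitespace-separated tokens inside straight double quotes) and both function names are assumptions for illustration. The benchmark's actual tokenization and its mapping from deviation to score are not documented on this page.

```python
import re


def dialogue_stats(text):
    """Count words inside straight double quotes versus all words.

    Simplification: a word is a whitespace-separated token, and dialogue is
    any text between a pair of straight double quotes.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return dialogue, total


def ratio_and_deviation(words_dialogue, words_total, target_pct=70.0):
    """Dialogue share in percent and its absolute distance from the target."""
    ratio = 100.0 * words_dialogue / words_total
    return ratio, abs(ratio - target_pct)


# Counts reported in the evaluator metadata.
ratio, deviation = ratio_and_deviation(374, 477)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# Counting step on a short sample sentence.
sample = '"Is that the bouncy castle or a potato?" Bill asked.'
print(dialogue_stats(sample))
```

With the reported counts this prints a ratio of 78.41% and a deviation of 8.41%, matching the Details column. The sample sentence yields 8 dialogue words out of 10.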