NC Bench
Run: 2026-02-24-01KJ8WKJVQPCERW75HR9K3QPBX
Run Details
Model: Claude Sonnet 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 17.6s
Cost: $0.012666
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator
0.00% | Matches word count
35.33% | Dialogue to Total Word Ratio
17.6671% | Overall
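The overall figure appears to be the plain average of the two evaluator scores. The sketch below checks that arithmetic; treating the overall score as an unweighted mean is an assumption, since the page does not state how it is combined. The rounded inputs give 17.665, so the displayed 17.6671% suggests the dialogue-ratio score is stored with more precision than the 35.33% shown.

```python
# Assumption: the overall score is the unweighted mean of the evaluator scores.
# The page itself does not document the aggregation rule.
evaluator_scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 35.33,  # rounded value as displayed
}

overall_from_rounded = sum(evaluator_scores.values()) / len(evaluator_scores)

# Working backwards from the displayed overall (17.6671%) gives the
# unrounded dialogue-ratio score that an unweighted mean would imply.
displayed_overall = 17.6671
implied_dialogue_score = displayed_overall * 2 - evaluator_scores["Matches word count"]

print(f"mean of displayed scores: {overall_from_rounded:.4f}%")
print(f"implied unrounded dialogue score: {implied_dialogue_score:.4f}%")
```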
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly unrolled the crayon-drawn map across the hood of the van and stabbed her finger at a lopsided rectangle labeled "BOUNCY CASTLE." "Entry point," she said. Bill squinted at the map. "Is that a drawing of a dog or the security guard?" "That's the birthday boy. Focus." "He has a tail." "He has a cape. It's a superhero party, Bill. Keep up." Molly pulled her black turtleneck up over her chin even though it was eighty-two degrees outside. "Now, the presents are stacked in the northeast corner of the backyard. We go in during the piñata segment. Maximum chaos, minimum supervision." Bill nodded slowly. "What are we talking about? Loot-wise." "Three LEGO sets, two remote-control cars, what I'm told is a very premium art kit, and a stuffed giraffe so large it has its own zip code." "I don't want the giraffe." "Nobody wants the giraffe, Bill. We take the giraffe anyway. We leave the giraffe, someone triangulates our whole operation." Bill leaned against the van and crossed his arms. "And the cake? Are we taking the cake?" Molly stared at him for a long moment. "Why would we take the cake?" "It smells incredible from here." "We are not taking the cake. We are professionals." She paused. "We might take two slices. But that is strictly off the books." "What about the party favors?" "What about them?" "Little bags. Candy. Tiny plastic toys. Could add up." "Could you hear yourself right now?" Molly rolled up the map. "We are not looting the favor bags from a seven-year-old's birthday party." "We're stealing his presents." "That is entirely different and I don't have time to explain the moral architecture of this operation to you right now." Bill picked up his clipboard, which had a cartoon balloon sticker on it left over from a previous job. "What's my cover identity?" "You're an uncle." "Which uncle?" "A vague one. Everybody has a vague uncle. You came from out of town. You can't remember how you're related. You brought no gift because you forgot and you feel terrible." 
"That's very believable, actually." "I know. I based it on you." Molly checked her watch. "Piñata goes up in eleven minutes. We park on Sycamore, we blend in near the folding table, and when the blindfold goes on that kid, we move fast and we move quiet." "And if someone spots us?" "You cry." "I'm not going to cry at a child's birthday party, Molly." "Then you laugh. Loudly. Like one of those uncles." Bill considered this. "I can do that." "Good." Molly slid on a pair of enormous novelty sunglasses that somehow made her look more suspicious than before. "Let's go ruin a beautiful day."
created_at: "2026-02-24T22:34:07.104Z" (2/24/2026, 10:34:07 PM)
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 692
moderation_latency: 223
generation_time: 17272
tokens_prompt: 77
tokens_completion: 804
native_tokens_prompt: 92
native_tokens_completion: 826
native_tokens_completion_images: (null)
native_tokens_reasoning: 101
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.012666
router: (null)
provider_responses: 0
endpoint_id: "83d802cb-44f2-45a0-a9d5-cb9bf79777c8"
id: "c4bf5a6a-0ab1-4f27-abb1-c3c300f9910a"
is_byok: false
latency: 689
model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
provider_name: "Amazon Bedrock"
status: 200
api_type: "completions"
id: "gen-1771972447-wvqCxh8dfDNtvofylPYe"
upstream_id: "c4bf5a6a-0ab1-4f27-abb1-c3c300f9910a"
total_cost: 0.012666
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
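The reported cost can be cross-checked against the native token counts. The sketch below assumes per-token rates of $3.00 per million input tokens and $15.00 per million output tokens; those rates are not stated anywhere on this page and are used only as an illustration. The function name `estimate_cost_usd` is likewise hypothetical. Under those assumed rates, the native counts (92 prompt, 826 completion) reproduce the recorded total_cost of 0.012666.

```python
# Token counts taken from the run metadata (native_tokens_* fields).
NATIVE_TOKENS_PROMPT = 92
NATIVE_TOKENS_COMPLETION = 826

# Assumed rates in USD per million tokens. These are NOT given on the page.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate request cost from token counts and the assumed rates."""
    micro_usd = (prompt_tokens * INPUT_USD_PER_MTOK
                 + completion_tokens * OUTPUT_USD_PER_MTOK)
    return micro_usd / 1_000_000


estimated = estimate_cost_usd(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION)
print(f"estimated cost: ${estimated:.6f}")
```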
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 450
35.33% | Dialogue to Total Word Ratio | Ratio: 74.56%, Deviation: 4.56% | neededClean: false, wordsTotal: 456, wordsDialogue: 340
17.6671%
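The ratio and deviation in the details follow directly from the reported word counts and the 70% target in the scenario. The sketch below reproduces that arithmetic and adds a rough dialogue counter for illustration. The counter (`count_dialogue_words`) is a hypothetical helper that treats text inside straight double quotes as dialogue; the evaluator's actual tokenization is not shown on the page, and the two evaluators report different totals (450 and 456 words), so they may count words differently. How the 4.56-point deviation maps to the 35.33% score is also not documented here, so it is not reproduced.

```python
import re

TARGET_DIALOGUE_PERCENT = 70.0  # from the scenario: "70% dialogue"


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words) for a passage.

    Hypothetical counting rule: whitespace-separated tokens, with anything
    between straight double quotes treated as dialogue.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    total_words = len(text.split())
    return dialogue_words, total_words


def dialogue_ratio_percent(dialogue_words: int, total_words: int) -> float:
    """Dialogue share of the passage, as a percentage."""
    return 100.0 * dialogue_words / total_words


# Counts reported by the evaluator for this run.
ratio = dialogue_ratio_percent(340, 456)
deviation = abs(ratio - TARGET_DIALOGUE_PERCENT)
print(f"ratio: {ratio:.2f}%, deviation: {deviation:.2f}%")

# The counter applied to one exchange from the response.
sample = 'Bill squinted at the map. "Is that a drawing of a dog or the security guard?"'
sample_dialogue, sample_total = count_dialogue_words(sample)
print(sample_dialogue, sample_total)
```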