NC Bench
Run: 2026-07-01-01KWE8X7C4NJVAYZTTKTTK06KA
Run Details
Model: Claude Sonnet 5 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 18.3s
Cost: $0.011936
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator
27.17% | Matches word count
0.93% | Dialogue to Total Word Ratio
14.0483% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The back of the van smelled like old pizza and desperation. Molly spread the blueprint across her knees—a crayon drawing of a house, stolen from the target's recycling bin. "Bill, this is a five-year-old's drawing of his own house." "Best intel we could get. Kid's got real talent, actually. Look how he did the chimney." "There's a stick figure labeled 'Daddy has a gun.'" "Occupational hazard. We adapt." Molly rubbed her temples. "Remind me why we're robbing a seven-year-old's birthday party." "Because, Moll, nobody suspects the presents at a kid's party. You know how much loot is in there? Grandma money. Uncle guilt money. That weird neighbor who overcompensates because he never has kids over money." "It's a Paw Patrol themed party." "Exactly! Who steals from Paw Patrol? Nobody. That's our cover." "Our cover is a dog cartoon." "Our cover is *innocence*, Molly. Nobody expects a heist at a bouncy castle." Molly tapped the map. "Fine. Entry point?" "Through the side gate, past the piñata station, into the gift table zone. In and out in four minutes." "There's a clown, Bill." "I've handled clowns." "You cried at a clown once. In Reno." "That clown had a knife, Molly. A *real* one. This is a birthday clown. Different breed." "They're all the same breed. They multiply in the dark." Bill ignored her, unrolling a second sheet—an actual napkin with a getaway route sketched in ketchup. "We go in disguised as party entertainment. I already ordered a costume." "What costume?" "Chuck E. Cheese." "That's copyrighted, Bill." "It's *inspired by*. Slightly different ears." Molly closed her eyes. "And what am I supposed to be?" "Balloon artist." "I don't know how to make balloon animals." "Nobody checks, Molly. You just twist and look confident. Confidence is ninety percent of crime." "What's the other ten percent?" "Escape routes and a getaway car that doesn't smell like a locker room." "That's the van's natural smell now. It's part of its personality." 
Bill grinned, holding up a rubber chicken. "Distraction device." "That's a rubber chicken." "It's a *multi-purpose* rubber chicken. Kids love chickens. While they're laughing, we grab the loot." "You want to rob children using a squeaky toy." "Molly, I want to rob children using *psychological misdirection*. The chicken's just the delivery system." She sighed, staring at the ceiling of the van like it held answers. "If we get caught, I'm blaming you." "If we get caught, we'll be arrested by an officer wearing a Paw Patrol badge sticker. I think we'll manage." "And if there's a piñata malfunction?" "There's always a piñata malfunction. That's when we strike." Molly looked at him for a long moment. "We're really doing this." "We're really doing this." "For the record, I hate every part of this plan." Bill smiled, tucking the rubber chicken under his arm like a briefcase. "That's the spirit, partner. Let's go rob some six-year-olds."
created_at: "2026-07-01T07:22:16.336Z" (7/1/2026, 7:22:16 AM)
model: "anthropic/claude-sonnet-5-20260630"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2703
moderation_latency: (null)
generation_time: 18233
tokens_prompt: 77
tokens_completion: 854
native_tokens_prompt: 93
native_tokens_completion: 1175
native_tokens_completion_images: (null)
native_tokens_reasoning: 48
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.011936
router: (null)
provider_responses: 0
endpoint_id: "37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id: "msg_01LncJyp85ZNYox1sYHk5DuH"
is_byok: false
latency: 1437
model_permaslug: "anthropic/claude-sonnet-5-20260630"
provider_name: "Amazon Bedrock"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer: (null)
request_id: "req-1782890536-Wj4YEMZjVSFR9sNCgSC9"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1782890536-59ZGl7zQCYngv3CZPclO"
upstream_id: "msg_01LncJyp85ZNYox1sYHk5DuH"
total_cost: 0.011936
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
data_region: "global"
Evaluation details
Result | Evaluator | Details | Meta Data
27.17% | Matches word count | n/a | neededClean: false, words: 481
0.93% | Dialogue to Total Word Ratio | Ratio: 79.67%, Deviation: 9.67% | neededClean: false, wordsTotal: 487, wordsDialogue: 388
14.0483% | Overall
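
The ratio and deviation reported by the Dialogue to Total Word Ratio evaluator can be reproduced from its own metadata (wordsDialogue 388, wordsTotal 487) and the 70% target in the scenario. The sketch below is illustrative only: the function names are hypothetical, and the counting rules (whitespace-separated words, dialogue taken as text between straight double quotes) are assumptions. The benchmark's real counting rules are not shown on this page, and the two evaluators report different totals (481 vs 487), so a simple rule like this may not match either exactly. The mapping from deviation to the 0.93% score is also not shown and is not reproduced here.

```python
import re

# Target taken from the scenario: "70% words of dialogue".
TARGET_RATIO = 0.70


def count_words(text: str) -> int:
    """Count whitespace-separated words (assumed rule, not the benchmark's)."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed definition of dialogue)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_RATIO) -> tuple[float, float]:
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Values copied from the evaluator's Meta Data column.
ratio, deviation = ratio_and_deviation(388, 487)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 79.67%, Deviation: 9.67%"
```

The Overall figure is close to the plain mean of the two displayed scores ((27.17 + 0.93) / 2 = 14.05), but the page does not state how the scores are aggregated.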