NC Bench
Run: 2026-07-01-01KWE9TS46VAE6KTMCNHV73X63
Run Details
Model: Claude Sonnet 5 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 17.4s
Cost: $0.010916
Provider: Amazon Bedrock
Evaluation summary
Score      Evaluator
3.62%      Matches word count
0.00%      Dialogue to Total Word Ratio
1.8117%    Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
The rain hammered against the windshield of the stolen minivan—a beige Toyota Sienna that screamed "soccer mom" and not "criminal mastermind," which was exactly the point. Molly sat in the passenger seat, studying a crumpled piece of paper covered in crayon-like scribbles that were, in fact, her own handwriting. Bill adjusted the fake mustache in the rearview mirror for the fourth time. It kept sliding off center, giving him the appearance of a man having a minor stroke. "Explain to me again," Bill said, "why we're robbing a seven-year-old's birthday party." "Because," Molly said, tapping the paper with authority, "Tommy Whitmore's parents are loaded, and this kid apparently asked for—and received—an actual PlayStation, a drone, two hundred dollars in gift cards, and something called a 'hoverboard,' which I still think is witchcraft." "So we're stealing from children now. Real proud moment for us." "We're not stealing from children, Bill. We're stealing from children's parents, who happen to be storing said loot in a poorly guarded pile on a folding table." Bill considered this distinction with the seriousness of a man deciding whether a taco counts as a sandwich. "Fine. But I still don't like the clown costume." "You picked the clown costume." "I picked it because you said the theme was 'circus,' and now I find out it's actually 'dinosaurs,' and I look like I wandered into the wrong birthday entirely." "You look like a very sad, very confused balloon animal. It's perfect. Nobody questions a sad clown at a party. They just feel bad for him and give him cake." Bill sighed, the kind of sigh that carried the weight of every terrible decision that had led him to this moment—divorced, broke, and about to burglarize a bounce house. "And what's your disguise, exactly?" Molly held up a name tag that read HELLO MY NAME IS: AUNT DEBBIE. "I'm Aunt Debbie. Nobody ever questions Aunt Debbie. Every family has one. 
She shows up late, drinks too much wine, and nobody remembers if she's actually related to anyone." "That's not a disguise, that's a personality you already have." "Focus, Bill. The plan is simple. We go in, we blend, we grab the presents during the piñata distraction, we leave. In and out in fifteen minutes." "And if someone asks why Aunt Debbie is loading a hoverboard into a minivan?" "Then Aunt Debbie says she's taking it to get 'gift wrapped properly,' because apparently these parents wrapped a four-hundred-dollar drone in newspaper like savages." Bill stared at her for a long moment, mustache drooping tragically. "We're going to prison because of a hoverboard." "We're going to Cabo because of a hoverboard. Focus on the positives." Molly snapped a rubber nose onto Bill's face with more force than necessary. "Now get out there and make some balloon animals, Debbie's got a party to crash."
Meta
created_at: "2026-07-01T07:38:24.787Z" (7/1/2026, 7:38:24 AM)
model: "anthropic/claude-sonnet-5-20260630"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 3093
moderation_latency: (null)
generation_time: 17348
tokens_prompt: 77
tokens_completion: 790
native_tokens_prompt: 93
native_tokens_completion: 1073
native_tokens_completion_images: (null)
native_tokens_reasoning: 47
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.010916
router: (null)
provider_responses:
  0:
    endpoint_id: "37b16f45-eeae-45f3-9b85-2137f7c6d61b"
    id: "msg_01Wy2UnmvVsrTxpKjpG73Aai"
    is_byok: false
    latency: 1667
    model_permaslug: "anthropic/claude-sonnet-5-20260630"
    provider_name: "Amazon Bedrock"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer: (null)
request_id: "req-1782891504-x7CrVy8yp0ReHLf5kGaD"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1782891504-LsoHgLoK4rlZiziK4MOo"
upstream_id: "msg_01Wy2UnmvVsrTxpKjpG73Aai"
total_cost: 0.010916
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
data_region: "global"
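The token and timing fields above can be combined into a rough throughput figure. The sketch below assumes that generation_time is reported in milliseconds (17348 is consistent with the 17.4s Duration shown in Run Details) and that native_tokens_completion is the provider's own count of output tokens. The function name is hypothetical and not part of NC Bench.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Return output tokens per second; assumes the time is given in milliseconds."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000)


# Values copied from the Meta fields of this run.
native_tokens_completion = 1073
generation_time_ms = 17348  # assumed unit: milliseconds

throughput = tokens_per_second(native_tokens_completion, generation_time_ms)
print(f"{throughput:.2f} tokens/s")
```

The figure is only as reliable as the unit assumption. The report does not say whether generation_time includes the initial latency.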
Evaluation details
Result     Evaluator                       Details                             Meta Data
3.62%      Matches word count              n/a                                 neededClean: false, words: 476
0.00%      Dialogue to Total Word Ratio    Ratio: 64.58%, Deviation: 34.58%    neededClean: false, wordsTotal: 480, wordsDialogue: 310
1.8117%    Overall
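The reported ratio and deviation follow from the word counts in the Meta Data column and the 30% target in the scenario. The sketch below reproduces that arithmetic. It also includes a hypothetical dialogue_ratio helper that counts words inside straight double quotes, which is only an assumption about how such an evaluator might count dialogue; the report does not describe the actual method, nor how the deviation maps to the 0.00% score.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Count whitespace-separated words inside straight double quotes.

    Hypothetical helper: the real evaluator's counting rules are not documented here.
    Returns (dialogue_words, total_words, ratio).
    """
    total = len(text.split())
    spoken = sum(len(quote.split()) for quote in re.findall(r'"([^"]*)"', text))
    return spoken, total, (spoken / total if total else 0.0)


def deviation_from_target(ratio: float, target: float = 0.30) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Values copied from the evaluation details of this run.
words_dialogue = 310
words_total = 480

ratio = words_dialogue / words_total
deviation = deviation_from_target(ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

With the run's counts this prints the same ratio and deviation as the Details column: the response contains roughly twice the requested share of dialogue.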