NC Bench
Run: 2026-07-01-01KWEAC0ESPJ91XFDRP6ZK70QX
Run Details
Model: Claude Sonnet 5 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 19.3s
Cost: $0.012616
Provider: Amazon Bedrock
Evaluation summary
Matches word count: 75.16%
Dialogue to Total Word Ratio: 10.03%
Overall: 42.5932%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The van smelled like old pizza and bad decisions, which was fitting, considering Bill had parked it three houses down from a birthday party decorated with more balloons than a parade float. Molly sat in the passenger seat, studying a crumpled printout of the house like it was a bank vault schematic. "Okay," she said, tapping the paper. "Presents are gonna be on the table by the bounce house. That's our target zone." Bill squinted at the paper, then at the house, then back at the paper like he was trying to solve a riddle. "Why are we robbing a six-year-old's birthday party again?" "Because, Bill, six-year-olds get the *best* loot. Nintendo Switches, Legos, those little robot dogs that cost more than my rent. Nobody robs kid parties. It's a blind spot in the criminal underworld." "That's because it's monstrous." "It's *efficient*." Bill sighed, rubbing his temples like he already regretted every choice that led him to this parked van. "There's gonna be like thirty screaming children in there." "Which is exactly why no one will notice two adults in dinosaur costumes." Bill turned slowly to look at the back seat, where two full-body inflatable T-Rex costumes sat folded like sad green laundry. "Absolutely not." "You said you wanted to blend in." "I wanted a *disguise*, Molly, not to become a birthday mascot!" "Same thing, different tail." Ten minutes later, Bill was wheezing inside the inflatable dinosaur suit, sweat already pooling somewhere he'd rather not think about, waddling toward the party with Molly beside him, her own T-Rex head slightly crooked so one eye stared permanently at the sky. "This was your idea," Bill grumbled through the mesh mouth opening. "Best idea I've ever had," Molly said proudly, though her voice echoed strangely from inside the costume, like a ghost narrating a nature documentary. They shuffled through the gate, past a woman in a "World's Best Mom" apron who beamed and clapped. "Oh my gosh, the dinosaurs are here! Right on time!" 
"Yep," Molly said cheerfully. "Rawr. For the kids. Very fun. Where are presents located, ma'am, for... entertainment purposes." The mom pointed toward a table buried under wrapped boxes and gift bags shaped like unicorns. Bill leaned toward Molly, his voice muffled. "This is way too easy." "Never say that during a heist, Bill. The universe is listening." As if summoned, a swarm of six-year-olds spotted them and shrieked with the kind of joy that could shatter glass, sprinting straight at the two "dinosaurs" with terrifying enthusiasm. "They think we're for hire!" Bill hissed, stumbling backward as three kids latched onto his tail. "Improvise!" Molly shouted, doing an awkward twirl that sent one child into fits of laughter and nearly toppled a folding table. Twenty minutes, four birthday songs, and one unfortunate limbo contest later, Bill collapsed onto a lawn chair, defeated, costume half-unzipped, revealing his sweat-drenched face. "We didn't steal a single present," he panted. Molly, sitting beside him with a juice box in one dinosaur claw, shrugged. "Yeah, but I got invited to do this again next month. Growth opportunity."
created_at
"2026-07-01T07:47:49.352Z"
7/1/2026, 7:47:49 AM
model
"anthropic/claude-sonnet-5-20260630"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
3097
moderation_latency
(null)
generation_time
19273
tokens_prompt
77
tokens_completion
900
native_tokens_prompt
93
native_tokens_completion
1243
native_tokens_completion_images
(null)
native_tokens_reasoning
73
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.012616
router
(null)
provider_responses
0
endpoint_id
"37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id
"msg_01RUmXT7M1wWo5R2Kv6Tm39k"
is_byok
false
latency
1669
model_permaslug
"anthropic/claude-sonnet-5-20260630"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer
(null)
request_id
"req-1782892069-VQJUJ3zQWGuZqKgDgdOu"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1782892069-sH0tI8mPbm3EMgCPKwEV"
upstream_id
"msg_01RUmXT7M1wWo5R2Kv6Tm39k"
total_cost
0.012616
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
data_region
"global"
Evaluation details
Matches word count: 75.16%
  Details: n/a
  Meta Data: neededClean: false, words: 513
Dialogue to Total Word Ratio: 10.03%
  Details: Ratio: 36.78%, Deviation: 6.78%
  Meta Data: neededClean: false, wordsTotal: 522, wordsDialogue: 192
Overall: 42.5932%
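The ratio and deviation reported by the dialogue evaluator can be reproduced from the wordsDialogue and wordsTotal counts. The sketch below is a minimal illustration, not NC Bench's implementation: the function names are hypothetical, and the 30% target is taken from the scenario description. The mapping from deviation to the 10.03% score is not shown on this page, so it is not reproduced here. Averaging the two rounded evaluator scores lands close to the Overall figure, which suggests (but does not confirm) an unweighted mean.

```python
# Hypothetical helper names; only the input numbers come from the run above.
TARGET_DIALOGUE_PCT = 30.0  # from the scenario "Write 500 words with 30% dialogue"


def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_pct(ratio_pct: float, target_pct: float = TARGET_DIALOGUE_PCT) -> float:
    """Absolute distance, in percentage points, from the target ratio."""
    return abs(ratio_pct - target_pct)


ratio = dialogue_ratio_pct(words_dialogue=192, words_total=522)
deviation = deviation_pct(ratio)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# Assumption: Overall is an unweighted mean of the two evaluator scores.
# The inputs here are the rounded scores shown on the page, so the estimate
# can differ from the reported Overall in the last digits.
overall_estimate = (75.16 + 10.03) / 2
print(f"Estimated overall: {overall_estimate:.4f}%")
```

Note that the two evaluators report different word totals (513 and 522), so they presumably count words differently; the ratio above uses the dialogue evaluator's own total.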