NC Bench
Run: 2026-07-01-01KWEAB2HGGNT32RZXNEP4G88H
Run Details
Model: Claude Sonnet 5 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 18.5s
Cost: $0.011206
Provider: Amazon Bedrock
Evaluation summary
| Score | Evaluator |
| --- | --- |
| 81.27% | Matches word count |
| 0.00% | Dialogue to Total Word Ratio |
| 40.6364% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The rain hammered against the window of the Rusty Spoon Diner, and Bill was already three refills into his coffee, spreading a crumpled napkin map across the table like it was the Mona Lisa. "So here's the plan," Bill said, jabbing a finger at a drawing that looked suspiciously like a stick figure holding a balloon. "We go in as clowns." Molly stared at him. "Clowns." "Clowns, Molly. Nobody suspects a clown." "Bill, everybody suspects a clown. That's the entire point of clowns. They're suspicious by design. Red nose, giant shoes, face paint hiding their intentions—it's basically a criminal uniform." "Fine. Magicians, then." "Worse. Magicians literally make things disappear. You want to rob a seven-year-old's birthday party dressed as the one profession synonymous with theft?" Bill slumped back into the vinyl booth, wounded. "You always shoot down my ideas." "I shoot down your bad ideas. It's a subcategory." Molly pulled the napkin toward her and flipped it over to the blank side. "Here's what we do. We're caterers. Nobody looks twice at caterers. We roll in with trays, we look bored, we complain about parking. Instant invisibility." "Caterers don't leave with presents, Molly. They leave with empty trays and a grudge against children." "That's why we bring a linen cart. Big rolling thing, skirted to the floor, looks like it's full of tablecloths and napkin rings. Really, it's full of birthday loot." She tapped the table. "Timmy turns eight, house is packed with sugar-drunk kids, parents are three glasses of chardonnay into pretending they like each other's spouses. Nobody's watching the linen cart." Bill considered this while conducting invisible calculations with a spoon. "There's gonna be a bounce house." "There's always a bounce house." "Bounce houses are loud. Generators. Big fans. Covers noise." "See, now you're thinking." "And the piñata," Bill added, warming up. "Piñata's a distraction event. Everybody's eyes go up when that thing breaks. 
Candy rains down, kids go feral, parents scramble to manage the feral children. That's our window." "Ten seconds, tops." "Ten seconds is a lifetime, Molly. I once boosted a Rolex in four." "You told me that story wasn't true." "It's mostly true." Molly rolled her eyes and slid out of the booth, tossing a few crumpled bills on the table. "Fine. Caterers, linen cart, piñata's our green light. But Bill—no balloon animals this time." "That was one incident." "You gave a kid a balloon sword and he stabbed the birthday cake." "He had *vision*." "He had a weapon, Bill. A weapon made of latex, at a party we were robbing, which drew attention directly to us." Bill folded the napkin carefully into his jacket pocket like it was a sacred document. "No balloon animals. Got it. What about balloon hats?" Molly was already walking toward the door. "We'll revisit hats." Outside, the rain kept falling, and somewhere across town, a bounce house was being inflated for its final, unsuspecting birthday.
created_at: "2026-07-01T07:47:18.717Z" (7/1/2026, 7:47:18 AM)
model: "anthropic/claude-sonnet-5-20260630"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1863
moderation_latency: (null)
generation_time: 18455
tokens_prompt: 77
tokens_completion: 795
native_tokens_prompt: 93
native_tokens_completion: 1102
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.011206
router: (null)
provider_responses:
  0:
    endpoint_id: "37b16f45-eeae-45f3-9b85-2137f7c6d61b"
    id: "msg_01TKWtndmgKJjJ3yh9PgiZQg"
    is_byok: false
    latency: 1863
    model_permaslug: "anthropic/claude-sonnet-5-20260630"
    provider_name: "Amazon Bedrock"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer: (null)
request_id: "req-1782892038-GgeAgCilESZZaOE5lVG2"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1782892038-wXl1W1rZV5XkMV1p812L"
upstream_id: "msg_01TKWtndmgKJjJ3yh9PgiZQg"
total_cost: 0.011206
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
data_region: "global"
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 81.27% | Matches word count | n/a | neededClean: false, words: 488 |
| 0.00% | Dialogue to Total Word Ratio | Ratio: 68.23%, Deviation: 38.23% | neededClean: false, wordsTotal: 491, wordsDialogue: 335 |
| 40.6364% | Overall | | |
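
The reported ratio, deviation, and overall figures can be reproduced from the counts above. The sketch below is only an illustration: the function names are hypothetical, the 30% target is taken from the scenario description, and treating the overall score as the plain mean of the two evaluator scores is an assumption inferred from the numbers on this page. How the evaluators turn a deviation into a percentage score (for example, why 488 words earns 81.27%) is not shown here and is not modelled.

```python
# Hypothetical helpers; NC Bench's actual evaluator code is not shown on this page.

def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the response, as a percentage."""
    return 100.0 * words_dialogue / words_total


def deviation_pct(ratio_pct: float, target_pct: float) -> float:
    """Absolute distance between the measured ratio and the target, in percentage points."""
    return abs(ratio_pct - target_pct)


def mean_score(scores: list[float]) -> float:
    """Assumed aggregation: unweighted mean of the evaluator scores."""
    return sum(scores) / len(scores)


# Counts from the "Dialogue to Total Word Ratio" row.
ratio = dialogue_ratio_pct(words_dialogue=335, words_total=491)
dev = deviation_pct(ratio, target_pct=30.0)  # 30% target comes from the scenario
print(f"Ratio: {ratio:.2f}%, Deviation: {dev:.2f}%")  # Ratio: 68.23%, Deviation: 38.23%

# Evaluator scores as displayed (81.27% is itself a rounded value).
overall = mean_score([81.27, 0.0])
print(f"Overall (assumed mean): {overall:.3f}%")  # Overall (assumed mean): 40.635%
```

The mean of the displayed scores, 40.635%, agrees with the reported 40.6364% to within the rounding of the 81.27% figure, which is consistent with (but does not prove) a simple average.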