NC Bench
Run: 2026-07-01-01KWEA0TSM5MB77V7XJCS9QEBQ
Run Details
Model: Claude Sonnet 5 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 19.3s
Cost: $0.012576
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator | Details Link
75.16% | Matches word count | (details)
0.03% | Dialogue to Total Word Ratio | (details)
37.5914% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The warehouse smelled like old cardboard and Bill's cheap cologne, which Molly had complained about every single day for the past three years. She spread the blueprint across a folding table—not blueprints, really, just a crayon drawing of a house that she'd bribed a seven-year-old to make. "This is our target," Molly said, tapping the drawing. "The Hendricksons. Timmy's turning eight. Bounce house, clown, the works." Bill squinted at the paper. "Why does the clown have three eyes?" "Kids can't draw, Bill." "That's disturbing." "Focus." Molly pulled out a second sheet, this one covered in more legitimate-looking notes. "Presents get piled on the dining room table until the birthday song. That's our window. Fourteen minutes, tops, while everyone sings and Timmy blows out candles." "Fourteen minutes to clear a table of Legos and Barbies," Bill said, rubbing his chin. "Feels tight." "We've robbed banks in less time." "Banks don't have bounce houses full of screaming children who might identify us later in a lineup." "They will not be identifying anyone. We're wearing the costumes." Bill looked up slowly. "What costumes." Molly slid two folded outfits across the table. Bill unfurled his and held up a full-body dinosaur suit in a shade of purple that did not occur in nature. "I am not wearing this." "You're the entertainment, Bill. Barney's cousin. Barnaby. Nobody questions Barnaby." "Everybody will question Barnaby. Barnaby looks like he's been in a car accident." "That's the point! Distraction. Parents will be so busy explaining to their kids why the dinosaur looks sad, they won't notice me backing a delivery van into the driveway." Bill held the suit at arm's length like it might bite him. "And what am I supposed to do in this? Dance?" "You're going to do the Macarena. Slowly. Menacingly." "There is no menacing way to do the Macarena." "There is now." 
He sighed, already knowing this argument was lost, the way all arguments with Molly were eventually lost—through sheer exhaustion rather than persuasion. "Fine. But if this goes sideways, I want it on record that I said the dinosaur idea was insane." "Duly noted. Now, the getaway van—" "Also insane, since it says 'Big Bill's Bounce House Rentals' on the side, which is not a real company." "I registered it last week. It's very real. I have a website." Bill stared at her. "You built a website for a fake company to rob an eight-year-old's birthday party." "Presentation matters, Bill." "You know we're stealing Legos, right? Not the crown jewels." Molly rolled up the crayon blueprint with the solemnity of a general folding a battle map. "Timmy," she said, "has a limited edition Star Wars set worth four hundred dollars on the resale market. This is not amateur hour." Bill looked down at the dinosaur suit, then back at her, then at the suit again. "The tail's got a hole in it," he said. "Then don't bend over." "I'm going to be bending over the entire time doing the Macarena." "Then I guess," Molly said, already walking away, "you'll just have to be careful."
created_at
"2026-07-01T07:41:43.111Z"
7/1/2026, 7:41:43 AM
model
"anthropic/claude-sonnet-5-20260630"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
3600
moderation_latency
(null)
generation_time
19174
tokens_prompt
77
tokens_completion
911
native_tokens_prompt
93
native_tokens_completion
1239
native_tokens_completion_images
(null)
native_tokens_reasoning
90
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.012576
router
(null)
provider_responses
0
endpoint_id
"37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id
"msg_01GqwwA8yPCxQ7Dpqj9rC4Ew"
is_byok
false
latency
1746
model_permaslug
"anthropic/claude-sonnet-5-20260630"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer
(null)
request_id
"req-1782891703-mpVRvTqs5LamZY1QZusk"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1782891703-4awEL6w6ujRz8rmiLzvH"
upstream_id
"msg_01GqwwA8yPCxQ7Dpqj9rC4Ew"
total_cost
0.012576
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
data_region
"global"
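The metadata above is enough to estimate output throughput for this run. The sketch below is illustrative only: it assumes `generation_time` is reported in milliseconds (19174 is close to the 19.3s duration shown in the run details) and that `native_tokens_completion` is the right token count to divide by. The function name `tokens_per_second` is hypothetical and not part of NC Bench.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Estimate output throughput, assuming generation_time is in milliseconds."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from the run metadata: native_tokens_completion and generation_time.
throughput = tokens_per_second(1239, 19174)
print(f"{throughput:.1f} tokens/s")
```

Under those assumptions the run works out to roughly 64.6 completion tokens per second.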
Evaluation details
Result | Evaluator | Details | Meta Data
75.16% | Matches word count | n/a | neededClean: false, words: 513
0.03% | Dialogue to Total Word Ratio | Ratio: 62.81%, Deviation: 12.81% | neededClean: false, wordsTotal: 519, wordsDialogue: 326
37.5914% | Overall | |
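The ratio and deviation reported for the dialogue evaluator can be reproduced from the `wordsDialogue` and `wordsTotal` counts. The sketch below is a minimal illustration, not the evaluator's actual code: the function names are hypothetical, the 50% target is taken from the scenario description, and treating the overall score as an unweighted mean of the two evaluator scores is an assumption that merely happens to agree with the displayed figures to within rounding. How the 75.16% and 0.03% scores themselves are derived is not stated on the page and is not modeled here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Percentage of all words that fall inside dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float = 50.0) -> float:
    """Absolute distance, in percentage points, from the requested ratio."""
    return abs(ratio_pct - target_pct)


# Counts copied from the evaluator metadata: wordsDialogue and wordsTotal.
ratio = dialogue_ratio(326, 519)
deviation = deviation_from_target(ratio)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# Assumption: the overall score is the plain mean of the two evaluator scores.
# The displayed scores are rounded, so only approximate agreement is expected.
overall_estimate = (75.16 + 0.03) / 2
print(f"Overall (estimated from rounded scores): {overall_estimate:.3f}%")
```

The first line of output matches the "Ratio: 62.81%, Deviation: 12.81%" detail above. The estimated overall lands within a few thousandths of a point of the reported 37.5914%, which is consistent with, but does not confirm, a simple average.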