NC Bench
Run: 2026-07-01-01KWE97A8YE8YKVFBYA9G6F05E
Run Details
Model:
Claude Sonnet 5 (Reasoning, Low)
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
17.6s
Cost:
$0.010726
Provider:
Amazon Bedrock
Evaluation summary
Score    | Evaluator
0.00%    | Matches word count
22.06%   | Dialogue to Total Word Ratio
11.0310% | Overall
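The Overall figure is consistent with a plain average of the two evaluator scores. The page does not state the aggregation rule, so the sketch below treats the unweighted mean as an assumption; the `scores` dictionary is a hypothetical name used only for illustration.

```python
from statistics import mean

# Evaluator scores as reported in the summary, in percent.
scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 22.06,
}

# Assumption: Overall is the unweighted mean of the evaluator scores.
overall = mean(list(scores.values()))
print(f"Overall: {overall:.2f}%")
```

Under this assumption the result agrees with the reported 11.0310% to two decimals; the extra digits in the reported value suggest the 22.06% score may be rounded for display.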
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The back room of Bill's apartment smelled like stale pizza and desperation. Molly slapped a crumpled blueprint down on the coffee table—it was actually a place mat from Chuck E. Cheese, but the crayon drawings on the back would have to do. "Okay, so the party starts at two," Molly said, tapping a Ninja Turtle with her finger. "Bouncy castle here, cake table there, and the presents—the presents, Bill—are stacked by the sliding glass door." Bill leaned back, arms crossed. "Molly, I gotta ask. Are we really doing this? We're stealing from six-year-olds." "Seven-year-olds. It's Timmy Radcliffe's seventh birthday." "Even worse. That's like, prime Nintendo Switch territory." "Exactly!" Molly's eyes lit up. "Do you know what a Switch goes for on the street? Fifteen, twenty bucks minimum." "That's not a lot of money for a felony, Molly." "It adds up! Plus there'll be, what, twenty kids? Each with a present? That's twenty presents, Bill. Twenty." Bill rubbed his temples. "Half of those are gonna be socks. Aunt Carol always gets socks." "Then we sell the socks." "Who buys used kid socks off the black market?" "I don't know, Bill, people are into weird stuff!" There was a pause. Bill picked up a juice box from the coffee table—leftover recon supplies—and took a long, contemplative sip. "What's our cover?" he finally asked. "I already got it handled." Molly reached into her bag and pulled out two rainbow wigs and a pair of oversized shoes. "We're clowns." Bill stared at the wig like it had personally insulted his mother. "No." "Bill—" "No clowns. I told you after the bank job. No clowns." "That was different! That was a heist gone wrong because Gary panicked and started making balloon animals during the getaway." "He made me a balloon dog while the cops were literally behind us, Molly!" "That's a Gary problem, not a clown problem!" Bill groaned and dropped his head into his hands. "Fine. Fine! We're clowns. But I'm not doing the honking nose." 
"The honking nose is the whole point, Bill. It builds trust." "It builds trust with children. We are trying to rob children." "Robbing them while they trust us is basically the entire heist industry, Bill, keep up." Bill sighed, grabbing the wig anyway and jamming it onto his head like a man accepting his fate. "How do we even carry twenty presents out of there without someone noticing two clowns waddling off with an armful of Legos?" Molly grinned, pulling out a duffel bag shaped suspiciously like a piñata. "We tell them it's a magic trick." "...That's actually not bad." "I know. Now put on the nose." "I said no nose." "Bill. The nose builds trust." He put on the nose.
created_at
"2026-07-01T07:27:46.986Z"
7/1/2026, 7:27:46 AM
model
"anthropic/claude-sonnet-5-20260630"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
2923
moderation_latency
(null)
generation_time
17581
tokens_prompt
77
tokens_completion
766
native_tokens_prompt
93
native_tokens_completion
1054
native_tokens_completion_images
(null)
native_tokens_reasoning
45
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.010726
router
(null)
provider_responses
0
endpoint_id
"37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id
"msg_01FfMwGons4SzbHir6pPhMZ5"
is_byok
false
latency
2028
model_permaslug
"anthropic/claude-sonnet-5-20260630"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer
(null)
request_id
"req-1782890866-e1t1YKa1DClDtFC2AmvB"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1782890866-825NewPnaMwd9OvYN0be"
upstream_id
"msg_01FfMwGons4SzbHir6pPhMZ5"
total_cost
0.010726
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
data_region
"global"
Evaluation details
Result   | Evaluator                    | Details                          | Meta Data
0.00%    | Matches word count           | n/a                              | neededClean: false, words: 458
22.06%   | Dialogue to Total Word Ratio | Ratio: 64.50%, Deviation: 5.50%  | neededClean: false, wordsTotal: 462, wordsDialogue: 298
11.0310% | Overall                      |                                  |
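The Details and Meta Data values above can be cross-checked with a few lines of arithmetic. This is a minimal sketch, assuming the deviation is the absolute difference, in percentage points, between the measured ratio and the 70% target from the scenario. The constant and variable names are hypothetical. The page does not state how a deviation is mapped to the 0.00% and 22.06% scores, so that step is not reproduced here.

```python
TARGET_WORDS = 500          # from the scenario: "Write 500 words"
TARGET_DIALOGUE_PCT = 70.0  # from the scenario: "70% dialogue"

words = 458           # "words" reported by the Matches word count evaluator
words_total = 462     # "wordsTotal" reported by the ratio evaluator
words_dialogue = 298  # "wordsDialogue" reported by the ratio evaluator

# Share of dialogue words in the response, in percent.
ratio_pct = 100 * words_dialogue / words_total

# Assumption: deviation is measured in percentage points from the 70% target.
deviation_pts = abs(TARGET_DIALOGUE_PCT - ratio_pct)

# How far the response falls short of the requested length, in percent.
shortfall_pct = 100 * (TARGET_WORDS - words) / TARGET_WORDS

print(f"Ratio: {ratio_pct:.2f}%")
print(f"Deviation: {deviation_pts:.2f}%")
print(f"Word count shortfall: {shortfall_pct:.1f}%")
```

The first two printed values reproduce the reported Ratio and Deviation. The two evaluators report slightly different totals (458 and 462 words), which suggests they tokenize the response differently.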