NC Bench
Run: 2026-07-01-01KWE9W7S341475K5CTQZGXH7E
Run Details
Model: Claude Sonnet 5 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 18.0s
Cost: $0.010856
Provider: Amazon Bedrock
Evaluation summary
Score      Evaluator
0.00%      Matches word count
83.88%     Dialogue to Total Word Ratio
41.9415%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The back room of Bill's Pawn & Loan smelled like mothballs and bad decisions. Molly sat on an overturned crate, sharpening a butter knife for reasons Bill chose not to question. "A birthday party," Bill said, for the third time. "You want to rob a birthday party." "Not rob. Liberate. There's a difference." "Enlighten me." "Robbing implies I feel bad about it." Bill dropped into his chair, which groaned in solidarity with his life choices. "Molly, these are presents for a seven-year-old. Barbies. Legos. Maybe a hamster if the kid's lucky." "Do you know what a first-edition Silverback the Gorilla plush goes for on the collector's market? Four hundred dollars, Bill. Four hundred." "How do you even know that?" "I have a guy." "You have a guy for everything." "I have a guy for stuffed animals, yes. His name is Desmond. He's very sad about it." Bill rubbed his temples like he could physically massage the headache out. "So the plan is we just—what—walk into some six-year-old's birthday party dressed as clowns and grab the loot?" "First of all, seven-year-old. Second, clowns are amateur hour. Clowns get remembered. We go in as caterers." "We don't cater." "We deliver a cake. One cake. We set it down, we case the present table, we come back during the piñata chaos and grab everything." "Piñata chaos." "It's a known phenomenon, Bill. Every kid in a three-mile radius loses their mind the second that thing cracks open. Parents are filming, dodging candy shrapnel, nobody's watching a folding table covered in wrapping paper." "And the cake?" "Store-bought. I'm not baking for this. I have priorities." Bill stared at her. "You want to steal from children, but drawing the line at baking." "Homemade cake takes four hours. Theft takes four minutes. Efficiency, Bill." "This isn't efficiency, this is a felony with balloons." "Every felony has balloons if you're doing it right." 
Bill picked up the invitation Molly had somehow acquired — glittery, cartoon dinosaur, the words HAPPY 7TH BIRTHDAY ROCCO practically screaming with joy — and turned it over in his hands like it might explain itself. "Rocco," he said flatly. "Rocco. Loaded family. Uncle's some tech guy. You should see the gift list I intercepted. There's an Xbox in there, Bill. A kid getting an Xbox for his seventh birthday and he doesn't even understand marginal utility." "You're justifying grand theft using economics vocabulary." "I contain multitudes." Bill set the invitation down, exhaled long and slow, the exhale of a man who had made peace with several bad decisions already this year and apparently had room for one more. "Fine. But I'm not wearing the little chef hat." "The hat is non-negotiable, Bill." "The hat is ridiculous." "The hat says *trustworthy caterer*, not *man about to steal a seven-year-old's Xbox*. Wear the hat."
created_at
"2026-07-01T07:39:12.562Z"
7/1/2026, 7:39:12 AM
model
"anthropic/claude-sonnet-5-20260630"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
2362
moderation_latency
(null)
generation_time
17981
tokens_prompt
77
tokens_completion
766
native_tokens_prompt
93
native_tokens_completion
1067
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.010856
router
(null)
provider_responses
0
endpoint_id
"37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id
"msg_015XZPXpanFiKo9tw8zdSx9j"
is_byok
false
latency
2362
model_permaslug
"anthropic/claude-sonnet-5-20260630"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer
(null)
request_id
"req-1782891552-ptqX4r2yT3trP8FZtuId"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1782891552-L2OUL8yWdVdGtHGVRCpB"
upstream_id
"msg_015XZPXpanFiKo9tw8zdSx9j"
total_cost
0.010856
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
data_region
"global"
Evaluation details
Result     Evaluator                      Details                           Meta Data
0.00%      Matches word count             n/a                               neededClean: false, words: 468
83.88%     Dialogue to Total Word Ratio   Ratio: 71.88%, Deviation: 1.88%   neededClean: false, wordsTotal: 480, wordsDialogue: 345
41.9415%   Overall
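
The ratio and deviation reported by the dialogue evaluator can be rechecked from its own metadata. The sketch below is not the benchmark's implementation: it assumes the ratio is wordsDialogue divided by wordsTotal, that the deviation is the absolute distance from the 70% target named in the scenario, and that the overall figure is an unweighted mean of the two evaluator scores. The variable names are illustrative, and the mapping from deviation to the 83.88% score is not stated on this page, so it is not reproduced here.

```python
# Values copied from the evaluation details of this run.
words_total = 480        # wordsTotal
words_dialogue = 345     # wordsDialogue
target_ratio = 70.0      # percent, taken from the scenario "70% dialogue"

# Assumption: ratio = dialogue words / total words, as a percentage.
ratio = words_dialogue / words_total * 100

# Assumption: deviation = absolute distance from the target, in percentage points.
deviation = abs(ratio - target_ratio)

# Assumption: the overall score is the unweighted mean of the evaluator scores.
evaluator_scores = [0.00, 83.88]
overall = sum(evaluator_scores) / len(evaluator_scores)

print(f"ratio={ratio:.3f}% deviation={deviation:.3f} overall={overall:.2f}%")
```

Under these assumptions the ratio comes out at 71.875% and the deviation at 1.875 points, which agree with the displayed 71.88% and 1.88% after rounding. The mean of the displayed scores is 41.94%, close to the reported 41.9415%; the small gap is consistent with the page showing the ratio score rounded to two decimals.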