NC Bench
Run: 2026-07-01-01KWE90BF3SNJDY5Z9EZFYW334
Run Details
Model: Claude Sonnet 5 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 17.4s
Cost: $0.011476
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator
0.01% | Matches word count
0.07% | Dialogue to Total Word Ratio
0.0402% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The back room of the pawn shop smelled like mothballs and bad decisions. Molly spread the blueprint across the table—if you could call a crayon drawing of a house a blueprint. "Bill, I need you to focus. This is a seven-year-old's birthday party. Maximum security is a golden retriever named Biscuit." "Dogs are unpredictable, Molly. I got bit by a chihuahua in '09. Still got the scar." "That was a hamster, and you stepped on it." "Semantics." Molly tapped the drawing. "Here's the plan. Front door's got a bounce house blocking half the entry. We go in through the side gate, wearing the clown costumes—" "Absolutely not. I told you, I've got a thing about clowns." "You ARE the clown, Bill. That's the disguise." "I know what I am! I'm saying I don't want to look at myself in a mirror while wearing it. It's psychologically damaging." "You're a forty-three-year-old man planning to rob children." "There's a difference between committing crimes and confronting your fears, Molly. Some lines shouldn't be crossed." She rubbed her temples. "Fine. You wear the dinosaur costume. I'll be the clown." "That works. Dinosaurs are noble." "They're extinct, Bill." "Noble AND extinct. Two great tastes." Molly moved a plastic action figure across the table like a chess piece. "Okay, focus. Presents are on the gift table, back left corner, next to the cake. We've got maybe a ninety-second window when they cut the cake and everyone rushes over." "Ninety seconds to grab, what, thirty presents?" "Twenty-six. I counted the RSVPs on the invitation Facebook post." "You're stalking seven-year-olds now?" "I'm doing RECON. There's a difference." "Is there, though?" "Shut up. The presents are labeled by size—big ones probably bikes or scooters, those we leave, too conspicuous. We want the mid-size boxes. Nintendo Switches, LEGO sets, that sort of thing. High resale value, low carrying weight." Bill leaned back, arms crossed. "You've really thought this through." "Someone has to. 
Last job you tried to steal a parking meter." "It looked lonely." "It was BOLTED to the sidewalk, Bill." "So was I, emotionally, that day. We all cope differently." Molly pulled out two costume bags and tossed one at him. "Here's yours. Try not to trip on the tail this time." "That was ONE time, and the goat startled me." "There was no goat." "There's ALWAYS a goat, Molly. That's what makes it scary." She zipped up her clown bag with a sigh. "Just remember—we're in, we grab, we're out before the piñata even breaks." "And if a kid asks why we're stealing their presents?" Molly paused, considering this for the first time. "We tell them Santa's doing an audit." "In July?" "Kids don't know calendars, Bill." "That's... actually diabolical." "Thank you. Now put on the dinosaur head. We've got a birthday to ruin."
created_at: "2026-07-01T07:23:58.833Z" (7/1/2026, 7:23:58 AM)
model: "anthropic/claude-sonnet-5-20260630"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2787
moderation_latency: (null)
generation_time: 17376
tokens_prompt: 77
tokens_completion: 812
native_tokens_prompt: 93
native_tokens_completion: 1129
native_tokens_completion_images: (null)
native_tokens_reasoning: 44
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.011476
router: (null)
provider_responses:
  0:
    endpoint_id: "37b16f45-eeae-45f3-9b85-2137f7c6d61b"
    id: "msg_01AsrU6QjKFFiRT2W6nS2eMi"
    is_byok: false
    latency: 1592
    model_permaslug: "anthropic/claude-sonnet-5-20260630"
    provider_name: "Amazon Bedrock"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer: (null)
request_id: "req-1782890638-zHqh8hcJpH3SIJRyMNou"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1782890638-gvFTpfWL7eT2eF6uHNOx"
upstream_id: "msg_01AsrU6QjKFFiRT2W6nS2eMi"
total_cost: 0.011476
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
data_region: "global"
Evaluation details
Result | Evaluator | Details | Meta Data
0.01% | Matches word count | n/a | neededClean: false; words: 469
0.07% | Dialogue to Total Word Ratio | Ratio: 82.05%, Deviation: 12.05% | neededClean: false; wordsTotal: 479; wordsDialogue: 393
0.0402% | Overall
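The reported "Ratio: 82.05%, Deviation: 12.05%" can be checked from the wordsDialogue and wordsTotal values above. The following is a minimal sketch, not the benchmark's evaluator: it assumes the deviation is the absolute difference from the 70% target in percentage points, and the helper names are hypothetical. `count_dialogue_words` is an illustrative counter that only handles straight double quotes; the two evaluators report different totals (469 and 479 words), so their actual tokenization is unknown and this counter may not reproduce either figure. The page does not state how ratio and deviation map to the 0.07% score, so that step is not modeled.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: "Write 500 words with 70% dialogue"


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return the fraction of all words that are dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation_points(ratio: float, target: float = TARGET_RATIO) -> float:
    """Absolute deviation from the target, in percentage points (assumed definition)."""
    return abs(ratio - target) * 100


def count_dialogue_words(text: str) -> int:
    """Hypothetical counter: whitespace-separated words inside straight double quotes."""
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in quoted_spans)


# Values taken from the evaluation details: wordsDialogue = 393, wordsTotal = 479.
ratio = dialogue_ratio(393, 479)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation_points(ratio):.2f}%")
# prints "Ratio: 82.05%, Deviation: 12.05%"
```

With these inputs the response overshoots the 70% dialogue target by about 12 percentage points, which matches the Details column for the ratio evaluator.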