NC Bench
Run: 2026-07-01-01KWEA4EV8K2N8656DWBK81H0F
Run Details
Model: Claude Sonnet 5 (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 9.2s
Cost: $0.005316
Provider: Amazon Bedrock
Evaluation summary
Score      Evaluator
99.98%     Matches word count
39.66%     Dialogue to Total Word Ratio
69.8202%   Overall
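The Overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state how Overall is computed, so the averaging rule below is an assumption, and `overall_score` is a hypothetical helper name. A minimal sketch using the rounded scores shown above:

```python
# Assumption: Overall is the unweighted mean of the evaluator scores.
# The inputs are the rounded percentages displayed on this page.
scores = {
    "Matches word count": 99.98,
    "Dialogue to Total Word Ratio": 39.66,
}


def overall_score(evaluator_scores: dict[str, float]) -> float:
    """Return the arithmetic mean of the evaluator scores, in percent."""
    return sum(evaluator_scores.values()) / len(evaluator_scores)


overall = overall_score(scores)
print(f"{overall:.2f}%")  # prints "69.82%"
```

The page reports 69.8202%, which suggests the average is taken over unrounded evaluator scores rather than the two-decimal values displayed.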
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly spread the blueprint across the kitchen table, a crumpled birthday invitation serving as their only intelligence. Bill leaned over her shoulder, chewing on a stale donut, squinting at the crayon-drawn map of the Hendersons' backyard. "Bounce house is the weak point," Molly muttered, tracing a finger along the fence line. "Nobody guards a bounce house." Bill nodded slowly, though his eyes had drifted toward the pile of wrapped presents illustrated in the corner of the invitation, complete with little exclamation points reading "SO MANY GIFTS!!" They'd cased the place for three days. Six-year-old Tommy's birthday party promised balloon animals, a magician named "The Astounding Reginald," and—most importantly—a mountain of loot from doting relatives who clearly overcompensated for divorce guilt. "Face paint," Bill said suddenly, tapping the plan. "We go in as face painters." Molly considered this. It was either brilliant or the dumbest idea she'd heard all month, possibly both. "Fine," she said. "But I'm not doing balloon animals. Last time I nearly lost a finger." Bill grinned, already reaching for the rainbow wig hanging on their doorknob. Somewhere out there, a six-year-old's birthday was about to become a masterclass in unexpected loss—and a very confusing police report.
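The page does not document how the evaluator separates dialogue from narration. One simple heuristic, shown here purely as an assumption, treats everything between straight double quotes as dialogue and counts whitespace-delimited words. `count_dialogue_words` is a hypothetical helper, not the benchmark's own code:

```python
import re


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words) for a passage.

    Assumption: dialogue is any text between straight double quotes,
    and a word is any whitespace-delimited token.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    total_words = len(text.split())
    return dialogue_words, total_words


sample = 'Molly sighed. "Fine," she said. "But I\'m not doing balloon animals."'
dialogue, total = count_dialogue_words(sample)
print(dialogue, total)  # prints "7 11"
```

A heuristic like this also counts quoted text that is not speech, such as the "SO MANY GIFTS!!" caption or the magician's stage name in the response above, so its counts need not match the evaluator's.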
created_at
"2026-07-01T07:43:41.937Z"
model
"anthropic/claude-sonnet-5-20260630"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
3507
moderation_latency
(null)
generation_time
9202
tokens_prompt
77
tokens_completion
387
native_tokens_prompt
93
native_tokens_completion
513
native_tokens_completion_images
(null)
native_tokens_reasoning
48
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.005316
router
(null)
provider_responses
0
endpoint_id
"37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id
"msg_01T7ZzwvtnGKptLh6wG6Khfx"
is_byok
false
latency
2022
model_permaslug
"anthropic/claude-sonnet-5-20260630"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer
(null)
request_id
"req-1782891821-GyP17UFfV3jtO9A9XKuo"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1782891821-tAhfGBTX75OApAYfdFXo"
upstream_id
"msg_01T7ZzwvtnGKptLh6wG6Khfx"
total_cost
0.005316
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
data_region
"global"
Evaluation details
Result     Evaluator                      Details                           Meta Data
99.98%     Matches word count             n/a                               neededClean: false; words: 202
39.66%     Dialogue to Total Word Ratio   Ratio: 19.81%, Deviation: 9.81%   neededClean: false; wordsTotal: 207; wordsDialogue: 41
69.8202%   Overall
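The reported ratio and deviation can be reproduced from the `wordsDialogue` and `wordsTotal` values above. This sketch assumes the deviation is the absolute difference, in percentage points, between the measured ratio and the 10% target from the scenario. The function name is hypothetical:

```python
TARGET_RATIO_PCT = 10.0  # from the scenario: "Write 200 words with 10% dialogue"


def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of total words."""
    return 100.0 * words_dialogue / words_total


ratio = dialogue_ratio_pct(words_dialogue=41, words_total=207)
# Assumption: deviation is measured in percentage points from the target.
deviation = abs(ratio - TARGET_RATIO_PCT)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 19.81%, Deviation: 9.81%"
```

How the 39.66% score is derived from the deviation is not shown on the page, so that step is left out here.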