NC Bench
Run: 2026-07-01-01KWEAFTBWKHYSCY5MYCXCZ7JY
Run Details
Model: Claude Sonnet 5 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 11.7s
Cost: $0.004796
Provider: Amazon Bedrock
Evaluation summary
Matches word count: 27.17%
Dialogue to Total Word Ratio: 0.00%
Overall: 13.5829%
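The page does not state how the Overall score is derived. The sketch below assumes it is the unweighted mean of the two evaluator scores, which is only a guess that happens to fit the displayed numbers. The `scores` dictionary is a hypothetical structure built from the rounded values shown above.

```python
from statistics import mean

# Evaluator scores as displayed in the summary (percent, rounded to two decimals).
scores = {
    "Matches word count": 27.17,
    "Dialogue to Total Word Ratio": 0.00,
}

# Assumption: Overall is the unweighted mean of the evaluator scores.
overall = mean(scores.values())
print(f"Overall (from rounded scores): {overall:.3f}%")
```

With the rounded inputs this gives roughly 13.585%, close to the reported 13.5829%. The small gap would be explained if the benchmark averages unrounded scores (a word count score near 27.1658%), but the page does not confirm that.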
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly slid the blueprint across the table, a crayon drawing of a castle bounce house with a suspicious number of arrows pointing to a table labeled "PRESENTS!!!" in glitter glue. "This is our target?" Bill squinted. "A six-year-old drew our heist plan?" "Timmy's cousin. She's got an eye for security gaps." "Molly, there's a clown." "Bozo's a distraction. Sends the kids into hysterics every time he pulls out that horn." Bill rubbed his temples. "We're stealing birthday presents from children." "We're liberating unopened Nerf guns and something called a Bakugan, whatever that is." "This feels wrong." "You said that about the diamond heist too." "That was different! Diamonds don't have thank-you notes attached!" Molly grinned, tapping the drawing. "The gift table's unguarded during cake time. Ninety seconds, tops." "And if a parent notices?" "We're 'entertainment.' You're juggling. I'm doing balloon animals." "I don't know how to juggle." "You'll learn." "Molly—" "Bill. There's a PlayStation in that pile. Wrapped, but I can tell by the shape." Bill paused, considering. "...Fine. But I'm not wearing the clown wig." "Deal. I already claimed it."
created_at: "2026-07-01T07:49:54.182Z" (7/1/2026, 7:49:54 AM)
model: "anthropic/claude-sonnet-5-20260630"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 4316
moderation_latency: (null)
generation_time: 11669
tokens_prompt: 77
tokens_completion: 322
native_tokens_prompt: 93
native_tokens_completion: 461
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.004796
router: (null)
provider_responses:
  0:
    endpoint_id: "37b16f45-eeae-45f3-9b85-2137f7c6d61b"
    id: "msg_01EenSSEx8KcE4busm8QjeE4"
    is_byok: false
    latency: 3695
    model_permaslug: "anthropic/claude-sonnet-5-20260630"
    provider_name: "Amazon Bedrock"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer: (null)
request_id: "req-1782892194-0gobcEjMeQuUMS5E1QQo"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1782892194-e2xiXgFDXpZ6RZXjlogr"
upstream_id: "msg_01EenSSEx8KcE4busm8QjeE4"
total_cost: 0.004796
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
data_region: "global"
Evaluation details
Matches word count
Result: 27.17%
Details: n/a
Meta Data: neededClean: false, words: 181

Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 76.63%, Deviation: 26.63%
Meta Data: neededClean: false, wordsTotal: 184, wordsDialogue: 141

Overall: 13.5829%
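The reported ratio and deviation can be reproduced from the `wordsDialogue` and `wordsTotal` values above. This is a minimal sketch, not the benchmark's evaluator: the function name `dialogue_ratio`, the constant `TARGET_RATIO`, and the use of an absolute difference for the deviation are assumptions. The 50% target comes from the scenario. How the deviation is mapped to the 0.00% score is not stated on the page and is not modeled here.

```python
TARGET_RATIO = 0.50  # taken from the scenario: "50% dialogue"


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return the fraction of all words that are dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


# Values from the evaluator's meta data for this run.
ratio = dialogue_ratio(words_dialogue=141, words_total=184)

# Assumption: deviation is the absolute distance from the target ratio.
deviation = abs(ratio - TARGET_RATIO)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 76.63%, Deviation: 26.63%"
```

Note that the two evaluators report different totals for the same response (181 words versus 184); the page does not say why their counts differ.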