NC Bench
Run: 2026-07-01-01KWE9301V48DSRG8FZGQVCHX4
Run Details
Model: Claude Sonnet 5 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 9.4s
Cost: $0.004826
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator | Details Link
99.74% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
49.8722% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly leaned against the van, tapping the blueprint of the Hendersons' backyard like it was a bank vault schematic. "So the presents are on the gift table," she said, "guarded by a bouncy castle." Bill snorted. "A bouncy castle isn't security, Molly." "It's a perimeter, Bill. Kids bounce, adults watch kids bounce, nobody watches the table." "You want us to dress as clowns." "I want us to *be* clowns. There's a difference. Method acting." Bill crossed his arms. "Last time you said 'method acting,' we ended up actually delivering a singing telegram." "That job paid, didn't it?" "We got tipped in Monopoly money!" Molly ignored him, unrolling a second sheet. "Cake cutting at three. That's our window. Everyone's distracted by frosting, we grab the loot, we're gone before 'Happy Birthday' finishes the second verse." "And if some kid notices two grown adults stuffing Legos into a duffel bag?" "We tell them we're from Amazon. Returns department." Bill stared at her. "That's not a real department." "It is now." He sighed, already reaching for the clown wig. "Why do I let you plan these things?" Molly grinned, tossing him a red nose. "Because I'm the only one who thinks big enough to rob a six-year-old."
created_at: "2026-07-01T07:25:25.447Z" (7/1/2026, 7:25:25 AM)
model: "anthropic/claude-sonnet-5-20260630"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1805
moderation_latency: (null)
generation_time: 9354
tokens_prompt: 77
tokens_completion: 330
native_tokens_prompt: 93
native_tokens_completion: 464
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.004826
router: (null)
provider_responses:
  0:
    endpoint_id: "37b16f45-eeae-45f3-9b85-2137f7c6d61b"
    id: "msg_01MMb1Wn9HDAfGWncTAmoPqE"
    is_byok: false
    latency: 1805
    model_permaslug: "anthropic/claude-sonnet-5-20260630"
    provider_name: "Amazon Bedrock"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer: (null)
request_id: "req-1782890725-Qgvt55KNVgyApGKBnvPO"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1782890725-HFuYKBjuJuhZhFZCtXSR"
upstream_id: "msg_01MMb1Wn9HDAfGWncTAmoPqE"
total_cost: 0.004826
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
data_region: "global"
Evaluation details
Result | Evaluator | Details | Meta Data
99.74% | Matches word count | n/a | neededClean: false, words: 204
0.00% | Dialogue to Total Word Ratio | Ratio: 78.64%, Deviation: 28.64% | neededClean: false, wordsTotal: 206, wordsDialogue: 162
49.8722% | Overall
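The ratio and deviation reported for the dialogue evaluator can be reproduced from its metadata (wordsDialogue and wordsTotal) and the 50% target in the scenario. The sketch below is only an illustration of that arithmetic, not the benchmark's actual implementation: the function names are hypothetical, and treating the overall score as the plain mean of the two evaluator scores is an assumption inferred from the displayed numbers. How the evaluator maps a 28.64% deviation to a 0.00% score is not shown on this page, so it is not modelled here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of words that are dialogue, as a percentage."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float = 50.0) -> float:
    """Absolute distance, in percentage points, from the requested ratio."""
    return abs(ratio_pct - target_pct)


def overall_score(scores: list[float]) -> float:
    """Assumption: the overall score is the unweighted mean of evaluator scores."""
    return sum(scores) / len(scores)


# Values taken from the evaluation details on this page.
ratio = dialogue_ratio(words_dialogue=162, words_total=206)
deviation = deviation_from_target(ratio)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
print(f"Overall: {overall_score([99.74, 0.0]):.2f}%")
```

With the rounded scores shown on the page the mean comes out at 49.87%. The displayed 49.8722% would be consistent with averaging an unrounded word-count score, though that is a guess.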