NC Bench
Run: 2026-07-01-01KWE9WACZZP6K5311AVQ3DDXG
Run Details
Model: Claude Sonnet 5
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 9.8s
Cost: $0.005036
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator | Details Link
43.38% | Matches word count | (details)
50.57% | Dialogue to Total Word Ratio | (details)
46.9753% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Okay, so run it by me again," Molly said, "because last time you had us disguised as clowns and that ended with you in a headlock." "That kid bit like a wolverine, how was I supposed to know?" "You were supposed to know because they're children, Bill. Small, feral, and quick." "Fine. New plan. We're the bouncy castle guys." "We don't own a bouncy castle." "We rent one. Cash only. Set it up, wait till everybody's distracted bouncing their brains loose, then we grab the presents from the gift table and we're gone before cake." "And if a parent asks who we are?" "We say we're with Jumpin' Jerry's Party Rentals." "Is that a real company?" "It is now. I made business cards." "You made business cards." "Laminated, even. Very professional." "Bill, this is a six-year-old's birthday party. There will be balloons shaped like dinosaurs and a bounce house shaped like a castle and forty presents wrapped in paper with actual unicorns on it." "Exactly. Nobody suspects a unicorn heist." "That's not a real category of crime." "It is today." "You're impossible." "I'm ambitious. There's a difference." "There really isn't." "Are you in or not?" "...Fine. But I'm not wearing the Jerry costume." "Nobody said anything about a costume." "There's a costume, isn't there." "There's a costume."
created_at: "2026-07-01T07:39:15.244Z" (7/1/2026, 7:39:15 AM)
model: "anthropic/claude-sonnet-5-20260630"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2014
moderation_latency: (null)
generation_time: 9717
tokens_prompt: 77
tokens_completion: 363
native_tokens_prompt: 93
native_tokens_completion: 485
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.005036
router: (null)
provider_responses
0
endpoint_id: "37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id: "msg_01B8FWGqtcrUrKR4vUuAh81x"
is_byok: false
latency: 2014
model_permaslug: "anthropic/claude-sonnet-5-20260630"
provider_name: "Amazon Bedrock"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer: (null)
request_id: "req-1782891555-Haf44JVZb9Oj696hWCkN"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1782891555-libjj8DQ31cNBt4DoyDs"
upstream_id: "msg_01B8FWGqtcrUrKR4vUuAh81x"
total_cost: 0.005036
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
data_region: "global"
Evaluation details
Result | Evaluator | Details | Meta Data
43.38% | Matches word count | n/a | neededClean: false, words: 217
50.57% | Dialogue to Total Word Ratio | Ratio: 99.09%, Deviation: 9.09% | neededClean: false, wordsTotal: 219, wordsDialogue: 217
46.9753% | Overall
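The reported ratio and deviation can be reproduced from the word counts in the table. The sketch below is not the benchmark's own evaluator code. It assumes the ratio is dialogue words divided by total words, and that the deviation is the absolute gap in percentage points from the 90% target named in the scenario. The function names are hypothetical. How a deviation is converted into the 50.57% score is not shown on this page, so that step is left out. The last line tests a further assumption: that the overall figure is the unweighted mean of the two evaluator scores.

```python
# Values copied from the evaluation details of this run.
WORDS_DIALOGUE = 217
WORDS_TOTAL = 219
TARGET_RATIO_PCT = 90.0  # from the scenario: "90% dialogue"


def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the whole text, as a percentage."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_pct_points(ratio_pct: float, target_pct: float) -> float:
    """Absolute distance from the target, in percentage points (assumed definition)."""
    return abs(ratio_pct - target_pct)


ratio = dialogue_ratio_pct(WORDS_DIALOGUE, WORDS_TOTAL)
deviation = deviation_pct_points(ratio, TARGET_RATIO_PCT)

# Assumption: the overall score is the plain mean of the two evaluator scores.
# The rounded inputs give about 46.975, close to the reported 46.9753%.
overall_estimate = (43.38 + 50.57) / 2

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
print(f"Overall (estimated): {overall_estimate:.3f}%")
```

The first print line matches the reported "Ratio: 99.09%, Deviation: 9.09%". The small gap between the estimated overall and 46.9753% is consistent with the page averaging unrounded scores, though that is an inference rather than something the page states.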