NC Bench
Run: 2026-07-01-01KWE8W7PEQ2FQ9XM4X12JJ36J
Run Details
Model: Claude Sonnet 5 (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 9.0s
Cost: $0.004666
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator
100.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.9995% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The warehouse smelled like stale beer and ambition, which described Molly and Bill's entire criminal partnership rather well. Spread across a card table were architectural blueprints of the Hendricks' backyard, a birthday party guest list, and a suspicious number of juice boxes. "Explain to me again," Molly said, tracing a finger along the fence line, "why we're robbing seven-year-olds." Bill leaned back, looking wounded. "We're not robbing seven-year-olds. We're robbing their parents. The kids just happen to be standing near the loot." Molly stared at him. "It's basically a technicality," he added. The plan, such as it was, involved Bill dressed as a clown—a costume he insisted made him "invisible" to authority figures—while Molly infiltrated as a caterer's assistant carrying an empty cake box roughly the size of a Honda Civic. Inside, they would smuggle out the presents piled beneath a Paw Patrol banner, presents that, according to their intel, included a diamond bracelet some idiot uncle had wrapped in Spider-Man paper as a joke gift. "There's going to be a bounce house," Bill warned. "I've survived worse." "There's also going to be a magician." Molly finally looked up. "Now that's just cruel." Outside, a car door slammed. Showtime.
created_at
"2026-07-01T07:21:43.899Z"
7/1/2026, 7:21:43 AM
model
"anthropic/claude-sonnet-5-20260630"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
2241
moderation_latency
(null)
generation_time
8963
tokens_prompt
77
tokens_completion
333
native_tokens_prompt
93
native_tokens_completion
448
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.004666
router
(null)
provider_responses
0
endpoint_id
"37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id
"msg_015mgDfZUcxTV5TNKwy7yCDC"
is_byok
false
latency
2241
model_permaslug
"anthropic/claude-sonnet-5-20260630"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer
(null)
request_id
"req-1782890503-9TSSLhYfcOIL0C05hZ77"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1782890503-fSjUPTccpHEKn6US21W8"
upstream_id
"msg_015mgDfZUcxTV5TNKwy7yCDC"
total_cost
0.004666
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
data_region
"global"
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false, words: 201
0.00% | Dialogue to Total Word Ratio | Ratio: 33.98%, Deviation: 23.98% | neededClean: false, wordsTotal: 206, wordsDialogue: 70
49.9995% | Overall
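The reported ratio and deviation can be reproduced from the counts above. The following is a minimal sketch, not the benchmark's actual evaluator: the function names `dialogue_ratio` and `deviation_from_target` are hypothetical, the word-counting rule (whitespace splitting, straight double quotes only) is an assumption, and the deviation is assumed to be the absolute difference from the 10% target named in the scenario. The two evaluators report different totals (201 and 206 words), so the real counting rules evidently differ from one another and may differ from this sketch.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumption: dialogue is any text between straight double quotes,
    and words are whitespace-separated tokens.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation_from_target(ratio: float, target: float = 0.10) -> float:
    """Absolute distance from the target ratio (assumed scoring input)."""
    return abs(ratio - target)


# Counts reported by the evaluator for this run: 70 dialogue words of 206.
reported_ratio = 70 / 206
print(f"Ratio: {reported_ratio:.2%}")                             # Ratio: 33.98%
print(f"Deviation: {deviation_from_target(reported_ratio):.2%}")  # Deviation: 23.98%
```

With a 10% target on a 200-word scene, roughly 20 words of dialogue were expected; the response contains about 70, which is consistent with the 0.00% score for this evaluator.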