NC Bench
Run: 2026-07-01-01KWE8SSHDXR2YFA8YAKBH21ZK
Run Details
Model: Claude Sonnet 5 (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 9.0s
Cost: $0.004826
Provider: Amazon Bedrock
Evaluation summary
Score      Evaluator
97.63%     Matches word count
0.19%      Dialogue to Total Word Ratio
48.9088%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint across the hood of the getaway van—a stolen minivan with a cartoon dolphin decal peeling off the side. Bill leaned over her shoulder, chewing on a licorice whip like it was a cigar, squinting at the crude drawing of the Hendricks' backyard. "Bouncy castle's the main obstacle," Molly said, tapping the paper. "Six-year-olds move fast when there's cake involved." Bill nodded gravely, as if she'd just described a minefield. "And the clown?" "Rented. Leaves at four." They'd been planning this heist for three weeks—longer than they'd spent on the Fairmont jewelry job. The target wasn't diamonds this time. It was presents. Forty-three wrapped boxes sitting under a Paw Patrol banner, victims of a birthday party with criminally lax security. Bill adjusted his fake mustache, the one meant to make him look like "Uncle from out of town." Molly had dressed as a face-painting artist, complete with a kit full of glitter she intended to use as a smoke screen, quite literally. "Remember," she said, snapping the kit shut, "we're not monsters. We leave the birthday kid's gifts." Bill looked wounded. "What kind of thief steals from a six-year-old on their actual birthday?" "The amateur kind," Molly said, and started the van.
created_at
"2026-07-01T07:20:23.866Z"
7/1/2026, 7:20:23 AM
model
"anthropic/claude-sonnet-5-20260630"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1759
moderation_latency
(null)
generation_time
8964
tokens_prompt
77
tokens_completion
329
native_tokens_prompt
93
native_tokens_completion
464
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.004826
router
(null)
provider_responses
0
endpoint_id
"37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id
"msg_0181doyDDTiwjSbLHJP9xeUx"
is_byok
false
latency
1680
model_permaslug
"anthropic/claude-sonnet-5-20260630"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer
(null)
request_id
"req-1782890423-m9sxzgdS2Gub5RwYZfI9"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1782890423-JDd1FejS1t4VunBCdVSz"
upstream_id
"msg_0181doyDDTiwjSbLHJP9xeUx"
total_cost
0.004826
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
data_region
"global"
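The timing and token fields above can be combined into a rough throughput figure. This is a minimal sketch that assumes generation_time is reported in milliseconds (8964 is consistent with the 9.0s Duration shown under Run Details) and that native_tokens_completion is the right token count to divide by; neither assumption is confirmed by the page.

```python
# Values copied from the Meta fields above.
generation_time_ms = 8964          # assumed to be milliseconds
native_tokens_completion = 464

# Convert to seconds and derive completion tokens per second.
generation_time_s = generation_time_ms / 1000
tokens_per_second = native_tokens_completion / generation_time_s

print(f"{generation_time_s:.1f} s, {tokens_per_second:.1f} tokens/s")
```

The figure is only an approximation of decode speed, since the page does not say whether generation_time includes the initial latency.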
Evaluation details
Result     Evaluator                      Details                            Meta Data
97.63%     Matches word count             n/a                                neededClean: false, words: 207
0.19%      Dialogue to Total Word Ratio   Ratio: 25.82%, Deviation: 15.82%   neededClean: false, wordsTotal: 213, wordsDialogue: 55
48.9088%   Overall
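The ratio and deviation reported for the dialogue evaluator can be reproduced from its metadata. The sketch below assumes the 10% target comes from the scenario, that the deviation is the absolute difference from that target, and that the overall score is an unweighted mean of the two evaluator scores. The page does not state these formulas, and it does not show how the 0.19% and 97.63% scores themselves are derived, so those are taken as given.

```python
# Values copied from the evaluation details.
words_total = 213
words_dialogue = 55
target_ratio = 0.10  # assumption: the "10% dialogue" target from the scenario

# Share of dialogue words and its absolute distance from the target.
ratio = words_dialogue / words_total
deviation = abs(ratio - target_ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Assumption: the overall score is the plain mean of the evaluator scores.
scores = [97.63, 0.19]  # rounded values as displayed on the page
overall = sum(scores) / len(scores)
print(f"Overall (approx.): {overall:.2f}%")
```

The first line printed agrees with the reported "Ratio: 25.82%, Deviation: 15.82%". The mean of the rounded scores is 48.91%, close to the reported 48.9088%, which would be consistent with an average taken over unrounded scores.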