NC Bench
Run: 2026-02-18-01KHQ95441QYB2GQ60F86S7AMQ
Run Details
Model: Claude Sonnet 4.6
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 7.1s
Cost: $0.005085
Provider: Amazon Bedrock
Evaluation summary
Matches word count: 95.99%
Dialogue to Total Word Ratio: 54.09%
Overall: 75.0363%
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly unrolled the crayon-drawn map across the hood of their getaway minivan. She had spent three hours sketching the layout of the Hendersons' backyard, complete with little stick figures representing the enemy: twelve seven-year-olds hopped up on birthday cake. "The piñata is our biggest problem," she whispered. Bill squinted at the map. "Why did you draw it with a face?" "Because it has a face, Bill. It's shaped like a dinosaur." "Right, but you gave it angry eyebrows." "I was trying to convey the threat level." Bill adjusted his black beanie and peered through the fence slats. Inside, children shrieked and chased each other with streamers. A mountain of gifts sat unattended near the gift table. "We go in during musical chairs," Molly said. "Everyone watches musical chairs." "Even the parents?" "Especially the parents. They get weirdly competitive." Bill nodded slowly, absorbing the genius of her strategy. "And the presents?" "Duffel bags. We're in and out in ninety seconds." "What if someone sees us?" "We tell them we're clowns." "We don't look like clowns." Molly stared at him. "Bill, you're wearing face paint." Bill touched his cheek. "Oh. Right."
created_at: "2026-02-18T02:27:12.342918+00:00" (2/18/2026, 2:27:12 AM)
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 565
moderation_latency: 185
generation_time: 6813
tokens_prompt: 77
tokens_completion: 322
native_tokens_prompt: 75
native_tokens_completion: 324
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.005085
router: (null)
provider_responses:
  0:
    id: "9765673a-de3c-4605-a03a-c222b4a62404"
    status: 200
    is_byok: false
    latency: 564
    endpoint_id: "83d802cb-44f2-45a0-a9d5-cb9bf79777c8"
    provider_name: "Amazon Bedrock"
    model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
api_type: "completions"
id: "gen-1771381624-uak0P3ETzuWehnBGB0wD"
upstream_id: "9765673a-de3c-4605-a03a-c222b4a62404"
total_cost: 0.005085
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
Evaluation details
Result: 95.99%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false; words: 192

Result: 54.09%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 58.85%, Deviation: 8.85%
Meta Data: neededClean: false; wordsTotal: 192; wordsDialogue: 113

Overall: 75.0363%
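
The ratio and deviation reported above can be reproduced from the word counts in the evaluator metadata. The sketch below is illustrative only: the function names are hypothetical, the word-splitting and quote-matching rules are assumptions rather than NC Bench's actual implementation, and the page does not state how the 95.99% and 54.09% scores are derived from these values, so that step is not attempted. The final lines only check that the overall score is consistent with an unweighted mean of the two displayed evaluator scores, which is also an assumption.

```python
import re

# Target taken from the scenario "Write 200 words with 50% dialogue".
TARGET_RATIO = 0.50


def count_words(text):
    """Count whitespace-separated words (assumed tokenization)."""
    return len(text.split())


def count_dialogue_words(text):
    """Count words inside straight double quotes (assumed dialogue rule)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in spans)


def ratio_and_deviation(words_dialogue, words_total, target=TARGET_RATIO):
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Values reported in the evaluator metadata: wordsDialogue=113, wordsTotal=192.
ratio, deviation = ratio_and_deviation(113, 192)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints: Ratio: 58.85%, Deviation: 8.85%

# Assumption: the overall score is an unweighted mean of the two evaluators.
mean_score = (95.99 + 54.09) / 2
print(f"Mean of evaluator scores: {mean_score:.2f}%")
# prints: Mean of evaluator scores: 75.04%
```

The mean of the two displayed scores, 75.04%, agrees with the reported overall of 75.0363% to within the rounding of the displayed inputs.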