NC Bench
Run: 2026-02-18-01KHQ9308CE969A9KFNAPJCCEQ
Run Details
Model: Claude Sonnet 4.6
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 15.7s
Cost: $0.010815
Provider: Amazon Bedrock
Evaluation summary
Matches word count: 90.48%
Dialogue to Total Word Ratio: 0.01%
Overall: 45.2476%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The rain hammered against the windshield of Bill's rusted Pontiac, which was parked conspicuously across the street from 14 Birchwood Lane. Inside, a child's birthday party was apparently in full swing, evidenced by the sound of shrieking seven-year-olds that somehow penetrated two hundred feet of soggy suburban air. Balloons in aggressive primary colors bobbed against the living room window like they were taunting the two criminals hunched over a hand-drawn map in the front seat. Molly pressed her finger against the paper. "The presents will be stacked near the fireplace. They always stack them near the fireplace." Bill squinted at her drawing, which looked less like a floor plan and more like a spider having a breakdown. "How do you know where they stack the presents?" "I was a child once, Bill." He considered arguing this point but decided against it. The plan, such as it was, involved waiting until the entertainer arrived. Molly had done her research, which meant she had driven past the house twice and read the invitation she'd stolen from the neighbor's recycling bin. A magician named The Spectacular Gerald was scheduled to perform at two o'clock, creating what Molly called a distraction window and what any reasonable person would call a man pulling scarves from his pocket while children screamed. Bill rubbed his jaw. "What if the dog is still in the yard?" "It's a Pomeranian named Cupcake." "Cupcakes can be vicious." Molly lowered her sunglasses to look at him directly, which was difficult since she was already wearing them inside a car on an overcast afternoon for reasons she had never satisfactorily explained. She pushed them back up without saying anything, which somehow communicated everything. They sat in silence watching a clown-themed delivery van pull up, which was either The Spectacular Gerald or the universe making things significantly worse. A small man in a purple vest climbed out carrying a suitcase that rattled ominously. 
Children immediately materialized at the window, pressing their faces against the glass like little flesh balloons of their own. This was the moment. Molly cracked her knuckles with great ceremony. Bill pulled his beanie down over his ears, which did nothing for his disguise but made him feel more criminal. They had rehearsed the extraction twice in Molly's kitchen using throw pillows as gifts, and both rehearsals had ended with Bill tripping over the cat. "You remember the signal?" Molly asked. "You whistle." "I whistle twice." "Right, twice." "The first whistle means I see someone coming. The second whistle means run." Bill nodded seriously. "What if you whistle once?" "Then I'm just whistling." They climbed out of the Pontiac with the quiet dignity of two people who had made genuinely poor life choices but remained deeply committed to them. The rain had softened to a mist. Somewhere inside the house, The Spectacular Gerald was already disappointing people. It was, by any professional measure, the perfect moment.
created_at: "2026-02-18T02:26:11.821824+00:00" (2/18/2026, 2:26:11 AM)
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 609
moderation_latency: 115
generation_time: 15478
tokens_prompt: 77
tokens_completion: 787
native_tokens_prompt: 75
native_tokens_completion: 706
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.010815
router: (null)
provider_responses:
  0:
    id: "1a86c35a-c2e5-47a5-aa82-7ba54a78be95"
    status: 200
    is_byok: false
    latency: 607
    endpoint_id: "83d802cb-44f2-45a0-a9d5-cb9bf79777c8"
    provider_name: "Amazon Bedrock"
    model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
api_type: "completions"
id: "gen-1771381555-SuY5TJjczgP6hKtU1kJI"
upstream_id: "1a86c35a-c2e5-47a5-aa82-7ba54a78be95"
total_cost: 0.010815
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
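The cost and timing fields above can be cross-checked with a little arithmetic. The sketch below is illustrative only: the per-token prices ($3 per million input tokens, $15 per million output tokens) are an assumption not stated on this page, and generation_time is assumed to be in milliseconds. The helper names estimate_cost and tokens_per_second are hypothetical. Under those assumptions, the native token counts reproduce the reported total_cost.

```python
# Assumed prices in USD per million tokens (NOT stated on the run page).
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate request cost in USD from token counts and the assumed prices."""
    total_micro_usd = (
        prompt_tokens * INPUT_PRICE_PER_MTOK
        + completion_tokens * OUTPUT_PRICE_PER_MTOK
    )
    return total_micro_usd / 1_000_000


def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Output throughput, assuming generation_time is reported in milliseconds."""
    return completion_tokens / (generation_time_ms / 1000)


# Values taken from the run metadata: native_tokens_prompt, native_tokens_completion,
# and generation_time.
cost = estimate_cost(75, 706)
print(f"{cost:.6f}")  # 0.010815, matching total_cost
print(f"{tokens_per_second(706, 15478):.1f} tokens/s")
```

Note that the match holds for the native token counts (75 and 706), not for the normalized tokens_prompt and tokens_completion values (77 and 787).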
Evaluation details
Matches word count: 90.48%
  Details: n/a
  Meta Data: neededClean: false, words: 490
Dialogue to Total Word Ratio: 0.01%
  Details: Ratio: 16.53%, Deviation: 13.47%
  Meta Data: neededClean: false, wordsTotal: 490, wordsDialogue: 81
Overall: 45.2476%
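The ratio and deviation figures above follow directly from the reported word counts (81 dialogue words out of 490, against a 30% target). The sketch below shows one way such numbers could be produced. It is not the benchmark's actual evaluator: the quote-matching rule (words between straight double quotes count as dialogue) and the function names dialogue_stats and ratio_and_deviation are assumptions made for illustration. How the evaluator turns a deviation into a score is not shown on this page, so that step is left out.

```python
import re


def dialogue_stats(text: str) -> dict:
    """Count total words and words inside straight double quotes (assumed rule)."""
    words_total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    return {"wordsTotal": words_total, "wordsDialogue": words_dialogue}


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_pct: float = 30.0) -> tuple:
    """Return (dialogue ratio, absolute deviation from target), both in percent."""
    ratio_pct = 100.0 * words_dialogue / words_total
    return round(ratio_pct, 2), round(abs(target_pct - ratio_pct), 2)


# Reproduce the reported figures from the reported counts.
ratio, deviation = ratio_and_deviation(81, 490)
print(ratio, deviation)  # 16.53 13.47

# The displayed overall score is consistent, to within rounding of the displayed
# per-evaluator scores, with an unweighted mean. This is an observation, not a
# documented rule.
mean_of_displayed = (90.48 + 0.01) / 2
print(abs(mean_of_displayed - 45.2476) < 0.005)  # True
```

For example, dialogue_stats('Molly nodded. "I whistle twice." Bill sighed. "Right, twice."') counts 9 words in total, 5 of them dialogue.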