NC Bench
Run: 2026-02-18-01KHQ92RWZ1CH04B8CAMYN8ZH2
Run Details
Model: Claude Sonnet 4.6
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 16.1s
Cost: $0.010185
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator
98.71% | Matches word count
87.13% | Dialogue to Total Word Ratio
92.9213% | Overall
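The Overall figure appears consistent with a plain average of the two evaluator scores. The page does not state how the aggregate is computed, so the sketch below is an assumption checked only against the displayed numbers; `overall_score` is a hypothetical name.

```python
def overall_score(scores):
    """Arithmetic mean of evaluator scores.

    ASSUMPTION: the page does not document the aggregation; a simple
    mean merely happens to match the displayed values.
    """
    if not scores:
        raise ValueError("need at least one score")
    return sum(scores) / len(scores)


# Scores as displayed in the summary (rounded to two decimals on the page).
displayed = [98.71, 87.13]
mean = overall_score(displayed)

# The page shows 92.9213%; the rounded inputs agree with it to within 0.01.
print(f"{mean:.2f}%")
```

The small gap between this result and the displayed 92.9213% would be explained if the evaluators keep more decimal places internally than the summary shows.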
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The rain tapped against the window of Bill's van like an impatient customer waiting for a table. Molly spread the hand-drawn map across the dashboard, knocking an empty coffee cup onto the floor where it joined several of its deceased brothers.
"The bouncy castle goes up at noon," Molly said, tapping the paper with her pen. "That's our window."
Bill squinted at the map. He had the face of a man who had made many questionable decisions and was actively in the process of making another one. "Why is the birthday kid drawn as a dragon?"
"Because I don't know what he looks like and I panicked. Focus."
The target was a suburban backyard birthday party for one Theo Pemberton, who was turning seven and had apparently been gifted with the kind of parents who invited forty-five children and hired a magician named The Spectacular Dennis. Bill had spotted the invitation on the ground outside the post office three days ago and, rather than doing the reasonable thing, had called Molly immediately.
They had been thieves for eleven years. They had robbed a bank, three jewelry stores, and one particularly confusing antique market where nothing had been labeled correctly. Stealing birthday presents from a seven-year-old was undeniably a new low, though neither of them was willing to say it out loud.
"The presents get stacked near the back fence," Molly continued. "Parents always do that so the kids don't tear into everything before cake. We go over the fence, load the bag, and we're out before The Spectacular Dennis finishes his second act."
Bill rubbed his jaw. "What does a children's magician do in a second act?"
"I assume he pulls out a slightly larger rabbit."
She had timed everything. The catering van arrived at eleven forty-five, which would create a distraction at the front gate. The bouncy castle inflating would mask the sound of them landing on the other side of the fence.
She had even researched Dennis, whose real name was Gary, and knew his set ran approximately forty-two minutes including audience participation.
"We're stealing toys from children," Bill said.
"We're redistributing presents from an affluent neighborhood into the general economy."
"Is the general economy a pawn shop on Fletcher Street?"
"The economy is a complicated system, Bill."
He looked at the map again. At the dragon. At the small square labeled CAKE which Molly had drawn with unnecessary detail including what appeared to be fondant decorations.
"You drew the cake pretty well," he said.
"I got excited and lost focus for a moment."
Bill folded his arms and stared through the rain-streaked windshield at the quiet street ahead. A dog walked past without a care in the world, which felt deeply unfair. He thought about his choices. He thought about the magician named Gary. He thought about forty-five children and one dragon-shaped birthday boy.
"Right then," he said, and started the van.
created_at: "2026-02-18T02:26:04.517538+00:00" (2/18/2026, 2:26:04 AM)
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1134
moderation_latency: 239
generation_time: 15687
tokens_prompt: 77
tokens_completion: 756
native_tokens_prompt: 75
native_tokens_completion: 664
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.010185
router: (null)
provider_responses:
  0:
    id: "c62ee2b8-04a6-4583-bb97-3d2ab9a00597"
    status: 200
    is_byok: false
    latency: 1134
    endpoint_id: "83d802cb-44f2-45a0-a9d5-cb9bf79777c8"
    provider_name: "Amazon Bedrock"
    model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
    api_type: "completions"
id: "gen-1771381547-fc1f4nUA2IF2i1ReqHo6"
upstream_id: "c62ee2b8-04a6-4583-bb97-3d2ab9a00597"
total_cost: 0.010185
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
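As a sanity check on the metadata, the reported total_cost can be reproduced from the native token counts under an assumed price list. The rates below (3 USD per million input tokens, 15 USD per million output tokens) are an assumption and do not appear on the page, as is the reading of generation_time as milliseconds; the function names are hypothetical.

```python
from decimal import Decimal

# ASSUMPTION: per-million-token prices in USD; not stated anywhere on the page.
INPUT_USD_PER_MTOK = Decimal("3")
OUTPUT_USD_PER_MTOK = Decimal("15")


def estimate_cost(prompt_tokens, completion_tokens):
    """Estimate request cost in USD from native token counts."""
    million = Decimal(1_000_000)
    total = (prompt_tokens * INPUT_USD_PER_MTOK
             + completion_tokens * OUTPUT_USD_PER_MTOK)
    return total / million


def tokens_per_second(completion_tokens, generation_time_ms):
    """Output throughput, ASSUMING generation_time is in milliseconds."""
    return completion_tokens / (generation_time_ms / 1000)


# native_tokens_prompt = 75, native_tokens_completion = 664
cost = estimate_cost(75, 664)
print(cost)  # compare with total_cost: 0.010185

# native_tokens_completion = 664, generation_time = 15687
print(round(tokens_per_second(664, 15687), 1))
```

Under these assumptions the estimate matches the reported cost when the native counts (75 and 664) are used rather than tokens_prompt and tokens_completion (77 and 756), which suggests billing follows the provider's own tokenizer.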
Evaluation details
Result | Evaluator | Details | Meta Data
98.71% | Matches word count | n/a | neededClean: false, words: 494
87.13% | Dialogue to Total Word Ratio | Ratio: 28.34%, Deviation: 1.66% | neededClean: false, wordsTotal: 494, wordsDialogue: 140
92.9213% | Overall | |
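The ratio and deviation in the details follow directly from the reported word counts: 140 dialogue words out of 494 total, measured against the 30% target in the scenario. The sketch below reproduces that arithmetic and shows one simple way dialogue words could be counted (whitespace-separated words inside straight double quotes). The counting rule and all function names are assumptions; the benchmark's actual tokenization and the curve that maps a 1.66% deviation to an 87.13% score are not given on the page, so neither is reproduced here.

```python
import re


def count_words(text):
    """Count whitespace-separated words."""
    return len(text.split())


def count_dialogue_words(text):
    """Count words inside straight double quotes (assumed dialogue rule)."""
    spoken = re.findall(r'"([^"]*)"', text)
    return sum(count_words(segment) for segment in spoken)


def ratio_and_deviation(words_dialogue, words_total, target_pct):
    """Return (dialogue ratio %, absolute deviation from target %)."""
    ratio = 100 * words_dialogue / words_total
    return round(ratio, 2), round(abs(target_pct - ratio), 2)


# Values reported in the evaluation details.
print(ratio_and_deviation(140, 494, 30))  # (28.34, 1.66)

# Counting rule applied to one line from the model response.
sample = 'Bill rubbed his jaw. "What does a children\'s magician do in a second act?"'
print(count_dialogue_words(sample), count_words(sample))  # 10 14
```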