NC Bench
Run: 2026-02-24-01KJ8X5V9G7Z3MQCTQTD3FJYY2
Run Details
Model: Claude Sonnet 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 1m 12s
Cost: $0.106686
Provider: Amazon Bedrock
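The reported cost can be sanity-checked against the native token counts recorded in this run's metadata (native_tokens_prompt 92, native_tokens_completion 7094). The sketch below assumes per-token rates of $3 per million prompt tokens and $15 per million completion tokens. Those rates are an assumption, not something this page states, and the variable names are illustrative only.

```python
# Token counts as recorded in this run's metadata.
native_tokens_prompt = 92
native_tokens_completion = 7094

# ASSUMED rates in US dollars per million tokens (not stated on this page).
prompt_usd_per_million = 3
completion_usd_per_million = 15

# Work in whole micro-dollars to avoid floating-point drift, then convert once.
cost_micro_usd = (
    native_tokens_prompt * prompt_usd_per_million
    + native_tokens_completion * completion_usd_per_million
)
estimated_cost = cost_micro_usd / 1_000_000

print(f"estimated cost: ${estimated_cost:.6f}")
```

Under these assumed rates the estimate lines up with the Cost figure shown for the run. A different provider price list would give a different result.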
Evaluation summary
Score | Evaluator | Details Link
100.00% | Matches word count | (details)
89.93% | Dialogue to Total Word Ratio | (details)
94.9643% | Overall |
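This page does not show how the dialogue ratio evaluator counts words. As a rough sketch only, assuming dialogue means any text between straight double quotes and a word is any whitespace-separated token, such a ratio could be measured as below. The function name `dialogue_ratio` is hypothetical and is not the benchmark's own code.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a piece of prose.

    ASSUMPTION: dialogue is the text between pairs of straight double
    quotes, and words are whitespace-separated tokens.
    """
    total_words = len(text.split())
    # Capture the contents of each "..." span, excluding the quote marks.
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


sample = 'Bill squinted at the blueprints. "Which room is the living room?"'
dialogue_words, total_words, ratio = dialogue_ratio(sample)
print(dialogue_words, total_words, f"{ratio:.2%}")
```

A real evaluator may treat curly quotes, nested quotes, or hyphenated words differently, so counts from this sketch need not match the reported ones.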
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn blueprints across the hood of their rusted getaway van, a 1997 Dodge Caravan they had named The Phantom for reasons that made considerably more sense after three glasses of wine. The blueprints, rendered in crayon because Bill had once again misplaced the good markers, depicted what appeared to be a moderately sized suburban home with a large backyard. Someone had drawn a small smiling sun in the upper corner. That had been Bill. "The target is the Henderson place on Maple Grove," Molly said, tapping the paper. "Little Tyler Henderson turns seven tomorrow. Intel says there are roughly forty wrapped presents stacked in the living room." Bill squinted at the blueprints. "Which room is the living room?" "The one I labeled living room." "I thought that was a swimming pool." Molly rubbed her temples with the focused patience of a woman who had spent seventeen years in the criminal underworld and somehow found this partnership to be her most challenging assignment yet. Bill was not a stupid man. He simply existed at a slight perpendicular angle to reality, like a picture frame that no amount of nudging could ever straighten. They had been casing the Henderson property for four days. Molly had disguised herself as a jogger, which naturally required her to actually jog, something she had not fully anticipated and was still recovering from emotionally. Bill had attempted a disguise as a meter reader but had forgotten that meter readers generally carry equipment of some variety and had spent nearly forty uncomfortable minutes standing near the curb holding a blank clipboard until a sympathetic neighbor brought him lemonade and asked if he needed to sit down. "The party starts at two," Molly continued, unrolling a second set of notes with a grape juice stain obscuring what she believed was the back entrance. "The adults will be distracted. There is a clown booked. People always watch the clown." "I don't watch the clown," Bill said. 
"Nobody is asking you to watch the clown." "I'm just saying, if the clown notices us—" "The clown will not notice us. We move with precision and silence and absolute professionalism." She paused carefully. "We will also be wearing the matching cat burglar outfits I ordered online." Bill reached into the back of the van and produced two black bodysuits, each one featuring small decorative cat ears sewn into the hood. He held his against himself and studied his reflection in the van's side mirror with an expression suggesting a complicated internal negotiation. "Mine has a tail," he said finally. "They both have tails." "Yours has glitter." "I may have upgraded mine slightly." Bill set his suit back in the bag and gazed out at the quiet suburban street stretching before them. Somewhere behind those tidy hedgerows, forty wrapped presents sat in complete and innocent ignorance of what was coming their way. He cracked his knuckles with considerable ceremony. "I still think we should've just stolen the birthday cake," he said.
created_at: "2026-02-24T22:44:05.56Z" (2/24/2026, 10:44:05 PM)
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 723
moderation_latency: 141
generation_time: 71518
tokens_prompt: 77
tokens_completion: 2063
native_tokens_prompt: 92
native_tokens_completion: 7094
native_tokens_completion_images: (null)
native_tokens_reasoning: 1274
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.106686
router: (null)
provider_responses:
  0:
    endpoint_id: "83d802cb-44f2-45a0-a9d5-cb9bf79777c8"
    id: "6e9829c3-1757-46f2-af4e-c5b0649c0c40"
    is_byok: false
    latency: 688
    model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
    provider_name: "Amazon Bedrock"
    status: 200
api_type: "completions"
id: "gen-1771973045-bclJoWCEUvUdbj7q9eW0"
upstream_id: "6e9829c3-1757-46f2-af4e-c5b0649c0c40"
total_cost: 0.106686
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false, words: 500
89.93% | Dialogue to Total Word Ratio | Ratio: 28.54%, Deviation: 1.46% | neededClean: false, wordsTotal: 501, wordsDialogue: 143
94.9643% | Overall | |
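The ratio and deviation reported for the dialogue evaluator follow directly from its word counts, and the overall figure is consistent with a plain average of the two evaluator scores. The sketch below reproduces that arithmetic. Treating the overall score as an equal-weight mean is an assumption, and the formula that maps the deviation to the 89.93% score is not shown on this page, so it is not reconstructed here.

```python
# Counts reported by the Dialogue to Total Word Ratio evaluator.
words_dialogue = 143
words_total = 501
target_percent = 30.0  # from the scenario: 30% dialogue

ratio_percent = 100 * words_dialogue / words_total
deviation_percent = abs(target_percent - ratio_percent)

# Evaluator scores as displayed (rounded to two decimals on the page).
scores = [100.00, 89.93]
# ASSUMPTION: the overall score is the unweighted mean of evaluator scores.
overall = sum(scores) / len(scores)

print(f"ratio: {ratio_percent:.2f}%")
print(f"deviation: {deviation_percent:.2f}%")
print(f"overall (from rounded scores): {overall:.4f}%")
```

The mean of the rounded scores comes out marginally above the displayed 94.9643%, which suggests the page averages unrounded evaluator scores.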