NC Bench
Run: 2026-02-24-01KJ8VNE98WNGRZTSPK207QZPV
Run Details
Model: Claude Sonnet 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 18.7s
Cost: $0.012891
Provider: Amazon Bedrock
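The reported cost can be cross-checked against the token counts listed in the run metadata (`native_tokens_prompt` and `native_tokens_completion`). The sketch below is only an illustration: the per-million-token prices are an assumption, not stated anywhere in this run record, chosen because they reproduce the reported figure. The function name `estimate_cost` is hypothetical.

```python
# Token counts as reported in the run metadata.
NATIVE_TOKENS_PROMPT = 92
NATIVE_TOKENS_COMPLETION = 841

# ASSUMPTION: USD prices per million tokens. These rates do not appear
# in the run record; they are illustrative values that happen to match
# the reported cost.
ASSUMED_INPUT_USD_PER_MTOK = 3.00
ASSUMED_OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost(prompt_tokens, completion_tokens, input_rate, output_rate):
    """Return the estimated request cost in USD for the given token counts."""
    total_micro_usd = prompt_tokens * input_rate + completion_tokens * output_rate
    return total_micro_usd / 1_000_000


cost = estimate_cost(
    NATIVE_TOKENS_PROMPT,
    NATIVE_TOKENS_COMPLETION,
    ASSUMED_INPUT_USD_PER_MTOK,
    ASSUMED_OUTPUT_USD_PER_MTOK,
)
print(f"${cost:.6f}")
```

Under these assumed rates the estimate agrees with the reported cost of $0.012891, which suggests billing was based on the native token counts rather than the normalized `tokens_prompt` and `tokens_completion` values.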
Evaluation summary
Score | Evaluator
43.38% | Matches word count
8.30% | Dialogue to Total Word Ratio
25.8396% | Overall
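The overall figure appears consistent with an unweighted mean of the two evaluator scores. The sketch below checks that reading using the rounded scores shown in the summary; the averaging rule and the name `overall_score` are assumptions, since the page does not state how the overall score is derived.

```python
# Per-evaluator scores as displayed in the summary (percent, rounded).
scores = {
    "Matches word count": 43.38,
    "Dialogue to Total Word Ratio": 8.30,
}


def overall_score(evaluator_scores):
    """Unweighted mean of the evaluator scores.

    ASSUMPTION: the page does not document the aggregation rule; a plain
    mean is used here because it matches the reported overall value.
    """
    return sum(evaluator_scores.values()) / len(evaluator_scores)


overall = overall_score(scores)
print(f"{overall:.2f}%")
```

The result differs from the reported 25.8396% only in the fourth decimal place, which is what one would expect if the page averages unrounded evaluator scores.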
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn blueprints across the hood of their rusted van and smoothed out the corners with her elbows. The papers were covered in crayon markings, which she felt was appropriate given the nature of the job.
"The party starts at two," she said, tapping a wobbly circle she'd labeled BOUNCY CASTLE. "The presents will be piled in the living room by two-fifteen. We move at two-thirty."
Bill squinted at the blueprint. "You drew this in crayon."
"I was in character. We're going in as clowns, Bill. Method work."
"I'm not wearing the wig again."
"You're wearing the wig."
Bill pulled his woolly hat down further over his ears and stared at the house across the street. It was a modest semi-detached home with balloons tied to the front gate and a banner that read HAPPY 7TH BIRTHDAY SOPHIE in glittery letters. A bouncy castle, exactly as predicted, was already inflating in the back garden with a sound like a dying whale.
"How many kids are we talking?" Bill asked.
"Fourteen confirmed RSVPs. Plus parents. The parents are the problem."
"The seven-year-olds aren't the problem?"
"Bill, I once stole a diamond necklace from a Monaco hotel guarded by three ex-military security personnel. I think I can handle Sophie's mum from Dunstable."
Bill considered this. "Fair point. What's the extraction plan?"
Molly rolled up the blueprints with the confidence of someone who had actually thought this through, which she had not entirely done. "We go in through the side gate during pass the parcel. Everyone is distracted, eyes on the middle of the room. You load the presents into the sacks. I keep watch by the patio doors."
"And the dog?"
Molly paused. "What dog?"
"There's a dog." Bill pointed. A large golden retriever was currently eating one of the balloons tied to the gate with tremendous enthusiasm. "I saw it this morning. It's enormous and it looks stupid, which means it's friendly, which means it'll bark at absolutely everything."
"We befriend the dog."
"We befriend the dog," Bill repeated slowly. "That's the plan. Befriend the dog."
"Do you have a better idea?"
He did not have a better idea. He patted his jacket pockets and produced a slightly squashed sausage roll he'd been saving for lunch and held it up between them.
"I suppose that works," Molly admitted.
They sat in the van for another ten minutes watching the bouncy castle reach full inflation. A tiny child immediately ran outside and threw herself onto it with complete abandon, squealing with a joy so pure it was almost unsettling.
Bill watched her bounce. "You know," he said, "we are absolutely terrible people."
"Tremendously terrible," Molly agreed cheerfully. "Right then. Wigs on."
Bill looked at the enormous curly orange wig sitting on the dashboard. He picked it up. He put it on. He looked in the wing mirror.
"I hate this job," he said.
created_at: "2026-02-24T22:17:39.378Z" (2/24/2026, 10:17:39 PM)
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 614
moderation_latency: 308
generation_time: 18279
tokens_prompt: 77
tokens_completion: 859
native_tokens_prompt: 92
native_tokens_completion: 841
native_tokens_completion_images: (null)
native_tokens_reasoning: 105
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.012891
router: (null)
provider_responses:
  0:
    endpoint_id: "83d802cb-44f2-45a0-a9d5-cb9bf79777c8"
    id: "0a7705da-1dea-4bac-a245-4e2c31d6c981"
    is_byok: false
    latency: 613
    model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
    provider_name: "Amazon Bedrock"
    status: 200
api_type: "completions"
id: "gen-1771971459-cPnZdr82cjwfgZQRSN5o"
upstream_id: "0a7705da-1dea-4bac-a245-4e2c31d6c981"
total_cost: 0.012891
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
43.38% | Matches word count | n/a | neededClean: false, words: 483
8.30% | Dialogue to Total Word Ratio | Ratio: 42.94%, Deviation: 7.06% | neededClean: false, wordsTotal: 489, wordsDialogue: 210
25.8396% | Overall
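The ratio and deviation in the details follow directly from the reported word counts and the 50% target in the scenario. A minimal sketch of that arithmetic is below; the variable names are illustrative, and it deliberately stops short of the 8.30% score, because the page does not say how a deviation is mapped to a score.

```python
# Word counts reported in the evaluator metadata.
WORDS_TOTAL = 489
WORDS_DIALOGUE = 210

# Target share of dialogue words, taken from the scenario ("50% dialogue").
TARGET_RATIO = 0.50

# Share of all words that fall inside dialogue.
ratio = WORDS_DIALOGUE / WORDS_TOTAL

# Absolute distance from the target, in ratio units.
deviation = abs(TARGET_RATIO - ratio)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

This reproduces the "Ratio: 42.94%, Deviation: 7.06%" line. Note that the two evaluators report different totals (483 words versus 489), so they presumably tokenize the response differently.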