NC Bench
Run: 2026-02-24-01KJ8X572V9NJ83Q97Q5SFAWAH
Run Details
Model: Claude Sonnet 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 22.0s
Cost: $0.013266
Provider: Anthropic
Evaluation summary
Score      Evaluator                      Details Link
43.38%     Matches word count             (details)
0.07%      Dialogue to Total Word Ratio   (details)
21.7234%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crayon-drawn map across the hood of the van and tapped it twice with a gloved finger. "The bouncy castle goes up at two. That's our window." Bill squinted at the map. A lopsided house sat in the middle surrounded by what appeared to be stick figures holding balloons. "Did you draw this yourself?" "I did reconnaissance." "You drew suns with smiley faces, Molly." "It was for cover. I looked like a neighborhood kid." "You're forty-three." "A youthful forty-three." She pulled the map closer and pointed. "Now listen. The presents go in the dining room. Parent supervision clusters near the cake table, which is here. The DJ, who is apparently a man named Funtastic Dave, sets up on the patio and keeps everyone distracted." Bill leaned against the van and crossed his arms. "We're robbing a seven-year-old's birthday party." "We're liberating merchandise from an unsecured venue." "There'll be children everywhere." "Children are easy to avoid. They travel in loud, predictable herds. You hear screaming, you go the other direction. Basic fieldwork." Bill rubbed his forehead slowly. "What about the parents?" "Two glasses of prosecco each and they'll be discussing their kitchen renovations. We'll be invisible." "And the dog I saw in the yard photos?" Molly hesitated for just a fraction of a second. "Buddy is a complication." "Buddy is a German Shepherd the size of a dining table." "Who reportedly loves hot dogs." She produced a zip-lock bag from her jacket pocket and held it up proudly. "I have eight." Bill stared at the bag. "You bought bribes for the dog." "I prefer the term diplomatic currency." "Molly, this is insane. Even for us." "Bill, do you know what's in that dining room right now? I did the guest list research. Forty-two children. Average gift value, thirty-five pounds. That's nearly fifteen hundred pounds in Lego, remote control cars, and craft sets we can flip before Monday morning." Bill was quiet for a moment. 
A pigeon landed on the van roof and regarded them both judgmentally before flying away. "I want sixty percent," Bill said finally. "Fifty-five." "Sixty. I'm the one dealing with Buddy." "Fine. Sixty. But you carry the bag." "Obviously I carry the bag, it's my bag now, I negotiated for it." Molly rolled up the crayon map and tucked it inside her jacket. She checked her watch, then looked across the street at the house where clusters of foil balloons were already bobbing cheerfully above the garden gate. The distant, tinny opening notes of a party playlist drifted through the afternoon air. "Funtastic Dave is early," she said. "Is that good or bad?" "For Dave, probably good. For us?" She pulled on her cap. "Game time." Bill grabbed the hot dogs and followed her across the road, absolutely certain this was the worst plan either of them had ever agreed to, and equally certain they were doing it anyway.
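The scenario asks for 70% of the words to be dialogue. A rough way to measure that on a response like the one above is to count the words inside double quotes and divide by the total word count. The sketch below is only an illustration: the function name dialogue_ratio, the whitespace tokenization, and the quote handling are assumptions, not the benchmark's actual evaluator, so its counts may differ from the ones the page reports.

```python
import re

def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a prose passage.

    Hypothetical helper: words are whitespace-separated tokens, and
    dialogue is anything between a pair of straight or curly double quotes.
    """
    total = len(text.split())
    # Each match yields a tuple; exactly one of the two groups is non-empty.
    spans = re.findall(r'"([^"]*)"|\u201c([^\u201d]*)\u201d', text)
    dialogue = sum(len((straight or curly).split()) for straight, curly in spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio

sample = (
    '"The bouncy castle goes up at two. That\'s our window." '
    "Bill squinted at the map."
)
dialogue, total, ratio = dialogue_ratio(sample)
print(dialogue, total, f"{ratio:.2%}")
```

Apostrophes inside contractions are left alone because only double quotes delimit dialogue here; a passage that uses single quotes for speech would need a different rule.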
created_at
"2026-02-24T22:43:44.869Z"
2/24/2026, 10:43:44 PM
model
"anthropic/claude-4.6-sonnet-20260217"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
770
moderation_latency
153
generation_time
21789
tokens_prompt
77
tokens_completion
886
native_tokens_prompt
92
native_tokens_completion
866
native_tokens_completion_images
(null)
native_tokens_reasoning
112
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.013266
router
(null)
provider_responses
0
endpoint_id
"6d408764-7dd7-4626-bb87-a6cc1589bc86"
id
"msg_01BfycS5zzDokH16h4FurKoA"
is_byok
false
latency
770
model_permaslug
"anthropic/claude-4.6-sonnet-20260217"
provider_name
"Anthropic"
status
200
api_type
"completions"
id
"gen-1771973024-QBrkhCq4T5KekAPoJ27j"
upstream_id
"msg_01BfycS5zzDokH16h4FurKoA"
total_cost
0.013266
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
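The token counts and the cost in the metadata above can be cross-checked with a little arithmetic. The sketch below assumes per-million-token prices of $3.00 for input and $15.00 for output; those prices do not appear on this page and are an assumption, as is the idea that all 866 native completion tokens are billed at the output rate. Under those assumptions the result lines up with the reported total_cost. The throughput figure is simply native_tokens_completion divided by generation_time, read as milliseconds.

```python
# Assumed prices in USD per million tokens (NOT stated on this page).
ASSUMED_INPUT_PRICE = 3.00
ASSUMED_OUTPUT_PRICE = 15.00

# Values copied from the run metadata.
native_tokens_prompt = 92
native_tokens_completion = 866
generation_time_ms = 21789

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the request cost in USD from token counts and assumed prices."""
    micro_dollars = (
        prompt_tokens * ASSUMED_INPUT_PRICE
        + completion_tokens * ASSUMED_OUTPUT_PRICE
    )
    return micro_dollars / 1_000_000

cost = estimate_cost(native_tokens_prompt, native_tokens_completion)
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"estimated cost: ${cost:.6f}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```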
Evaluation details
Result     Evaluator                      Details                            Meta Data
43.38%     Matches word count             n/a                                neededClean: false, words: 483
0.07%      Dialogue to Total Word Ratio   Ratio: 57.93%, Deviation: 12.07%   neededClean: false, wordsTotal: 492, wordsDialogue: 285
21.7234%   Overall
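The ratio and deviation reported for the dialogue evaluator follow directly from the word counts: 285 dialogue words out of 492 total, compared against the 70% target from the scenario. The sketch below reproduces those two figures. The last step, treating the overall score as the plain mean of the two evaluator scores, is an assumption; it comes out close to the reported 21.7234% but the page does not state how the overall score is computed, and the mapping from deviation to the 0.07% score is not shown either.

```python
# Values copied from the evaluation details.
words_dialogue = 285
words_total = 492
target_ratio_pct = 70.0            # from the scenario: 70% dialogue
evaluator_scores_pct = [43.38, 0.07]

# Share of dialogue words, as a percentage of all words.
ratio_pct = 100 * words_dialogue / words_total

# Absolute distance from the target share, in percentage points.
deviation_pct = abs(target_ratio_pct - ratio_pct)

# ASSUMPTION: overall score taken as the unweighted mean of the evaluators.
overall_pct = sum(evaluator_scores_pct) / len(evaluator_scores_pct)

print(f"ratio: {ratio_pct:.2f}%")
print(f"deviation: {deviation_pct:.2f}%")
print(f"mean of evaluator scores: {overall_pct:.3f}%")
```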