NC Bench
Run: 2026-02-18-01KHQ955QYQA549PSNKP7Z0SZD
Run Details
Model: Claude Sonnet 4.6
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 18.3s
Cost: $0.010575
Provider: Anthropic
Evaluation summary
Score | Evaluator
75.16% | Matches word count
6.03% | Dialogue to Total Word Ratio
40.5921% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The rain hammered against the windshield of Bill's rusted van as he spread the hand-drawn map across the steering wheel. Molly sat in the passenger seat wearing a black turtleneck, which she had insisted was absolutely necessary for the operation despite it being seventy-eight degrees outside. "Okay, walk me through the plan one more time," Molly said, pulling the turtleneck down from her chin where it had been riding up. "We go in through the back gate," Bill said, tapping the map with a stubby finger, "during the magician's act. Kids will be distracted." "What if the magician is bad? What if the kids aren't distracted?" "All magicians at kids' parties are bad, Molly. That is a scientific fact." She studied the crayon drawing he had produced, noting that he had labeled the bouncy castle as both a bouncy castle and a fortress, having crossed out the first label and written the second one thinking it sounded more professional. "I still think we should go in as clowns," Molly said. "We've been over this. Nobody trusts clowns." "Nobody suspects clowns either." "Those are not the same thing and the distinction matters enormously." Bill folded the map poorly, which was obvious because it ended up roughly the shape of a tennis ball. "The presents get loaded into my bag, we exit through the side fence, and we're gone before anyone cuts the cake." Molly pressed her nose against the rain-streaked window and peered toward the house, where colorful balloons were tied to the mailbox and a banner read HAPPY BIRTHDAY COOPER in letters that seemed aggressively cheerful. "How old is this Cooper kid anyway?" she asked. "Six." "Six." She let the number sit in the air for a moment. "We are stealing birthday presents from a six year old." "We are liberating merchandise from an unsecured venue." "That is the same thing, Bill." "Again, I really feel like you're missing the nuance here." 
The magician arrived in a small purple car and stepped out wearing a cape that got caught immediately in the car door. He spent forty-five seconds fighting his own cape before winning and striding toward the backyard with the dignity of a man who had absolutely not just lost a fight to clothing. "Okay," Molly said, watching him disappear around the corner. "I give him three minutes before the kids revolt." "I give him two," Bill said, pulling ski masks from the glove compartment. "We move on the rabbit. Magicians always do a rabbit. When those kids see that rabbit, chaos happens, and chaos is our friend." Molly took her ski mask and stared at it sadly. "I genuinely cannot believe this is my life." "You said you wanted adventure." "I meant hiking, Bill." "Well." He pulled his mask on, and it went slightly sideways so one eye hole sat near his ear. "This is more or less the same thing."
created_at: "2026-02-18T02:27:25.484507+00:00" (2/18/2026, 2:27:25 AM)
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 519
moderation_latency: (null)
generation_time: 18238
tokens_prompt: 77
tokens_completion: 739
native_tokens_prompt: 75
native_tokens_completion: 690
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.010575
router: (null)
provider_responses:
  0:
    id: "msg_01DCifPZjyS1VgNnqo1GNL8b"
    status: 200
    is_byok: false
    latency: 516
    endpoint_id: "6d408764-7dd7-4626-bb87-a6cc1589bc86"
    provider_name: "Anthropic"
    model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
api_type: "completions"
id: "gen-1771381626-r1m7fY6mdeaetWfFGx4M"
upstream_id: "msg_01DCifPZjyS1VgNnqo1GNL8b"
total_cost: 0.010575
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
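The reported cost and duration can be cross-checked against the token counts in the metadata above. The following is a minimal sketch, not the benchmark's own accounting. It assumes rates of $3 per million input tokens and $15 per million output tokens; those rates are not stated anywhere in this run record and are used only because they happen to reproduce the reported total_cost. It also assumes generation_time is in milliseconds, which is consistent with the 18.3s duration. The function names are hypothetical.

```python
# Values copied from the run metadata above.
NATIVE_TOKENS_PROMPT = 75
NATIVE_TOKENS_COMPLETION = 690
GENERATION_TIME_MS = 18238  # ASSUMPTION: milliseconds (matches Duration: 18.3s)
REPORTED_TOTAL_COST = 0.010575

# ASSUMPTION: USD per million tokens; not stated in the run record.
ASSUMED_INPUT_RATE = 3.00
ASSUMED_OUTPUT_RATE = 15.00


def estimate_cost(prompt_tokens, completion_tokens,
                  input_rate=ASSUMED_INPUT_RATE,
                  output_rate=ASSUMED_OUTPUT_RATE):
    """Estimate the request cost in USD from token counts and per-million rates."""
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


def tokens_per_second(completion_tokens, generation_time_ms):
    """Average completion throughput over the whole generation."""
    return completion_tokens / (generation_time_ms / 1000)


cost = estimate_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION)
throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)

print(f"estimated cost: ${cost:.6f} (reported: ${REPORTED_TOTAL_COST})")
print(f"throughput: {throughput:.1f} completion tokens/s")
```

The estimate uses the native token counts (75 and 690) rather than tokens_prompt and tokens_completion (77 and 739), since under the assumed rates only the native counts reproduce the reported figure.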
Evaluation details
Result | Evaluator | Details | Meta Data
75.16% | Matches word count | n/a | neededClean: false, words: 487
6.03% | Dialogue to Total Word Ratio | Ratio: 42.51%, Deviation: 7.49% | neededClean: false, wordsTotal: 487, wordsDialogue: 207
40.5921% | Overall | |
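The ratio and deviation reported in the details above follow directly from the two word counts. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words, by summing the words inside straight double quotes. The quote-based counting and whitespace tokenization are assumptions; the evaluator's actual method is not documented on this page, and the formula that turns the deviation into the 6.03% score is not shown here, so it is not reproduced. All function names are hypothetical.

```python
import re


def count_words(text):
    """Count whitespace-separated tokens (ASSUMPTION: evaluator tokenization unknown)."""
    return len(text.split())


def count_dialogue_words(text):
    """Count words inside straight double quotes (ASSUMPTION: illustrative only)."""
    spoken = re.findall(r'"([^"]*)"', text)
    return sum(count_words(segment) for segment in spoken)


def dialogue_ratio(words_dialogue, words_total):
    """Fraction of all words that are dialogue."""
    return words_dialogue / words_total


def deviation_from_target(ratio, target=0.5):
    """Absolute distance from the requested dialogue share (50% in this scenario)."""
    return abs(target - ratio)


# Counts copied from the evaluation details above.
ratio = dialogue_ratio(207, 487)
deviation = deviation_from_target(ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Small usage example for the counting helpers.
sample = '"I meant hiking, Bill." He pulled his mask on.'
print(count_dialogue_words(sample), count_words(sample))
```

The Overall figure of 40.5921% is close to the mean of the two evaluator scores, (75.16 + 6.03) / 2 ≈ 40.595, which suggests simple averaging of unrounded scores but does not confirm it.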