NC Bench
Run: 2026-02-18-01KHQ93XDH0VSQ7T4H5MZ9N5YR
Run Details
Model: Claude Sonnet 4.6
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 19.5s
Cost: $0.011625
Provider: Anthropic
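The duration and cost above can be sanity-checked against the values recorded in the response metadata further down (native_tokens_prompt 75, native_tokens_completion 760, generation_time 19423). The following is a minimal sketch, not the benchmark's own accounting. It assumes prices of $3 per million input tokens and $15 per million output tokens, which are not stated on this page, and it assumes generation_time is in milliseconds. Under those assumptions the estimate reproduces the reported cost of 0.011625.

```python
# Assumed prices in USD per million tokens. These are NOT stated on this page.
ASSUMED_INPUT_PRICE_PER_MTOK = 3.00
ASSUMED_OUTPUT_PRICE_PER_MTOK = 15.00

# Values copied from the response metadata of this run.
native_tokens_prompt = 75
native_tokens_completion = 760
generation_time_ms = 19423  # assumed to be milliseconds
reported_total_cost = 0.011625


def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  input_price: float, output_price: float) -> float:
    """Estimate cost in USD from token counts and per-million-token prices."""
    return (prompt_tokens * input_price
            + completion_tokens * output_price) / 1_000_000


estimated_cost = estimate_cost(
    native_tokens_prompt,
    native_tokens_completion,
    ASSUMED_INPUT_PRICE_PER_MTOK,
    ASSUMED_OUTPUT_PRICE_PER_MTOK,
)

# Output throughput: completion tokens divided by generation time in seconds.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"estimated cost: ${estimated_cost:.6f}")  # $0.011625
print(f"reported cost:  ${reported_total_cost:.6f}")
print(f"throughput:     {tokens_per_second:.1f} tokens/s")
```

The throughput works out to roughly 39 completion tokens per second. If the assumed prices differ from what the provider actually charged, only the two constants at the top need to change.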
Evaluation summary
Matches word count: 43.38%
Dialogue to Total Word Ratio: 99.74%
Overall: 71.5597%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The van smelled like old cheese and broken dreams. Molly spread the hand-drawn map across the dashboard while Bill ate a bag of crisps with the enthusiasm of a man who had nothing to feel guilty about.
"Right," Molly said, tapping the paper. "The party starts at two. We go in at three, when they're all sugared up and distracted."
Bill crunched loudly. "What's the theme again?"
"Dinosaurs."
"I love dinosaurs."
"Bill."
"Just saying."
Molly pressed her finger to a squiggly rectangle on the map. "The presents will be here. On the gift table. Adjacent to the bouncy castle."
"How big's the bouncy castle?"
"Why does that matter?"
Bill shrugged with the energy of a man who asks questions for sport. "Tactical awareness, Molly. What if we need a quick escape route and the bouncy castle is blocking the exit?"
"We walk around it."
"Or," Bill raised a finger, "we go through it."
Molly stared at him for a long moment. "We are not bouncing our way out of a heist."
"Fastest route between two points."
"That's not even remotely what that means."
Bill finished his crisps and scrunched the bag noisily. "What are we wearing? Please don't say clown costumes again. My therapist and I are still unpacking the last time."
"Party guests. Civilian clothes. Normal faces."
"I only have the one face."
"Then try to make it look like it belongs at a children's party and not in a police lineup."
Bill touched his cheek, mildly offended. "Children love my face. I have a friendly face."
"You have the face of someone who definitely took the last biscuit and blamed the dog."
"That was one time." He paused. "And the dog looked guilty."
Molly rolled up the map and pointed it at him. "Here's the plan. We arrive separately. You bring a gift so you blend in."
"What kind of gift?"
"Cheap. Wrapped."
"Can it be dinosaur-themed?"
"Bill, I will leave you in this van."
"Fine. Boring gift. Got it." He looked out the window at the terraced house with the cluster of balloons tied to the gate.
"Who are all these presents even from?"
"Grandparents, aunties, school friends. The usual suspects."
"Feels like a lot of effort for a six-year-old who won't remember any of it."
Molly looked at him. "Are you having a moral crisis right now? Because your timing is genuinely spectacular."
"Not a crisis. More of a philosophical wobble."
"Wobble on your own time. We've got a window of eight minutes once the cake comes out. They sing, we lift, we leave."
Bill nodded slowly. "What if the kid cries?"
"Kids always cry at parties. That's basically the whole point of parties."
"Fair." He straightened up and cracked his knuckles. "Okay. I'm focused. I'm professional. I'm ready."
Molly opened the van door. "Try not to go on the bouncy castle."
"Absolutely no promises."
created_at: "2026-02-18T02:26:45.45214+00:00" (2/18/2026, 2:26:45 AM)
model: "anthropic/claude-4.6-sonnet-20260217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 841
moderation_latency: (null)
generation_time: 19423
tokens_prompt: 77
tokens_completion: 753
native_tokens_prompt: 75
native_tokens_completion: 760
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.011625
router: (null)
provider_responses:
0:
id: "msg_01QBxkjEX2E8i1PgmZu5qxGB"
status: 200
is_byok: false
latency: 840
endpoint_id: "6d408764-7dd7-4626-bb87-a6cc1589bc86"
provider_name: "Anthropic"
model_permaslug: "anthropic/claude-4.6-sonnet-20260217"
api_type: "completions"
id: "gen-1771381585-jersoc6OjP6QU04PFLdO"
upstream_id: "msg_01QBxkjEX2E8i1PgmZu5qxGB"
total_cost: 0.011625
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
Evaluation details
Matches word count
Result: 43.38%
Details: n/a
neededClean: false
words: 483

Dialogue to Total Word Ratio
Result: 99.74%
Details: Ratio: 69.77%, Deviation: 0.23%
neededClean: false
wordsTotal: 483
wordsDialogue: 337

Overall
Result: 71.5597%
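The ratio and deviation reported for the dialogue evaluator follow directly from the two word counts and the 70% target named in the scenario. The sketch below recomputes them and also shows one way dialogue words could be counted from raw text. The counting rule (whitespace-split words between straight double quotes) and the function name `dialogue_ratio` are assumptions for illustration; the evaluator's actual tokenization is not documented on this page. The final lines check that the overall figure is consistent with an unweighted mean of the two evaluator scores, which is an inference rather than a stated rule.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a piece of prose.

    Assumption: dialogue is whatever sits between pairs of straight
    double quotes, and words are whitespace-separated tokens.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# Counts and target taken from this run's evaluation details and scenario.
WORDS_DIALOGUE = 337
WORDS_TOTAL = 483
TARGET_RATIO = 0.70

reported_ratio = WORDS_DIALOGUE / WORDS_TOTAL
deviation = abs(TARGET_RATIO - reported_ratio)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 69.77%, Deviation: 0.23%

# Assumed aggregation: unweighted mean of the two evaluator scores.
overall_mean = (43.38 + 99.74) / 2
print(f"Mean of evaluator scores: {overall_mean:.2f}%")  # 71.56%
```

On a short sample such as `'"Hello there," she said. "Go."'`, `dialogue_ratio` returns 3 dialogue words out of 5. The mean of the rounded scores, 71.56%, agrees with the reported 71.5597% to within rounding of the inputs.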