NC Bench
Run: 2026-05-31-01KSYXBH14S69TMTE47F070D29
Run Details
Model: Claude Opus 4.8 (Reasoning, High)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 22.8s
Cost: $0.031065
Provider: Anthropic
Evaluation summary
Score | Evaluator
99.92% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.9609% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint across the table, smoothing it with her palm. "Okay, here's the deal. The Hendersons rented out the entire community center for little Timmy's birthday." "Timmy?" Bill leaned in, squinting. "We're robbing a kid named Timmy?" "We're not robbing Timmy. We're robbing the *presents.* There's a difference." "Is there, though?" Molly tapped the paper. "Focus. Forty kids, RSVP'd. That's forty gifts, minimum. You know what rich parents give these days? iPads. Drones. One kid's getting a pony." Bill perked up. "A pony?" "We're not stealing the pony, Bill." "Why not?" "Because we drive a Honda Civic." He sat back, deflated. "So what's the plan?" "Simple. We go in disguised as entertainment. I'll be the face painter. You'll be the clown." "Absolutely not." "Bill—" "I have a documented fear of clowns. *I'm* afraid of *clowns.* You want me to *become* the thing I fear?" "It's a costume." "It's a gateway to therapy." Molly pinched the bridge of her nose. "Fine. You're the magician. Magicians have big coats. Big coats have big pockets. You pocket the small stuff while I distract the parents." Bill considered this. "I do own a cape." "Why do you own a cape?" "That's between me and the cape." "Whatever. We arrive at two. Cake's at three. That's our window—everyone's singing, lights are low, candles are lit. Total chaos. We load the gifts into the van during 'Happy Birthday.'" "That song's only like thirty seconds long." "Then we sing it slow." Bill blinked. "We're singing?" "We're *committing.* We drag out every syllable. 'Haaaaappy biiiiirthday toooooo yooooou.' Buys us four minutes, easy." "And if a kid catches us?" Molly leaned in, dead serious. "Then you do a magic trick." "What trick?" "I don't know, pull a quarter from their ear. Kids love that garbage. They'll forget all about the man stuffing a Nintendo into his cape." Bill rubbed his chin. "And the getaway?" "Van's parked by the back exit, behind the bouncy castle. 
We're gone before they cut the cake." "What about goodie bags?" "What *about* goodie bags?" "I'm just saying, if we're already there—" "Bill, we are not taking the goodie bags." "They have those little spinny tops! And sometimes the chocolate coins—" "We're professionals." "Professionals enjoy chocolate coins!" Molly stared at him for a long moment. "Fine. You can have the goodie bags. But only if the heist goes clean." "Deal." He extended a hand. They shook. "One more thing," Molly said. "There's a piñata." "So?" "So when it breaks, candy goes everywhere. Forty kids hit the floor like piranhas. Use that. Nobody watches the gift table when there's free Smarties on the ground." Bill grinned. "You've thought of everything." "That's why I'm the brains." "And what am I?" "You're the guy in the cape." She rolled up the blueprint. "Get some sleep. Tomorrow, we ruin a child's birthday." Bill paused at the door. "You know we're terrible people, right?" Molly smiled. "Terrible people with a pony fund." "So we *are* getting the pony." "Goodnight, Bill."
created_at
"2026-05-31T11:40:21.937Z"
5/31/2026, 11:40:21 AM
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1077
moderation_latency
(null)
generation_time
22747
tokens_prompt
77
tokens_completion
848
native_tokens_prompt
93
native_tokens_completion
1224
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.031065
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_01P975KknSgwhs53uQUDWbW4"
is_byok
false
latency
1077
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780227621-sPo7zqAdGRC4mZIqQXoH"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1780227621-i1evgQTJ0btN0KXeHRA3"
upstream_id
"msg_01P975KknSgwhs53uQUDWbW4"
total_cost
0.031065
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
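The cost and timing fields above can be cross-checked with a little arithmetic. The sketch below is only a sanity check, not the provider's billing logic. The per-million-token rates are an assumption: they do not appear anywhere in the run record and were picked because they reproduce the reported total. The helper name `estimate_cost` is hypothetical, and `generation_time` is assumed to be in milliseconds because the run's Duration is shown as 22.8s.

```python
# Values copied from the run metadata above.
NATIVE_TOKENS_PROMPT = 93
NATIVE_TOKENS_COMPLETION = 1224
GENERATION_TIME_MS = 22747       # assumed to be milliseconds
REPORTED_TOTAL_COST = 0.031065   # USD, from total_cost / usage

# ASSUMPTION: these rates are not in the run record; they are illustrative
# values that happen to reproduce the reported total.
ASSUMED_USD_PER_M_INPUT = 5.0
ASSUMED_USD_PER_M_OUTPUT = 25.0


def estimate_cost(prompt_tokens, completion_tokens, usd_per_m_in, usd_per_m_out):
    """Linear token pricing: tokens times rate, with rates quoted per million tokens."""
    return (prompt_tokens * usd_per_m_in + completion_tokens * usd_per_m_out) / 1_000_000


estimated_cost = estimate_cost(
    NATIVE_TOKENS_PROMPT,
    NATIVE_TOKENS_COMPLETION,
    ASSUMED_USD_PER_M_INPUT,
    ASSUMED_USD_PER_M_OUTPUT,
)

# Output throughput over the generation window.
tokens_per_second = NATIVE_TOKENS_COMPLETION / (GENERATION_TIME_MS / 1000)

print(f"estimated cost: ${estimated_cost:.6f} (reported: ${REPORTED_TOTAL_COST:.6f})")
print(f"throughput: {tokens_per_second:.1f} completion tokens/s")
```

Under these assumed rates the estimate uses the native token counts (93 and 1224), not the normalized `tokens_prompt` and `tokens_completion` values (77 and 848), which would give a different figure.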
Evaluation details
Result | Evaluator | Details | Meta Data
99.92% | Matches word count | n/a | neededClean: false, words: 503
0.00% | Dialogue to Total Word Ratio | Ratio: 84.49%, Deviation: 14.49% | neededClean: false, wordsTotal: 503, wordsDialogue: 425
49.9609% | Overall
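The ratio and deviation reported for the dialogue evaluator follow directly from the two word counts and the 70% target in the scenario. The sketch below reproduces that arithmetic. It does not show how the evaluator counts dialogue words or how it maps a deviation to a score, since neither is described on this page. Treating the overall score as an unweighted mean of the two evaluator scores is an assumption that happens to match the displayed value to two decimals. The function name `dialogue_ratio_pct` is hypothetical.

```python
# Values copied from the evaluation details above.
TARGET_DIALOGUE_PCT = 70.0   # from the scenario: "70% dialogue"
WORDS_TOTAL = 503
WORDS_DIALOGUE = 425


def dialogue_ratio_pct(words_dialogue, words_total):
    """Share of words that are dialogue, as a percentage of all words."""
    return 100.0 * words_dialogue / words_total


ratio_pct = dialogue_ratio_pct(WORDS_DIALOGUE, WORDS_TOTAL)
deviation_pct = abs(ratio_pct - TARGET_DIALOGUE_PCT)

# ASSUMPTION: the overall score is the unweighted mean of the evaluator scores.
# The page shows 49.9609%, which suggests it averages unrounded scores; the
# rounded inputs below only agree with it to two decimals.
evaluator_scores_pct = [99.92, 0.00]
overall_pct = sum(evaluator_scores_pct) / len(evaluator_scores_pct)

print(f"ratio: {ratio_pct:.2f}%")          # ratio: 84.49%
print(f"deviation: {deviation_pct:.2f}%")  # deviation: 14.49%
print(f"overall: {overall_pct:.2f}%")      # overall: 49.96%
```

The response overshoots the dialogue target by about 14.5 percentage points, which is why this evaluator scores 0.00% even though the word count is within three words of 500.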