NC Bench
Run: 2026-06-01-01KT1H52462W526VZYXFWR6AYV
Run Details
Model: Claude Opus 4.8 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 21.6s
Cost: $0.029965
Provider: Anthropic
Evaluation summary
Score      Evaluator
93.65%     Matches word count
0.00%      Dialogue to Total Word Ratio
46.8248%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint across the kitchen table, smoothing out the crumpled corners with the reverence of a surgeon prepping for transplant. "Behold," she said. "The Hendersons' backyard. Site of tomorrow's heist of the century." Bill leaned over, squinting. "This is a napkin." "It's a *tactical* napkin." "You drew a bouncy castle in crayon." "That bouncy castle," Molly said, jabbing the orange smudge, "is our primary obstacle. Eight-year-olds, Bill. Dozens of them. Hopped up on sugar and rage." Bill rubbed his temples. "Remind me why we're robbing a child's birthday party?" "Because nobody expects it. The presents are unguarded, beautifully wrapped, and pre-sorted by value. Rich kids get expensive toys. It's a treasure pile, Bill. A glittering, ribbon-covered treasure pile." "And the parents?" "Distracted. There's a magician coming at two." She tapped the napkin again. "We strike during the disappearing dove trick. Maximum chaos, minimum supervision." Bill picked up a crayon, examining it like evidence. "Okay. Say I'm in. What's my role?" "You're the clown." There was a long, terrible silence. "I'm sorry," Bill said slowly. "The *what*?" "The clown. We need someone to blend in. A clown can walk anywhere at a kids' party. Nobody questions a clown." "Everybody questions a clown! Clowns are inherently suspicious!" "Exactly. So they'll be looking at your suspicious face and not at me, sneaking out the back with a sack of Nintendo Switches." Bill stood up, pacing. "I have a mustache, Molly. A full, magnificent mustache." "Clowns can have mustaches." "Name one!" She opened her mouth, then closed it. "That's not the point. The point is the costume covers everything. White face paint, big shoes, the wig—" "I am *not* wearing the wig." "You're wearing the wig." "My head is a very specific size, Molly. Wigs don't fit. They perch. I'll look like an egg wearing a toupee." Molly stood, matching his energy. "Do you want the loot or not? 
There's a kid named Tobias whose parents bought him a literal pony. A *pony*, Bill. We can't carry a pony, but the point stands. These people have money to burn and a complete absence of security." Bill stopped pacing. "How are we transporting everything?" "Ice cream truck." "We don't own an ice cream truck." "We will by tomorrow morning." She grinned. "I've already located one. Owner's named Dave. Dave owes me a favor." "Everyone owes you a favor. It's deeply concerning." Bill sat back down, defeated, and stared at the crayon castle. "Fine. I'm the clown. But I have conditions." "Name them." "One, I keep any candy I find. Two, if a single child cries, we abort. I'm a thief, not a monster." He paused. "Three, if anyone asks, my clown name is Sir Reginald Chuckletrousers." Molly extended her hand. "Deal, Sir Reginald." They shook. "This is going to be a disaster," Bill said. "Absolutely," Molly agreed cheerfully, rolling up the napkin. "But a *profitable* one. Now go practice your balloon animals. You're rusty."
created_at
"2026-06-01T12:04:50.472Z"
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1112
moderation_latency
(null)
generation_time
21520
tokens_prompt
77
tokens_completion
809
native_tokens_prompt
93
native_tokens_completion
1180
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.029965
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_01Lu2WEdQcSbXk37u572JRok"
is_byok
false
latency
1090
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780315490-kbEQSMIOrCxX4Czm5ZCd"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1780315490-acItQYCmCT033j6hjWkP"
upstream_id
"msg_01Lu2WEdQcSbXk37u572JRok"
total_cost
0.029965
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
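The metadata above can be cross-checked with a little arithmetic. The sketch below derives a completion throughput from `native_tokens_completion` and `generation_time`, and compares an estimated cost against `total_cost`. Two things here are assumptions rather than facts from the run: that `generation_time` is in milliseconds, and the per-million-token rates, which are hypothetical values chosen only because they reproduce the reported total.

```python
# Values copied from the run metadata above.
native_tokens_prompt = 93
native_tokens_completion = 1180
generation_time_ms = 21520   # assumption: the field is in milliseconds
total_cost_usd = 0.029965

# Hypothetical per-million-token rates. These are NOT stated in the source;
# they are illustrative values that happen to reproduce the reported cost.
ASSUMED_INPUT_USD_PER_MTOK = 5.0
ASSUMED_OUTPUT_USD_PER_MTOK = 25.0


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Average generation speed over the whole completion."""
    return tokens / (elapsed_ms / 1000.0)


def estimated_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Cost in USD under the assumed per-million-token rates."""
    return (
        prompt_tokens * ASSUMED_INPUT_USD_PER_MTOK
        + completion_tokens * ASSUMED_OUTPUT_USD_PER_MTOK
    ) / 1_000_000


throughput = tokens_per_second(native_tokens_completion, generation_time_ms)
estimate = estimated_cost(native_tokens_prompt, native_tokens_completion)

print(f"{throughput:.1f} completion tokens/s")
print(f"estimated ${estimate:.6f} vs reported ${total_cost_usd:.6f}")
```

If the assumed rates are wrong for this model, only the cost comparison changes; the throughput figure depends solely on the two metadata fields.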
Evaluation details
Result: 93.65%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean = false, words = 491

Result: 0.00%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 77.73%, Deviation: 27.73%
Meta Data: neededClean = false, wordsTotal = 494, wordsDialogue = 384
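The reported ratio and deviation follow directly from the two word counts: 384 dialogue words out of 494 total, measured against the 50% target in the scenario. The sketch below reproduces that arithmetic and shows one possible way to count dialogue words. The helper names are hypothetical, and the quote-matching rule (text between straight double quotes) is an assumption; the benchmark's actual evaluator is not shown on this page and may count differently.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue rule)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in spans)


def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Dialogue share of all words, as a percentage."""
    return 100.0 * words_dialogue / words_total


def deviation_pct(ratio_pct: float, target_pct: float = 50.0) -> float:
    """Absolute distance from the target share, in percentage points."""
    return abs(ratio_pct - target_pct)


# Counts taken from the evaluator metadata above.
ratio = dialogue_ratio_pct(384, 494)
deviation = deviation_pct(ratio)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 77.73%, Deviation: 27.73%"

# Small demonstration of the counting helpers on two lines from the scene.
sample = 'Bill leaned over, squinting. "This is a napkin." "It\'s a *tactical* napkin."'
sample_total = count_words(sample)
sample_dialogue = count_dialogue_words(sample)
```

How the 27.73-point deviation maps to a 0.00% score is not stated on the page, so the sketch stops at the deviation.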
Overall: 46.8248%
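The overall figure is consistent with an unweighted mean of the two evaluator scores. That aggregation rule is an assumption inferred from the numbers, not something the page states. Using the rounded scores shown above gives 46.825%, which agrees with the reported 46.8248% to within the rounding of the word-count score.

```python
from statistics import mean

# Evaluator scores as displayed on the page (already rounded to 2 decimals).
scores = {
    "Matches word count": 93.65,
    "Dialogue to Total Word Ratio": 0.00,
}

# Assumption: the overall score is the plain average of all evaluator scores.
overall = mean(scores.values())
reported_overall = 46.8248

print(f"computed {overall:.3f}% vs reported {reported_overall}%")
```

Because one evaluator scored 0.00%, the average is capped at half of the other score no matter how close the word count was to 500.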