NC Bench
Run: 2026-05-31-01KSYX810A7B0ZAAJFESVE7CTJ
Run Details
Model: Claude Opus 4.8 (Reasoning, High)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 21.3s
Cost: $0.028315
Provider: Anthropic
Evaluation summary
Score      Evaluator                       Details Link
86.38%     Matches word count              (details)
0.00%      Dialogue to Total Word Ratio    (details)
43.1921%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint across the table, smoothing out the crayon-stained corners. "Okay, so the target is the Henderson residence. Birthday party starts at two. We go in at two-thirty, peak chaos." Bill leaned over, squinting. "Is this drawn on the back of a coloring book?" "It's all I had. Focus, Bill." "There's a unicorn on the kitchen." "That's load-bearing intel," Molly snapped. "The presents will be stacked here, by the gift table, near the snacks." Bill scratched his chin. "And how do we get past the parents?" "Simple. We blend in. We become party guests." "We're grown adults, Molly. The oldest kid there will be, what, eight?" "That's why you're going as a clown." Bill recoiled. "Absolutely not." "You'd make a fantastic clown." "I have a documented fear of balloons." "You have a fear of balloons but not of breaking and entering?" "Balloons are unpredictable!" Bill said, jabbing a finger at her. "A safe just sits there. A balloon could pop at any moment. It's psychological warfare." Molly pinched the bridge of her nose. "Fine. You're the magician. You do the misdirection, I grab the loot." "Magician I can do. I've got the cape and everything." "You own a cape?" "For occasions," Bill said vaguely. "What's the haul looking like?" Molly's eyes gleamed. "Word is the birthday boy is turning seven, and his parents are loaded. We're talking premium toys. Limited edition action figures. Possibly a gaming console." "A console?" Bill whistled. "Now we're talking real money." "That's why I need you sharp. The moment those kids start the piñata, that's our window. Everyone's distracted, swinging blindfolded. We sweep the gift table and exit through the back gate." Bill nodded slowly. "And if a kid catches us?" "You do a magic trick. Pull a coin from their ear. Buy us thirty seconds." "I can't pull a coin from anyone's ear, Molly. I never learned the trick. I just bought the cape." Molly stared at him. "You bought the cape but not the skills?" "The cape was on sale!" 
"You are the worst criminal partner I have ever had." "You said that about the bank job too, and we got away clean." "We got away because the guard fell asleep, not because of your so-called plan." Bill folded his arms. "A win's a win. So what's our getaway vehicle?" Molly hesitated. "The ice cream truck." "You're joking." "Nobody suspects an ice cream truck. We blast the music, kids come running, we drive off slow and majestic." "Slow being the operative word. Those things top out at twelve miles an hour." "We won't need speed. We'll need cover, and there's no better cover than a man in a cape handing out free fudge pops." Bill considered this, then grinned. "Okay. I'm in. But I want it on record that I objected to the clown thing." "Noted. Now grab your cape, Bill." "What about masks?" Molly slid a paper plate across the table with two eyeholes cut into it. Bill sighed. "We're going to prison, aren't we?" "Probably. But we'll go in style."
created_at
"2026-05-31T11:38:27.463Z"
5/31/2026, 11:38:27 AM
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1075
moderation_latency
(null)
generation_time
20981
tokens_prompt
77
tokens_completion
811
native_tokens_prompt
93
native_tokens_completion
1114
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.028315
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_019aFdWutyUskRoBsq5KKtT3"
is_byok
false
latency
1075
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780227507-MF4YFHab8YDSsTYXPk5e"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1780227507-wHMiAmDpMGVyLLFtpj0K"
upstream_id
"msg_019aFdWutyUskRoBsq5KKtT3"
total_cost
0.028315
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
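The cost and timing fields above can be cross-checked with a little arithmetic. The sketch below is only a sanity check, not the provider's billing logic: the per-million-token rates `INPUT_RATE` and `OUTPUT_RATE` are hypothetical values that do not appear anywhere on this page. They were chosen because, applied to the native token counts, they reproduce the reported `total_cost`.

```python
# Values copied from the metadata fields above.
native_tokens_prompt = 93
native_tokens_completion = 1114
generation_time_ms = 20981

# Hypothetical per-million-token rates in USD (assumption, not stated on this page).
INPUT_RATE = 5.00
OUTPUT_RATE = 25.00

# Cost = prompt tokens at the input rate plus completion tokens at the output rate.
cost = (
    native_tokens_prompt * INPUT_RATE
    + native_tokens_completion * OUTPUT_RATE
) / 1_000_000

# Throughput over the generation window only (excludes the initial latency).
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"Estimated cost: ${cost:.6f}")
print(f"Throughput: {tokens_per_second:.1f} tokens/s")
```

Under those assumed rates the estimate equals the reported 0.028315, which suggests billing is based on the native token counts (93 and 1114) rather than the normalized ones (77 and 811).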
Evaluation details
Result     Evaluator                       Details                             Meta Data
86.38%     Matches word count              n/a                                 neededClean: false, words: 511
0.00%      Dialogue to Total Word Ratio    Ratio: 84.27%, Deviation: 14.27%    neededClean: false, wordsTotal: 515, wordsDialogue: 434
43.1921%   Overall
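The ratio and deviation reported for the dialogue evaluator follow directly from the word counts in its metadata. The sketch below reproduces that arithmetic. Treating the overall score as the unweighted mean of the two evaluator scores is an assumption: it is consistent with the figures on this page but is not documented here. The formula that maps 511 words to 86.38% is not shown on this page, so it is not reconstructed.

```python
# Counts reported in the evaluation metadata.
words_dialogue = 434
words_total = 515
target_ratio = 0.70  # from the scenario: 70% dialogue

# Share of words that fall inside dialogue, and its distance from the target.
ratio = words_dialogue / words_total
deviation = abs(ratio - target_ratio)

# Assumption: overall score is the unweighted mean of the two evaluator scores.
word_count_score = 86.38
dialogue_ratio_score = 0.00
overall = (word_count_score + dialogue_ratio_score) / 2

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
print(f"Overall (assumed mean): {overall:.2f}%")
```

The two evaluators report different totals (511 words versus 515), so they presumably tokenize the response differently. The ratio here uses the dialogue evaluator's own total.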