NC Bench
Run: 2026-05-31-01KSYX9QBN3Y1VCXJ1RAA4DTXZ
Run Details
Model: Claude Opus 4.8 (Reasoning, High)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 19.7s
Cost: $0.027440
Provider: Anthropic
Evaluation summary
Score    | Evaluator                    | Details Link
99.98%   | Matches word count           | (details)
0.00%    | Dialogue to Total Word Ratio | (details)
49.9920% | Overall                      |
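The Overall figure is consistent with an unweighted mean of the two evaluator scores. That aggregation rule is an assumption inferred from the numbers shown here, not something the page states, and the function name below is hypothetical. A minimal Python sketch that backs out the unrounded word-count score under that assumption:

```python
def implied_first_score(overall: float, second_score: float) -> float:
    """Back out the first score, ASSUMING overall is the plain mean of two scores."""
    return 2 * overall - second_score

# Percentages as displayed in the evaluation summary.
overall = 49.9920
dialogue_ratio_score = 0.00

word_count_score = implied_first_score(overall, dialogue_ratio_score)
print(f"implied word-count score: {word_count_score:.3f}%")
print(f"shown at two decimals:    {word_count_score:.2f}%")
```

If the assumption holds, the implied value rounds to the 99.98% displayed for "Matches word count", which suggests the page shows that score at two decimals while averaging the unrounded value.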
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crayon-drawn map across the table, smoothing out the wrinkles with the reverence of a general planning an invasion. Bill leaned over her shoulder, squinting at the wobbly lines and the glittery sticker that marked their target. "Why is there a unicorn on the front door?" he asked. "That's the security system." "It's a sticker, Molly." "It's a *deterrent*, Bill. Now focus." The map detailed every inch of the Hendersons' backyard, where in approximately three hours, little Timmy Henderson would be turning seven, surrounded by a fortune in unopened gifts. Molly had cased the joint for a week, posing as a balloon delivery woman, which explained the helium tank still wheezing in the corner of their apartment. Bill tapped the table. "Walk me through it again." "Okay. The party starts at two. By two-thirty, sugar levels peak. Total chaos. Kids screaming, parents distracted, somebody's always crying in the bathroom." She traced a path with her finger. "We come in through the side gate dressed as clowns—" "I'm not being a clown." "You're being a clown." "Clowns are terrifying, Molly. Children will scream." "Children are *already* screaming, that's the whole plan." She pulled out two red noses from her jacket pocket and slapped one against his chest. "The presents get stacked on the gift table near the bouncy castle. We swoop in during the piñata frenzy, load everything into the burrito cart, and roll out the back." Bill picked up the nose, examining it like evidence at a crime scene. "And the burrito cart is...?" "Our extraction vehicle. Nobody questions a man pushing free burritos. It's the perfect disguise." He had to admit, grudgingly, that it had a certain demented logic. He'd worked with Molly for six years, ever since the Great Dollhouse Caper of '19, and her plans always teetered on the edge between genius and arrest. So far, they'd stayed on the right side. "What about the parents?" he asked. "Suburban dads. 
They'll be huddled around the grill, talking about lawnmowers and their fantasy football leagues. Completely useless in a crisis." Molly straightened up, eyes gleaming. "The real threat is Aunt Carol. She's got eyes in the back of her head and she counts the gifts." "She counts them?" "Twice. I watched her do it." Bill set down the nose with a sigh. "Molly, these are *children's* presents. We're talking about, what, action figures? Coloring books? A bike if we're lucky?" "It's not about the loot, Bill." She placed a hand over her heart, suddenly solemn. "It's about the *principle*. The thrill. The artistry of the heist itself." "It's about the bike, isn't it?" She held his gaze for a long moment. "It's a *very* nice bike. Streamers and everything." Bill stared at the map, the ridiculous unicorn sticker, the helium tank, and the two clown noses sitting between them like a dare. Against every instinct he possessed, he felt himself starting to grin. "Fine," he said, reaching for his nose. "But I'm not eating any of the burritos."
created_at: "2026-05-31T11:39:22.881Z" (5/31/2026, 11:39:22 AM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 963
moderation_latency: (null)
generation_time: 19695
tokens_prompt: 77
tokens_completion: 794
native_tokens_prompt: 93
native_tokens_completion: 1079
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.02744
router: (null)
provider_responses: 0
endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id: "msg_01QQXFkvGVDxUn2CHScpwfKV"
is_byok: false
latency: 963
model_permaslug: "anthropic/claude-4.8-opus-20260528"
provider_name: "Anthropic"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780227562-WwBUbev90jTr4CRMkmvo"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1780227562-UYTJgvssTn6p2lKyWu7d"
upstream_id: "msg_01QQXFkvGVDxUn2CHScpwfKV"
total_cost: 0.02744
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
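The metadata reports token counts, a generation time, and a total cost, but not the per-token rates. The sketch below checks the cost against rates of $5 per million input tokens and $25 per million output tokens. Those rates are an assumption inferred from the totals, not stated anywhere on this page, and treating generation_time as milliseconds is likewise inferred from the 19.7s Duration shown in the run details.

```python
# ASSUMED per-token rates in USD; inferred from the totals, not stated on the page.
ASSUMED_INPUT_RATE = 5.00 / 1_000_000
ASSUMED_OUTPUT_RATE = 25.00 / 1_000_000

# Values copied from the run metadata.
native_tokens_prompt = 93
native_tokens_completion = 1079
generation_time_ms = 19695  # assumed to be milliseconds
reported_total_cost = 0.02744

estimated_cost = (
    native_tokens_prompt * ASSUMED_INPUT_RATE
    + native_tokens_completion * ASSUMED_OUTPUT_RATE
)
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"estimated cost: ${estimated_cost:.6f} (reported: ${reported_total_cost:.6f})")
print(f"throughput: {tokens_per_second:.1f} native completion tokens/s")
```

Under these assumed rates the estimate matches the reported total_cost, which suggests billing is based on the native token counts (93 and 1079) rather than the normalized tokens_prompt and tokens_completion values (77 and 794).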
Evaluation details
Result | Evaluator                    | Details                          | Meta Data
99.98% | Matches word count           | n/a                              | neededClean: false; words: 502
0.00%  | Dialogue to Total Word Ratio | Ratio: 55.95%, Deviation: 25.95% | neededClean: false; wordsTotal: 504; wordsDialogue: 282
49.9920%
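The Ratio and Deviation values reported for the dialogue evaluator can be reproduced from its wordsDialogue and wordsTotal counts and the 30% target in the scenario. A minimal Python sketch follows; the variable names are illustrative, and how the evaluator turns the deviation into a 0.00% score is not stated on this page, so that step is left out.

```python
TARGET_RATIO = 0.30  # from the scenario: 30% dialogue

# Counts copied from the evaluator's meta data.
words_total = 504
words_dialogue = 282

ratio = words_dialogue / words_total
deviation = abs(ratio - TARGET_RATIO)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

With these counts the response contains nearly twice the requested share of dialogue, which is why this evaluator scores it at the bottom of its range even though the word count is almost exactly on target.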