NC Bench
Run: 2026-05-31-01KSYXDQSQGHMHZW7SEVK940T8
Run Details
Model: Claude Opus 4.8 (Reasoning, High)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 20.1s
Cost: $0.028990
Provider: Anthropic
Evaluation summary
Score | Evaluator
99.74% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.8722% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprints across the kitchen table, smoothing out the wrinkles with the reverence of a surgeon prepping for an operation. Bill leaned over her shoulder, chewing a toothpick. "Okay," Molly said, tapping the paper. "The Hendersons' backyard. Twelve children, two clowns, and a bouncy castle. Saturday at noon." "A bouncy castle," Bill repeated. "We didn't account for a bouncy castle." "We don't need to account for the bouncy castle, Bill." "What if it's load-bearing?" Molly looked up at him slowly. "It's an inflatable. It bears the load of toddlers." "Toddlers are heavier than you'd think," Bill said. "I read an article." "You don't read articles." "I saw a headline." Molly pinched the bridge of her nose. "Can we focus? The presents will be stacked on the gift table here, by the patio. We go in dressed as entertainers, we grab the haul during cake time, and we're gone before anyone sings the second verse of Happy Birthday." Bill straightened up. "There's only one verse." "There's the 'how old are you now' part." "That's not a verse, that's a bonus round." "Bill." "I'm just saying, if we're timing this to a song, we should know how long the song is." Molly slid a costume across the table at him. It was a magician's cape, slightly moth-eaten, with sequins missing in the shape of what was once a rabbit. Bill held it up and frowned. "Why am I the magician?" "Because you have the hands for it." "What does that even mean?" "It means you have shifty, untrustworthy hands," Molly said. "Perfect for sleight of hand and grabbing wrapped electronics." Bill considered this, nodding. "That's actually a compliment." "Don't get emotional." He draped the cape over his shoulders and posed. "And what are you going to be?" "The face painter. Nobody suspects the face painter. I'll be stationed right next to the gifts the entire time, and I'll have a perfectly good reason to be holding a bag full of supplies." "Supplies being—" "The presents. Obviously." Bill paused. 
"Won't the kids want their faces painted?" "Then I'll paint their faces, Bill. I'm not a monster." "You're stealing a six-year-old's birthday presents." "I'm stealing twelve six-year-olds' birthday presents," Molly corrected. "There's a difference. It's called scale." Bill sat down across from her and lowered his voice, suddenly serious. "Molly. Have you thought about what we're doing here? These are children. Innocent children. With dreams." Molly stared at him. "Are you having a conscience?" "A small one. Like an appendix. It flares up occasionally." "Bill, last week you stole a tip jar from a hospice." "And I felt terrible about it for almost an hour." He gestured vaguely. "This feels different." Molly folded the blueprints with a sigh. "Fine. We donate ten percent." "Twenty." "Fifteen, and you wear the rabbit ears." Bill extended his shifty, untrustworthy hand across the table. "Deal." They shook on it, sealing the worst-planned, most morally bankrupt heist either of them had ever attempted—right there over a half-eaten plate of toast.
created_at: "2026-05-31T11:41:34.408Z" (5/31/2026, 11:41:34 AM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1052
moderation_latency: (null)
generation_time: 20020
tokens_prompt: 77
tokens_completion: 832
native_tokens_prompt: 93
native_tokens_completion: 1141
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.02899
router: (null)
provider_responses:
  0:
    endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
    id: "msg_01PXRCUTBprYVVgUXaZPsc3T"
    is_byok: false
    latency: 1052
    model_permaslug: "anthropic/claude-4.8-opus-20260528"
    provider_name: "Anthropic"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780227694-lGByatOxjnTIa3PNWh9F"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1780227694-JkL0wRdej24wzt0MIjvM"
upstream_id: "msg_01PXRCUTBprYVVgUXaZPsc3T"
total_cost: 0.02899
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
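The metadata above is enough for a rough sanity check of throughput and cost. The sketch below assumes that generation_time is in milliseconds and uses hypothetical per-token rates ($5 per million input tokens, $25 per million output tokens). Those rates are not stated anywhere on this page. They are used only because they happen to reproduce the reported total_cost from the native token counts.

```python
# Values copied from the run metadata.
NATIVE_TOKENS_PROMPT = 93
NATIVE_TOKENS_COMPLETION = 1141
GENERATION_TIME_MS = 20020  # assumption: the field is in milliseconds
REPORTED_TOTAL_COST = 0.02899

# Hypothetical rates in USD per million tokens (NOT taken from this page).
ASSUMED_INPUT_RATE = 5.0
ASSUMED_OUTPUT_RATE = 25.0


def tokens_per_second(completion_tokens: int, generation_ms: int) -> float:
    """Output throughput over the generation window."""
    return completion_tokens / (generation_ms / 1000)


def estimated_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Cost in USD under the assumed per-million-token rates."""
    return (prompt_tokens * ASSUMED_INPUT_RATE
            + completion_tokens * ASSUMED_OUTPUT_RATE) / 1_000_000


throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
cost = estimated_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION)

print(f"throughput: {throughput:.1f} tokens/s")
print(f"estimated cost: ${cost:.6f} (reported: ${REPORTED_TOTAL_COST:.6f})")
```

If the provider's actual rates differ, only the two ASSUMED_* constants need to change.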
Evaluation details
Result | Evaluator | Details | Meta Data
99.74% | Matches word count | n/a | neededClean: false, words: 504
0.00% | Dialogue to Total Word Ratio | Ratio: 66.60%, Deviation: 16.60% | neededClean: false, wordsTotal: 512, wordsDialogue: 341
49.8722% | Overall
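The ratio and deviation reported above follow directly from wordsDialogue and wordsTotal. The sketch below reproduces that arithmetic and adds a hypothetical count_dialogue_words helper that treats any text between straight double quotes as dialogue. The benchmark's real counting rules and scoring thresholds are not shown on this page, so the helper is an illustration only. Treating Overall as the unweighted mean of the two scores is also an assumption, though it is consistent with the displayed figures.

```python
import re

TARGET_RATIO = 0.50  # the scenario asks for 50% dialogue


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words).

    Hypothetical counter: dialogue is whatever sits between straight
    double quotes; words are whitespace-separated tokens.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    total = len(text.split())
    return dialogue, total


def ratio_and_deviation(words_dialogue: int, words_total: int) -> tuple[float, float]:
    """Dialogue share of all words and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - TARGET_RATIO)


# Figures from the evaluation details.
ratio, deviation = ratio_and_deviation(341, 512)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Assumption: Overall is the plain mean of the two evaluator scores.
overall = (99.74 + 0.00) / 2
print(f"Overall (from rounded scores): {overall:.2f}%")

# Small usage example of the hypothetical counter.
sample = '"Okay," Molly said. "Can we focus?"'
print(count_dialogue_words(sample))
```

The mean of the rounded scores comes out at 49.87%, which matches the displayed 49.8722% to two decimal places. The extra digits presumably come from the unrounded word-count score.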