NC Bench
Run: 2026-05-29-01KSSJZZQEXF2CY03D58MV5E3J
Run Details
Model: Claude Opus 4.8 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 22.3s
Cost: $0.028240
Provider: Anthropic
Evaluation summary
Score | Evaluator
99.38% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.6885% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprints across the hood of their rusted sedan, weighing down the corners with empty energy drink cans. The afternoon sun glinted off her oversized sunglasses, the kind that screamed "suspicious person trying not to look suspicious." "Right," she said, tapping the paper with a chewed pen. "The Hendersons' backyard. Birthday party for little Timmy. Seven years old." Bill leaned in, squinting. "Why are we stealing from a seven-year-old again?" "Because seven-year-olds get the good stuff now. Tablets. Gaming consoles. Premium LEGO sets that cost more than my car payment." She circled an area on the map. "We're talking a high-value target with low security." Bill scratched his stubble. "The security being...?" "A clown named Mr. Giggleworth and approximately fifteen sugar-crazed children." Molly straightened up. "We slip in during the piñata distraction, grab the present table, and we're gone before the cake is cut." A pigeon landed on the car hood, pecking at one of the cans. Bill shooed it away halfheartedly. "I've got concerns," he announced. "You always have concerns." "Last time my 'concern' was that the bank had a vault, and you said—" "That was different." "You said vaults were 'just fancy closets,' Molly." She waved a dismissive hand. "Water under the bridge. This is foolproof. I've even got us costumes." She popped the trunk to reveal two enormous, fuzzy character outfits. One was a purple dinosaur. The other was something that might have been a duck, or possibly a deformed banana. Bill's face fell. "No." "You're the duck." "That's not a duck. That's a war crime in feather form." "It's adorable," Molly insisted, pulling the dinosaur head over her own. Her voice came out muffled and slightly unhinged. "Children love mascots. We walk in, nobody questions a six-foot purple lizard and his pal Quackers." Bill held the costume at arm's length like it was radioactive. "And how exactly do we carry presents while dressed as these abominations?" 
"The dinosaur tail has a hidden compartment. Genius, right?" She wiggled, demonstrating. The tail flopped sadly against her leg. "We stuff everything in there, waddle out the back gate, and disappear into legend." For a long moment, Bill simply stared at her. The pigeon returned, judging them both. "We're going to prison," he said finally. "Dressed as woodland creatures." "A dinosaur isn't woodland—" "For STEALING from a CHILD." Molly lifted the dinosaur head, her expression suddenly serious. "Bill. Sweetheart. Do you want to keep eating gas station hot dogs for dinner, or do you want to retire to that beach house?" He sighed, the fight draining out of him. He'd heard the beach house speech approximately forty times. There was never going to be a beach house. There was only ever the next ridiculous scheme. "Fine," he muttered, yanking the duck costume from her hands. "But if a kid hugs me, I'm walking." "That's the spirit." Molly slammed the trunk shut, grinning beneath her plastic dinosaur grin. "Now let's go ruin a birthday."
created_at
"2026-05-29T10:03:05.849Z"
5/29/2026, 10:03:05 AM
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
2050
moderation_latency
(null)
generation_time
22238
tokens_prompt
77
tokens_completion
807
native_tokens_prompt
93
native_tokens_completion
1111
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.02824
router
(null)
provider_responses
0
endpoint_id
"dfc0e5bd-d703-4fe2-a7bb-655eb95d5441"
id
"msg_019QUjEA8jVopBmR7fjvnLdo"
is_byok
false
latency
2050
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780048985-QbSgWHSK8PqdqITnuqwE"
session_id
(null)
api_type
"completions"
id
"gen-1780048985-fhcKye7y1ab3aPGDFxP0"
upstream_id
"msg_019QUjEA8jVopBmR7fjvnLdo"
total_cost
0.02824
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
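The metadata above is enough to sanity-check the headline figures. The sketch below derives output throughput from native_tokens_completion and generation_time, and checks the listed cost against the native token counts. It assumes generation_time is in milliseconds (which matches the 22.3s duration). The two per-million-token rates are assumptions, not values stated on this page; they were chosen only because they reproduce the listed cost.

```python
# Values copied from the metadata fields above.
generation_time_ms = 22238
native_tokens_prompt = 93
native_tokens_completion = 1111
reported_cost_usd = 0.02824

# Assumption: generation_time is reported in milliseconds.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Assumption: hypothetical rates in USD per million tokens (not stated on the page).
ASSUMED_INPUT_USD_PER_MTOK = 5
ASSUMED_OUTPUT_USD_PER_MTOK = 25

# Work in integer micro-dollars to avoid floating-point drift.
cost_micro_usd = (
    native_tokens_prompt * ASSUMED_INPUT_USD_PER_MTOK
    + native_tokens_completion * ASSUMED_OUTPUT_USD_PER_MTOK
)
estimated_cost_usd = cost_micro_usd / 1_000_000

print(f"Throughput: {tokens_per_second:.2f} tokens/s")
print(f"Estimated cost: ${estimated_cost_usd:.6f} (reported: ${reported_cost_usd:.6f})")
```

Under these assumed rates the estimate agrees with the usage and total_cost fields, which suggests billing follows the native token counts rather than tokens_prompt and tokens_completion.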
Evaluation details
Result | Evaluator | Details | Meta Data
99.38% | Matches word count | n/a | neededClean: false, words: 495
0.00% | Dialogue to Total Word Ratio | Ratio: 52.79%, Deviation: 22.79% | neededClean: false, wordsTotal: 502, wordsDialogue: 265
49.6885% | Overall
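The ratio and deviation figures can be reproduced from the reported word counts. The sketch below is illustrative only. The function dialogue_word_counts is a hypothetical counting rule (words inside double quotes), not the benchmark's actual evaluator. Treating the overall score as an unweighted mean of the two evaluator scores is also an assumption. The page does not say how a 22.79% deviation maps to a 0.00% score, so that step is left out.

```python
import re

TARGET_DIALOGUE_RATIO = 0.30  # "30% words of dialogue" in the prompt


def dialogue_word_counts(text):
    """Return (dialogue_words, total_words) using a hypothetical rule:
    any whitespace-separated word inside double quotes counts as dialogue."""
    total = len(text.split())
    quoted_spans = re.findall(r'["“]([^"”]*)["”]', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return dialogue, total


def ratio_and_deviation(dialogue_words, total_words, target=TARGET_DIALOGUE_RATIO):
    """Dialogue share of all words, and its absolute distance from the target."""
    ratio = dialogue_words / total_words
    return ratio, abs(ratio - target)


# Reported counts: wordsDialogue = 265, wordsTotal = 502.
ratio, deviation = ratio_and_deviation(265, 502)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Assumption: overall score is the unweighted mean of the evaluator scores.
displayed_scores = [99.38, 0.00]
overall = sum(displayed_scores) / len(displayed_scores)
print(f"Overall: {overall:.2f}%")
```

Averaging the displayed scores gives 49.69%, which agrees with the reported 49.6885% once the rounding of the word-count score is taken into account. A quote-based counter like this one would also count quoted phrases in narration as dialogue, which may push the measured ratio higher.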