NC Bench
Run: 2026-05-31-01KSYXQRGB7MJQ455797Z16FRM
Run Details
Model: Claude Opus 4.8 (Reasoning, High)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 19.1s
Cost: $0.028340
Provider: Anthropic
Evaluation summary
Score | Evaluator | Details Link
99.38% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
49.6885% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprints across the kitchen table, smoothing the wrinkled paper with both hands. Bill leaned over her shoulder, chewing the end of a licorice rope. "So this is the target," she said, tapping a hand-drawn square. "The Hendersons' backyard. Tommy Henderson turns seven on Saturday." "Seven," Bill repeated, impressed. "That's a big one. Lots of presents." "Exactly. My intel says he's got forty kids coming. Forty kids means forty gifts, minimum." Bill straightened up, a strand of licorice dangling from his lips. "What's the security situation?" "Brutal," Molly admitted. "Two parents, one grandmother, and a clown." "A clown?" Bill's face went pale. "Molly, you didn't say anything about a clown." "Relax. It's a birthday clown, not a hitman." "Same thing." He shuddered. "What's our entry point?" Molly traced a route along the diagram. "We come in through the side gate during the magic show. Everyone'll be distracted watching the guy pull rabbits out of his hat." "And the presents?" "Piled on the gift table here." She circled it twice. "We grab them, load up the wagon, and we're gone before they finish singing Happy Birthday." Bill frowned. "We're using a wagon?" "A little red one. Blends in. Nobody suspects a wagon at a kid's party." "That's actually genius," he conceded, finally swallowing the licorice. "But what about the cake? Can we grab the cake too?" "No, Bill. We're professionals. We don't take cake." "A professional would take the cake." Molly pinched the bridge of her nose. "We are stealing children's birthday presents. Let's not pretend we have standards." "Fair point." Bill picked up a juice box from the counter and stabbed the straw in. "What if a kid sees us?" "Then we improvise. You're good with kids." "I am not good with kids. The last kid I met called me a 'weird old man' and kicked me in the shin." "That was your nephew." "Doesn't make it hurt less," Bill muttered, sipping his juice. "What's our exit strategy if it all goes sideways?" 
Molly pulled out a second sheet of paper. "We've got three exits. Front gate, back fence, and—" "And?" "The bounce house." Bill set his juice down slowly. "You want us to escape through a bounce house?" "Through it, not in it. There's a gap behind it that leads straight to the alley." She crossed her arms triumphantly. "Foolproof." "Nothing involving a bounce house is foolproof." He paced the small kitchen. "And what's our cut? How do we even fence a bunch of action figures and friendship bracelets?" "Online marketplace. Lightly used toys go for a fortune." "Lightly used?" Bill laughed. "They won't even be opened." "Even better. Mint condition." Molly rolled up the blueprints with a satisfying snap. "So are you in or what?" Bill stared at the rolled-up plans, then at the half-finished juice box, then back at Molly. "On one condition," he said finally. "What?" "We take the cake." Molly sighed. "We take the cake." "Then I'm in."
created_at: "2026-05-31T11:47:02.806Z" (5/31/2026, 11:47:02 AM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1140
moderation_latency: (null)
generation_time: 18262
tokens_prompt: 77
tokens_completion: 800
native_tokens_prompt: 93
native_tokens_completion: 1115
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.02834
router: (null)
provider_responses: 0
endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id: "msg_01BHvyVh4GpQKbbGxdEquFMK"
is_byok: false
latency: 1140
model_permaslug: "anthropic/claude-4.8-opus-20260528"
provider_name: "Anthropic"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780228023-jcZRBaqn916OFtuQbtIu"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1780228023-PsUGmKT1IDap9zWoMG9e"
upstream_id: "msg_01BHvyVh4GpQKbbGxdEquFMK"
total_cost: 0.02834
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
99.38% | Matches word count | n/a | neededClean: false, words: 495
0.00% | Dialogue to Total Word Ratio | Ratio: 68.07%, Deviation: 18.07% | neededClean: false, wordsTotal: 498, wordsDialogue: 339
49.6885% | Overall | |
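The figures in the evaluation details can be cross-checked with a few lines of arithmetic. The sketch below recomputes the dialogue ratio and deviation from the reported word counts. It then assumes, since the page does not state it, that the overall score is the unweighted mean of the two evaluator scores. The `dialogue_ratio` helper is hypothetical: it counts words inside straight double quotes and is not the benchmark's actual evaluator, whose tokenization is not shown here.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Hypothetical counter: words inside straight double quotes vs. all words."""
    total = len(text.split())
    spoken = sum(len(span.split()) for span in re.findall(r'"([^"]*)"', text))
    return spoken, total, (spoken / total if total else 0.0)


# Counts reported in the evaluation details.
WORDS_TOTAL = 498
WORDS_DIALOGUE = 339
TARGET_RATIO = 0.50  # the scenario asks for 50% dialogue

ratio = WORDS_DIALOGUE / WORDS_TOTAL
deviation = abs(ratio - TARGET_RATIO)

# Assumption: the overall score is the plain average of the two evaluator scores.
WORD_COUNT_SCORE = 99.38
DIALOGUE_SCORE = 0.00
overall = (WORD_COUNT_SCORE + DIALOGUE_SCORE) / 2

print(f"ratio={ratio:.2%} deviation={deviation:.2%} overall={overall:.2f}%")
```

Under that averaging assumption the result is 49.69%, which agrees with the reported 49.6885% to within the rounding of the displayed word-count score. The recomputed ratio and deviation match the reported 68.07% and 18.07%.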