NC Bench
Run: 2026-05-29-01KSSJHA7N0D26WA9HTSH621HP
Run Details
Model:
Claude Opus 4.8 (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
21.0s
Cost:
$0.028765
Provider:
Anthropic
Evaluation summary
Score      Evaluator
99.98%     Matches word count
0.00%      Dialogue to Total Word Ratio
49.9921%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprints across the table, smoothing out the crayon drawings with a frown. "This is the layout?" Bill asked, leaning in. "Why does the living room have a unicorn in it?" "That's a structural pillar. Tommy drew it for us." "Tommy? Our inside man is seven years old?" "He's our inside man because he's seven," Molly said. "Nobody suspects a kid. He'll unlock the back door during the piñata." Bill rubbed his temples. "Walk me through it. What are we dealing with?" "Forty children. One bounce house. Approximately sixty wrapped presents." "Sixty?" Bill whistled. "What kind of kid gets sixty presents?" "A spoiled one named Madison whose father owns a car dealership," Molly said. "We hit the pile during cake time. Everyone's distracted by the candles." "And security?" "One clown. Goes by Mr. Wiggles." Bill paused. "Mr. Wiggles is the only thing standing between us and sixty presents?" "Don't underestimate him. He does balloon animals and karate. I've seen the flyer." "A karate clown." Bill sat back. "This is the worst job we've ever planned." "It's foolproof," Molly insisted. "We go in dressed as parents. Nobody questions a parent. We mingle, we wait, we strike." "I don't have kids. I don't know how to be a parent." "Just complain about the school district and look exhausted." Bill nodded slowly. "I can do tired. What's the getaway?" "Minivan. I rented one. There's a juice box in the cupholder for authenticity." "You thought of everything." "That's why I'm the brains," Molly said. "Now, the presents go out the back, into the van, and we're gone before they finish singing Happy Birthday." "What if they sing fast?" "Nobody sings Happy Birthday fast, Bill. It's the slowest song in human history. We've got ninety seconds, minimum." Bill scribbled something on a napkin. "And if Mr. Wiggles makes a move?" "Then you handle Mr. Wiggles." "Why me?" "Because you're the muscle, and I'm allergic to clowns." "That's not a real allergy." 
"It's emotional, and it counts." Molly stabbed a finger at the blueprint. "Tommy says the gift table is here, next to the unicorn pillar. We load fast, no hesitating." "What about the parents?" "They'll be on their phones. Trust me, no adult at a kid's party is paying attention to anything." Bill grinned. "You know, this might actually work." "Of course it'll work. The hard part is the cake." "What about the cake?" "We're stealing that too." Bill blinked. "We're stealing a child's birthday cake?" "It's a four-tier custom cake shaped like a castle. Worth three hundred dollars. We'd be idiots to leave it." "Molly, there's a line, and I think a birthday cake is on the other side of it." "There are no lines in this business, Bill. Only opportunities." She rolled up the blueprints. "Now finish your coffee. We've got a bounce house to infiltrate." Bill stared at his cup. "I never thought my life would come to this." "Nobody does," Molly said. "That's what makes us great."
created_at
"2026-05-29T09:55:05.089Z"
5/29/2026, 9:55:05 AM
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
2329
moderation_latency
(null)
generation_time
20991
tokens_prompt
77
tokens_completion
808
native_tokens_prompt
93
native_tokens_completion
1132
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.028765
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_01UpyQhsWbhz46zrRzVwyn3m"
is_byok
false
latency
2191
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780048505-AtOonfXtgvf9079mVek7"
session_id
(null)
api_type
"completions"
id
"gen-1780048505-8ZEG2hBcqdRy1qSKdVoz"
upstream_id
"msg_01UpyQhsWbhz46zrRzVwyn3m"
total_cost
0.028765
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
Evaluation details
Result: 99.98%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean = false, words = 498

Result: 0.00%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 85.97%, Deviation: 15.97%
Meta Data: neededClean = false, wordsTotal = 499, wordsDialogue = 429
49.9921%
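
The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator can be reproduced from the metadata above. The following is a minimal sketch, not the benchmark's actual implementation: it assumes that dialogue means text between straight double quotes, that words are split on whitespace, and that the deviation is the absolute difference from the 70% target in the scenario. The function names are hypothetical. How the deviation is mapped to the 0.00% score, and how the word count score of 99.98% is derived, is not stated in the report and is not modeled here.

```python
import re

# Target taken from the scenario: "Write 500 words with 70% dialogue".
TARGET_RATIO = 0.70


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed tokenization)."""
    return len(text.split())


def dialogue_word_count(text: str) -> int:
    """Count words inside straight double quotes (assumed definition of dialogue)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_RATIO) -> tuple[float, float]:
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Values reported in the evaluation metadata: wordsDialogue = 429, wordsTotal = 499.
ratio, deviation = ratio_and_deviation(429, 499)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 85.97%, Deviation: 15.97%"
```

The response overshoots the requested share of dialogue by roughly 16 percentage points, which is consistent with the evaluator's 0.00% score. The overall figure of 49.9921% is in turn consistent with an unweighted mean of the two evaluator scores, although the report does not state the aggregation rule.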