NC Bench
Run: 2026-05-29-01KSSJJPE4VAWG76391F0FEQBK
Run Details
Model: Claude Opus 4.8 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 20.3s
Cost: $0.028590
Provider: Anthropic
Evaluation summary
Score | Evaluator
98.71% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.3562% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint across the table, smoothing the wrinkles with theatrical flair. It was a child's drawing of a house, complete with a smiling sun and a dog the size of a minivan. "This," she announced, "is the Hendersons' residence. Eight-year-old's birthday party. Saturday at two." Bill squinted at the paper. "Why is there a purple horse in the living room?" "That's a structural pillar." "It has a horn." Molly snatched the drawing away. "Focus, Bill. We're talking about the heist of the century." "The century," Bill repeated flatly. "We're stealing presents from a child." "Not a child. Children. Plural. Do you know how many gifts accumulate at these things? Mountains of them. Untold riches in the form of action figures and craft kits." Bill leaned back, crossing his arms. "Last week you wanted to rob a bank." "And how did that go?" "We got chased out by a teller with a stapler." "Exactly," Molly said, pointing triumphantly. "Banks have security. You know what children's parties have? A clown. One clown, possibly intoxicated, definitely outnumbered." Bill considered this. "The clown could be dangerous." "He makes balloon animals, Bill. The worst he can do is fashion a weiner dog at us." "What's the plan, then?" Molly grinned and produced two paper party hats from her jacket. "We blend in. We arrive as guests. Nobody suspects the adults who brought a gift." "We're bringing a gift?" "An empty box, you fool. The Trojan present. We mingle, we wait for the cake, and when everyone's distracted by the candles—" She snapped her fingers. "We load the loot into the getaway minivan." Bill picked up one of the hats, examining the elastic strap with suspicion. "And how do we explain two grown men leaving with armfuls of presents?" "We say we're the entertainment. Magicians. The disappearing gifts are part of the act." "That's genuinely insane." "It's foolproof." Bill sighed and stretched the elastic under his chin. 
The hat sat absurdly atop his head, a tiny pink cone of poor decisions. "Fine. But I'm not doing a magic trick." "You'll do a magic trick if the situation demands it." "I don't know any magic tricks." "Pull a coin from behind a kid's ear. It buys us thirty seconds." Bill stood, defeated, and adjusted his hat in the mirror. "We're going to prison over a Nintendo and some Lego." "We're going to prison in *style*," Molly corrected, slapping her own hat on. "Now, what's our exit strategy if things go sideways?" "There's a backdoor through the kitchen, past the snack table." "And if that's blocked?" Bill paused. "We grab a kid as a hostage?" Molly stared at him, horrified. "We are thieves, Bill, not monsters. We grab the piñata. Throw it. Candy goes everywhere. In the chaos, we vanish." "That," Bill admitted, "is actually not terrible." "I have my moments." She rolled up the blueprint and tucked it away. "Saturday. Two o'clock. Bring the empty box and your dignity." "I left my dignity at the bank." "Then just bring the box."
created_at
"2026-05-29T09:55:50.351Z"
5/29/2026, 9:55:50 AM
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1194
moderation_latency
(null)
generation_time
20235
tokens_prompt
77
tokens_completion
808
native_tokens_prompt
93
native_tokens_completion
1125
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.02859
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_01J38FpCXtZPi9ckzseTaxVs"
is_byok
false
latency
1194
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780048550-2wP4iUNbguQSvY3jdIOC"
session_id
(null)
api_type
"completions"
id
"gen-1780048550-1TXOdQm7m128fF1GP47E"
upstream_id
"msg_01J38FpCXtZPi9ckzseTaxVs"
total_cost
0.02859
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
Evaluation details
Result | Evaluator | Details | Meta Data
98.71% | Matches word count | n/a | neededClean: false, words: 506
0.00% | Dialogue to Total Word Ratio | Ratio: 70.87%, Deviation: 20.87% | neededClean: false, wordsTotal: 508, wordsDialogue: 360
49.3562% | Overall
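The reported ratio and deviation for the dialogue evaluator can be reproduced from the two word counts in its metadata. The sketch below is not the benchmark's own evaluator: the function name `dialogue_stats`, the rule "dialogue is anything between straight double quotes", and whitespace-based word splitting are all assumptions made for illustration. Only the counts 360 and 508 and the 50% target come from this run.

```python
import re


def dialogue_stats(text: str, target: float = 0.5) -> dict:
    """Hypothetical helper: compare words inside straight double quotes to all words.

    Assumes whitespace-separated words and balanced, non-nested quotes.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return {
        "wordsTotal": total_words,
        "wordsDialogue": dialogue_words,
        "ratio": ratio,
        "deviation": abs(ratio - target),
    }


# Counts reported for this run: wordsDialogue = 360, wordsTotal = 508.
reported_ratio = 360 / 508
reported_deviation = abs(reported_ratio - 0.5)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {reported_deviation:.2%}")
# prints "Ratio: 70.87%, Deviation: 20.87%"
```

How the 20.87-point deviation maps to the 0.00% score is not stated on this page, so the sketch stops at the deviation. The overall figure of 49.3562% is close to the simple mean of the two evaluator scores (98.71% and 0.00%), which suggests, but does not confirm, that it is an unweighted average of unrounded values.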