NC Bench
Run: 2026-05-29-01KSSJM0VCPHYTYPN1QQ1SFM0Y
Run Details
Model: Claude Opus 4.8 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 20.2s
Cost: $0.027565
Provider: Anthropic
Evaluation summary
Score | Evaluator
99.74% | Matches word count
3.72% | Dialogue to Total Word Ratio
51.7338% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprints across the kitchen table, smoothing out the wrinkles with theatrical precision. "Behold," she said. "The Hendersons' backyard. Ground zero for Operation Birthday Bonanza."
Bill leaned in, squinting. "Molly, this is a drawing of a square with the word 'GRASS' written in it."
"It's a tactical overview."
"You drew a sun in the corner. With sunglasses."
"The sun is happy because we're about to be rich." She tapped the paper. "Now focus. The party starts at two. Twenty kids, minimum. You know what twenty kids means?"
"Trauma?"
"Presents, Bill. Mountains of presents. Action figures. Building blocks. Those little ponies with the rainbow hair that sell for a fortune online."
Bill crossed his arms. "I'm not stealing toys from a child."
"We're not stealing from a child. We're stealing from twenty children. It's a volume business."
"That's somehow worse."
"It's economies of scale." Molly pulled out a clown costume from a duffel bag and tossed it at him. "Put this on. You're the entertainment."
Bill held up the costume, horrified. "Why am I the clown?"
"Because I'm the magician, and the magician needs an assistant who can carry a sack."
"I have a degree in mechanical engineering."
"And yet here we are." She snapped the suspenders against his chest. "While I dazzle the parents with card tricks, you'll be loading the loot into a balloon-animal bag. Nobody suspects a clown holding balloons."
"Everybody suspects a clown. Clowns are inherently suspicious."
"Which is the genius of it. They'll be too scared to look closely." Molly slipped a pack of cards into her sleeve. "We're in, we're out, we're gone before the cake is cut."
Bill sighed, stepping into the costume. "And how exactly do we get past the parents?"
"Simple. I distract them with my world-famous disappearing dove."
"You don't have a dove."
"That's the trick. It already disappeared." She grinned. "Now, the presents. Where do people put presents at a kid's party?"
"On a table. Usually by the door."
"Exactly. Right next to the exit. It's like they're begging us." She zipped up the bag. "We waltz in, you honk a few times, I pull a quarter out of a kid's ear, and we leave with three hundred dollars of merchandise."
Bill adjusted his oversized shoes, nearly tipping over. "What if a kid recognizes their own gift?"
"Then you give them a balloon shaped like a dog."
"I don't know how to make a balloon dog."
"Make a balloon snake. It's just a balloon."
Bill stared at his reflection in the toaster, defeated. "I never thought my life would come to this. A grown man in a clown wig, robbing toddlers."
Molly clapped him on the shoulder. "That's the spirit. Now remember the code word if things go sideways."
"What's the code word?"
"'Pizza.'"
"That's the same word we used last time, and you ordered an actual pizza."
"And it was delicious." She slung the bag over her shoulder. "Let's go ruin some childhoods."
created_at: "2026-05-29T09:56:33.783Z" (5/29/2026, 9:56:33 AM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1059
moderation_latency: (null)
generation_time: 20134
tokens_prompt: 77
tokens_completion: 786
native_tokens_prompt: 93
native_tokens_completion: 1084
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.027565
router: (null)
provider_responses: 0
endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id: "msg_01Q8arpXHWQcNSQpn4PF5t5G"
is_byok: false
latency: 1059
model_permaslug: "anthropic/claude-4.8-opus-20260528"
provider_name: "Anthropic"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780048593-gE6ruHu56gJ5Bu4gITNT"
session_id: (null)
api_type: "completions"
id: "gen-1780048593-JqMsqYnnUMaQ0KO9Y2b5"
upstream_id: "msg_01Q8arpXHWQcNSQpn4PF5t5G"
total_cost: 0.027565
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
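The reported total_cost can be sanity-checked against the native token counts in the metadata. The sketch below is only a consistency check: the per-million-token rates are assumptions that do not appear on this page (they were chosen because they reproduce the reported figure), and treating generation_time as milliseconds is likewise an assumption.

```python
# Values copied from the run metadata.
NATIVE_TOKENS_PROMPT = 93
NATIVE_TOKENS_COMPLETION = 1084
REPORTED_TOTAL_COST = 0.027565
GENERATION_TIME = 20134  # ASSUMPTION: milliseconds

# ASSUMPTION: USD per million tokens; these rates are not stated on this page.
ASSUMED_INPUT_USD_PER_MTOK = 5.00
ASSUMED_OUTPUT_USD_PER_MTOK = 25.00


def estimate_cost(prompt_tokens, completion_tokens, input_rate, output_rate):
    """Return the estimated cost in USD for per-million-token rates."""
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


def tokens_per_second(completion_tokens, generation_time_ms):
    """Return completion throughput, assuming the time is in milliseconds."""
    return completion_tokens / (generation_time_ms / 1000)


estimated = estimate_cost(
    NATIVE_TOKENS_PROMPT,
    NATIVE_TOKENS_COMPLETION,
    ASSUMED_INPUT_USD_PER_MTOK,
    ASSUMED_OUTPUT_USD_PER_MTOK,
)
throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME)

print(f"estimated cost: ${estimated:.6f} (reported: ${REPORTED_TOTAL_COST:.6f})")
print(f"throughput: {throughput:.1f} completion tokens/s")
```

Under these assumed rates the estimate agrees with the reported total_cost, which suggests billing was based on the native token counts (93 and 1084) rather than the normalized tokens_prompt and tokens_completion values.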
Evaluation details
Result | Evaluator | Details | Meta Data
99.74% | Matches word count | n/a | neededClean: false, words: 496
3.72% | Dialogue to Total Word Ratio | Ratio: 78.11%, Deviation: 8.11% | neededClean: false, wordsTotal: 498, wordsDialogue: 389
51.7338% | Overall
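The ratio and deviation reported for the dialogue evaluator follow from its word counts. The sketch below reproduces that arithmetic. Two parts are inferences, not documented behavior: the 70% target is taken from the scenario, and the overall score is assumed to be an unweighted mean of the two evaluator scores. The mapping from an 8.11-point deviation to a 3.72% score is not shown on this page and is not reproduced here.

```python
# Values copied from the evaluation details.
WORDS_TOTAL = 498
WORDS_DIALOGUE = 389
WORD_COUNT_SCORE_PCT = 99.74
RATIO_SCORE_PCT = 3.72
REPORTED_OVERALL_PCT = 51.7338

# Target from the scenario ("Write 500 words with 70% dialogue").
TARGET_RATIO_PCT = 70.0


def dialogue_ratio_pct(words_dialogue, words_total):
    """Return dialogue words as a percentage of all words."""
    return 100.0 * words_dialogue / words_total


ratio = dialogue_ratio_pct(WORDS_DIALOGUE, WORDS_TOTAL)
# The deviation is a difference in percentage points, not a relative error.
deviation = abs(ratio - TARGET_RATIO_PCT)

# ASSUMPTION: overall is the unweighted mean of the two evaluator scores.
overall_estimate = (WORD_COUNT_SCORE_PCT + RATIO_SCORE_PCT) / 2

print(f"ratio: {ratio:.2f}%, deviation: {deviation:.2f} points")
print(f"overall estimate: {overall_estimate:.2f}% (reported: {REPORTED_OVERALL_PCT}%)")
```

The small gap between the estimate and the reported 51.7338% is what one would expect if the page averages unrounded evaluator scores and displays the individual scores rounded to two decimals.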