NC Bench
Run: 2026-05-29-01KSSJKVF2RX0D4A76V3T4XPDT
Run Details
Model: Claude Opus 4.8 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 21.1s
Cost: $0.030165
Provider: Anthropic
Evaluation summary
Score | Evaluator | Details Link
99.74% | Matches word count | (details)
0.04% | Dialogue to Total Word Ratio | (details)
49.8945% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crayon-drawn map across the table, smoothing out a corner that smelled faintly of apple juice. "Okay, Bill, walk me through it one more time." "We've been through it eleven times." "Twelve's the charm. Go." Bill sighed, pointing a stubby finger at the diagram. "We enter through the back gate at fourteen-hundred hours, disguised as clowns." "Why clowns?" "Because, Molly, nobody questions a clown at a child's birthday party. We're practically invisible." "Clowns are terrifying. A child will scream." "Children always scream. It's their natural state. The screaming is our cover." Molly tapped the map. "And the presents?" "Stacked on the gift table, here, by the bouncy castle. We load them into the bounce house, deflate it, and roll the whole thing out like a giant burrito." "A present burrito." "A lucrative present burrito." She leaned back, arms crossed. "What about the parents?" "What about them?" "They tend to notice when forty wrapped gifts go missing, Bill." "Which is why," he said, producing a second drawing with a flourish, "we deploy the distraction. A piñata." "A piñata." "Filled not with candy, but with cash. We crack it open mid-heist, money rains down, total chaos. While the adults scramble for twenties, we slip out the back." Molly stared at him. "You want to *give* them money to *steal* their presents?" "It's called an investment." "It's called insane." "Crime is a numbers game, Molly. Those presents are worth, conservatively, eight hundred dollars. We seed the piñata with two hundred. We profit six." "Minus the cost of the bounce house." "We rent it." "You rented a getaway bounce house?" "Under a fake name. Bobo the Magnificent." Molly pinched the bridge of her nose. "And how do we get this deflated bounce house full of stolen toys out of a suburban backyard?" "The ice cream truck." "We don't have an ice cream truck." "We will by Saturday." Bill grinned. "I'm acquiring one." "Acquiring." "Borrowing. Indefinitely. 
From a man named Sal who owes me forty dollars." Molly was quiet for a long moment. "Bill, do you ever think about how we used to rob banks?" "Banks have cameras, Molly. Banks have guards. You know what a children's party has? Juice boxes and a fundamental misunderstanding of perimeter security." "It's a six-year-old's birthday." "Six-year-olds have rich parents. Rich parents buy iPads, gold-plated whatevers, and that big LEGO set, the Death Star, retails for nine hundred dollars *alone*." She paused. "The Death Star's there?" "My sources confirm it." "...Your sources being?" "The Evite. I hacked the Evite." Molly slowly pulled the map back toward herself, studying it with fresh eyes. "We'd need a third clown." "I knew you'd come around." "For the bounce castle. One person can't roll it alone." "I'll call Carl." "Carl's banned from children's parties." Bill blinked. "Since when?" "Since the puppet incident." "That was never proven." He stood, stretching. "Fine. We'll use Janet." "Janet's allergic to face paint." "Then she can be the *quiet* clown." Molly smiled despite herself. "I hate that this might work."
created_at
"2026-05-29T09:56:28.267Z"
5/29/2026, 9:56:28 AM
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1119
moderation_latency
(null)
generation_time
21094
tokens_prompt
77
tokens_completion
867
native_tokens_prompt
93
native_tokens_completion
1188
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.030165
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_01CJ7tYqiZJ9PkSeMhKWt4ir"
is_byok
false
latency
1119
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780048588-i0x7KT4mrLThnXnm1fB5"
session_id
(null)
api_type
"completions"
id
"gen-1780048588-aG3JBLcoMnezuXZEWvw4"
upstream_id
"msg_01CJ7tYqiZJ9PkSeMhKWt4ir"
total_cost
0.030165
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
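The cost and timing fields above can be cross-checked with a little arithmetic. The sketch below assumes rates of 5 USD per million prompt tokens and 25 USD per million completion tokens. These rates are not stated anywhere on this page; they are an inference, chosen only because they reproduce the reported total_cost of 0.030165 from the native token counts, and should not be read as published pricing. The helper name `cost_micro_usd` is hypothetical.

```python
# Values copied from the metadata fields above.
NATIVE_TOKENS_PROMPT = 93
NATIVE_TOKENS_COMPLETION = 1188
GENERATION_TIME_MS = 21094
REPORTED_TOTAL_COST_USD = "0.030165"

# ASSUMPTION: per-million-token rates in USD, inferred from the totals,
# not taken from the page.
INPUT_USD_PER_MTOK = 5
OUTPUT_USD_PER_MTOK = 25


def cost_micro_usd(prompt_tokens: int, completion_tokens: int) -> int:
    """Return the cost in millionths of a dollar.

    tokens * (USD per million tokens) equals micro-USD, so integer
    arithmetic avoids floating-point rounding noise.
    """
    return (prompt_tokens * INPUT_USD_PER_MTOK
            + completion_tokens * OUTPUT_USD_PER_MTOK)


micro_usd = cost_micro_usd(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION)
cost_usd = f"{micro_usd / 1_000_000:.6f}"
print("Estimated cost (USD):", cost_usd)
print("Matches reported total_cost:", cost_usd == REPORTED_TOTAL_COST_USD)

# Throughput in native completion tokens per second of generation time.
tokens_per_second = NATIVE_TOKENS_COMPLETION / (GENERATION_TIME_MS / 1000)
print(f"Throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumed rates the completion tokens account for almost all of the cost (29,700 of 30,165 micro-dollars), which is expected for a short prompt and a roughly 500-word response.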
Evaluation details
Result | Evaluator | Details | Meta Data
99.74% | Matches word count | n/a | neededClean: false, words: 504
0.04% | Dialogue to Total Word Ratio | Ratio: 82.42%, Deviation: 12.42% | neededClean: false, wordsTotal: 512, wordsDialogue: 422
49.8945%
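The ratio and deviation reported in the evaluation details follow directly from the word counts and the 70% target in the scenario. The sketch below reproduces them. It also checks the overall figure under the assumption that it is an unweighted mean of the two evaluator scores; the page does not state how the overall score is combined, nor how each evaluator turns a word count or a deviation into a percentage, so those scoring functions are not modelled here. The function name `dialogue_ratio` is hypothetical.

```python
# Values copied from the "Dialogue to Total Word Ratio" row above.
WORDS_TOTAL = 512
WORDS_DIALOGUE = 422
TARGET_RATIO = 0.70  # from the scenario: "70% dialogue"


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all counted words that fall inside dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


ratio = dialogue_ratio(WORDS_DIALOGUE, WORDS_TOTAL)
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# ASSUMPTION: the overall score is the unweighted mean of the evaluator
# scores. Only the rounded sub-scores are shown on the page, so the result
# agrees with the reported 49.8945% to two decimal places only.
sub_scores = [99.74, 0.04]
overall = sum(sub_scores) / len(sub_scores)
print(f"Overall from rounded sub-scores: {overall:.2f}%")
```

The two evaluators count words slightly differently (504 versus 512 total words for the same response), so the ratio here uses the 512 figure reported alongside the dialogue count.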