NC Bench
Run: 2026-05-29-01KSSK5R7YP6YRTYZK9VWSZMWB
Run Details
Model: Claude Opus 4.8 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 19.6s
Cost: $0.027640
Provider: Anthropic
Evaluation summary
Matches word count: 99.92%
Dialogue to Total Word Ratio: 97.85%
Overall: 98.8865%
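The overall figure is consistent with an unweighted mean of the two evaluator scores. The source does not state the aggregation rule, so the sketch below is an assumption, and `overall_score` is a hypothetical name. The component scores are displayed rounded to two decimals, which would explain why the mean of the displayed values (about 98.885%) differs slightly from the reported 98.8865%.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores (assumed aggregation rule)."""
    if not scores:
        raise ValueError("at least one score is required")
    return sum(scores) / len(scores)


# Scores as displayed in the evaluation summary (rounded to two decimals).
displayed_scores = [99.92, 97.85]
mean_of_displayed = overall_score(displayed_scores)

# Gap between the mean of the rounded scores and the reported overall score.
reported_overall = 98.8865
gap = abs(mean_of_displayed - reported_overall)
print(f"mean of displayed scores: {mean_of_displayed:.4f}%, gap: {gap:.4f}")
```

A gap below 0.005 is what rounding of the inputs alone could produce, so the mean remains a plausible reading rather than a confirmed one.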
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint across the hood of the rusted Buick, smoothing its crinkled edges with the reverence of a general surveying a battlefield. The afternoon sun glinted off her oversized sunglasses, which she refused to remove despite being parked in a shadowy alley. "Okay, Bill, focus," she said, jabbing a finger at the paper. "The Hendersons' backyard. Saturday. Fourteen hundred hours sharp." Bill leaned in, squinting. He had a habit of breathing through his mouth when he concentrated, which made him sound like an asthmatic walrus. "This is just a drawing of a bouncy castle," he said. "It's a *layout*. The bouncy castle is here, the snack table is here, and the present pile—" she tapped dramatically, "—is right by the gazebo." Bill scratched his stubbled chin. He'd stolen jewelry, cars, and once, memorably, an entire vending machine. But this felt different. This felt wrong in a way he couldn't quite articulate. "They're seven-year-olds, Molly." "They're *spoiled* seven-year-olds with rich parents. Do you know what kids get these days? iPads. Drones. One of those little electric cars that cost more than my apartment." He had to admit, the electric car was tempting. The plan, as Molly explained it over the next twenty minutes, was elegant in its insanity. Bill would dress as a clown—a profession universally trusted yet universally ignored—and create a distraction near the bouncy castle. Meanwhile, Molly would slip behind the gazebo dressed as a caterer, load the presents into oversized cake boxes, and wheel them out to the van before anyone noticed the cake had legs. "I'm not wearing the clown suit," Bill said. "You already agreed to the clown suit." "I agreed to *a* costume. I assumed it would be dignified." Molly pulled the rubber nose from her jacket pocket and held it up like an interrogator presenting evidence. Bill stared at it the way a man stares at his own mortality. "Fine," he muttered, snatching it. 
"But if I get hit with a water balloon, I'm walking." "You won't get hit with a water balloon." "Children are unpredictable, Molly. They're tiny anarchists with sugar in their veins." She rolled her eyes, folding the blueprint with crisp efficiency. There was a beauty to the simplicity of it all. No lasers, no vaults, no armed guards. Just frosting, balloon animals, and the trusting naivety of suburban parents who thought a clown was a clown. Bill turned the rubber nose over in his hands, a strange melancholy settling over him. Somewhere in the back of his criminal heart, a tiny voice questioned whether stealing birthday presents crossed a moral line that even thieves shouldn't cross. Then he remembered the electric car. "What time do we get there?" he asked. Molly grinned, sliding her sunglasses down to reveal eyes sparkling with greed and showmanship. She clapped him hard on the shoulder, nearly knocking the nose from his grip. "Now you're thinking," she said. "Suit up, Bozo. We've got a party to ruin."
created_at: "2026-05-29T10:06:14.789Z" (5/29/2026, 10:06:14 AM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1353
moderation_latency: (null)
generation_time: 19604
tokens_prompt: 77
tokens_completion: 785
native_tokens_prompt: 93
native_tokens_completion: 1087
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.02764
router: (null)
provider_responses[0]:
  endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
  id: "msg_015TK3rhkxEeqa7Zghcowdt6"
  is_byok: false
  latency: 1353
  model_permaslug: "anthropic/claude-4.8-opus-20260528"
  provider_name: "Anthropic"
  status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780049174-RsWWzjYhLK0ZTUcu1kNu"
session_id: (null)
api_type: "completions"
id: "gen-1780049174-5PPL54Vdp2o6WTAq1rqI"
upstream_id: "msg_015TK3rhkxEeqa7Zghcowdt6"
total_cost: 0.02764
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
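The timing and cost fields above can be cross-checked with a little arithmetic. The sketch below assumes that `generation_time` is in milliseconds (it matches the 19.6s duration in Run Details) and that cost is billed on the native token counts. The per-million-token prices are hypothetical values, not taken from the source; they were chosen only because they happen to reproduce the reported `total_cost`.

```python
# Values copied from the run metadata.
GENERATION_TIME_MS = 19604
NATIVE_TOKENS_PROMPT = 93
NATIVE_TOKENS_COMPLETION = 1087

# ASSUMPTION: prices in USD per million tokens; not stated in the source.
ASSUMED_INPUT_USD_PER_MTOK = 5.0
ASSUMED_OUTPUT_USD_PER_MTOK = 25.0


def tokens_per_second(tokens, elapsed_ms):
    """Completion throughput, assuming elapsed_ms is in milliseconds."""
    return tokens / (elapsed_ms / 1000.0)


def estimated_cost(prompt_tokens, completion_tokens,
                   input_price=ASSUMED_INPUT_USD_PER_MTOK,
                   output_price=ASSUMED_OUTPUT_USD_PER_MTOK):
    """Cost in USD under the assumed per-million-token prices."""
    return (prompt_tokens * input_price
            + completion_tokens * output_price) / 1_000_000


throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
cost = estimated_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION)
print(f"{GENERATION_TIME_MS / 1000:.1f}s, {throughput:.1f} tokens/s, ${cost:.6f}")
```

If the real prices or billing basis differ, only the two `ASSUMED_*` constants need to change.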
Evaluation details
Matches word count: 99.92%
  Details: n/a
  neededClean: false
  words: 497
Dialogue to Total Word Ratio: 97.85%
  Details: Ratio: 29.34%, Deviation: 0.66%
  neededClean: false
  wordsTotal: 501
  wordsDialogue: 147
Overall: 98.8865%
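The reported ratio and deviation follow directly from `wordsDialogue` and `wordsTotal`. The sketch below reproduces that arithmetic and shows one possible way to count dialogue words. The counting rule (whitespace-separated tokens inside straight double quotes) and all function names are assumptions; the benchmark's actual tokenizer is not described in the source, nor is the mapping from deviation to the 97.85% score.

```python
import re


def count_words(text):
    """Count whitespace-separated tokens (assumed word definition)."""
    return len(text.split())


def count_dialogue_words(text):
    """Count words inside straight double quotes (assumed dialogue rule)."""
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in quoted_spans)


def ratio_and_deviation(words_dialogue, words_total, target_pct=30.0):
    """Return (dialogue ratio in percent, absolute deviation from target)."""
    ratio = 100.0 * words_dialogue / words_total
    return ratio, abs(target_pct - ratio)


# Counts reported in the evaluation details.
ratio, deviation = ratio_and_deviation(words_dialogue=147, words_total=501)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 29.34%, Deviation: 0.66%"
```

For example, `count_dialogue_words('"Okay, Bill, focus," she said.')` returns 3 of the sentence's 5 words under these rules.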