NC Bench
Run: 2026-05-29-01KSSJNRE2R3VJHKZMC4BR6EY3
Run Details
Model: Claude Opus 4.8 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 21.8s
Cost: $0.028515
Provider: Anthropic
Evaluation summary
Score | Evaluator
9.61% | Matches word count
0.20% | Dialogue to Total Word Ratio
4.9047% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprints across the table, smoothing out the crumpled paper with the reverence of a surgeon preparing for a delicate operation. The diagram, drawn in crayon, depicted a suburban backyard complete with a bouncy castle and what appeared to be a clown. "You drew this?" Bill asked, leaning over her shoulder. "I had limited resources." Bill pointed at a red blob in the corner. "What's that supposed to be?" "That's the magician. He goes on at three, which gives us a fifteen-minute window during the disappearing dove trick." The afternoon sun filtered through the dusty blinds of their hideout, illuminating dust motes that danced above their meticulous planning. Bill scratched his stubble, studying the layout with the intensity of a man who had clearly never planned anything more complex than a sandwich. "The presents will be stacked on the gift table," Molly continued, tapping a brown rectangle. "We grab everything, we're out before the candles are lit." "And the kids?" "What about them?" Bill shifted uncomfortably. "I just feel like stealing from children is, you know, a new low." Molly fixed him with a withering stare. "Last week you stole a man's prosthetic leg." "That was different. He was insured." The plan, as Molly explained it, was deceptively simple. They would arrive disguised as entertainers—she as a face painter, he as the catering staff. While the children gathered for the magic show, Bill would back their van up to the side gate, and Molly would load the loot through the gap in the fence. "What if someone asks me to paint their face?" Bill said. "You're the caterer, Bill. Caterers don't paint faces." "Right. Obviously." He picked up a glittery butterfly stencil from her supply box and examined it. "But hypothetically—" "You paint a butterfly. A simple one. Then you make an excuse and leave." They spent the next hour rehearsing their cover stories, with Bill repeatedly forgetting whether he specialized in finger sandwiches or fairy bread. 
Molly, meanwhile, perfected a smile so warm and maternal it could melt the suspicions of any helicopter parent. "The trick," she said, applying lipstick in a cracked compact mirror, "is to look like you belong. Children's parties are chaos. Nobody knows who anybody is. We blend in, we strike, we vanish." Bill nodded, though a flicker of doubt crossed his features. "There's just one thing bothering me." "What now?" "These presents." He gestured vaguely. "They're for a six-year-old. It's gonna be glittery socks and dinosaur toys. Where's the profit?" Molly's smile widened into something almost sinister. She reached into her bag and produced a glossy invitation, holding it up like a winning lottery ticket. "Read whose party it is." Bill squinted at the gold embossed lettering. His eyebrows shot toward his hairline as comprehension dawned, the name registering with the force of a slap. "The toy magnate's grandson?" "The very same. Word is the grandfather's gifting him a vintage train set worth more than this entire house." She snapped the invitation shut. "Suddenly the moral high ground's looking a little lonely up there, isn't it?" Bill grinned. "Pass me that butterfly stencil."
created_at
"2026-05-29T09:57:30.7Z"
5/29/2026, 9:57:30 AM
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1249
moderation_latency
(null)
generation_time
21755
tokens_prompt
77
tokens_completion
849
native_tokens_prompt
93
native_tokens_completion
1122
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.028515
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_014ddoMqYAAFP6C9DLsMcfhs"
is_byok
false
latency
1249
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780048650-lWCks6zuKbtV3Nn441uD"
session_id
(null)
api_type
"completions"
id
"gen-1780048650-r1FULmHy3kEGWPZepJQh"
upstream_id
"msg_014ddoMqYAAFP6C9DLsMcfhs"
total_cost
0.028515
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
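The reported total_cost can be cross-checked against the native token counts above. The sketch below is a sanity check, not the provider's billing logic: the per-million-token rates are assumptions (the page does not state them), chosen because they reproduce the reported figure. Treating generation_time as milliseconds is also an inference, based on its agreement with the 21.8s Duration.

```python
# Assumed USD rates per million tokens; NOT stated on the page.
ASSUMED_INPUT_RATE = 5.00
ASSUMED_OUTPUT_RATE = 25.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate request cost in USD from native token counts."""
    micro_usd = (prompt_tokens * ASSUMED_INPUT_RATE
                 + completion_tokens * ASSUMED_OUTPUT_RATE)
    return micro_usd / 1_000_000


# Values copied from the metadata: native_tokens_prompt, native_tokens_completion.
cost = estimate_cost(93, 1122)
print(f"${cost:.6f}")  # $0.028515, matching total_cost

# Throughput, assuming generation_time (21755) is in milliseconds.
tokens_per_second = 1122 / (21755 / 1000)
print(f"{tokens_per_second:.1f} completion tokens/s")
```

If the actual rates differ, the match with total_cost would be a coincidence, so the check is only as reliable as the assumed pricing.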
Evaluation details
Result | Evaluator | Details | Meta Data
9.61% | Matches word count | n/a | neededClean: false, words: 522
0.20% | Dialogue to Total Word Ratio | Ratio: 41.14%, Deviation: 11.14% | neededClean: false, wordsTotal: 525, wordsDialogue: 216
4.9047% | Overall
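The Ratio and Deviation figures follow directly from the reported word counts and the 30% target in the scenario. The sketch below reproduces that arithmetic. The `dialogue_words` helper is hypothetical: it assumes dialogue means text between straight double quotes and that words are whitespace-separated, which may not match how the benchmark's evaluator tokenizes the response.

```python
import re

TARGET_RATIO = 0.30  # from the scenario: "30% dialogue"


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Hypothetical counter: words found between straight double quotes."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_RATIO) -> tuple[float, float]:
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts taken from the evaluator metadata: wordsDialogue and wordsTotal.
ratio, deviation = ratio_and_deviation(216, 525)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 41.14%, Deviation: 11.14%
```

The Overall figure of 4.9047% is consistent with an unweighted mean of the two evaluator scores (9.61% and 0.20% average to roughly 4.905%), although the page does not state the aggregation rule. How each deviation is mapped to a score is likewise not shown.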