NC Bench
Run: 2026-06-01-01KT1G3G46S7DNY1ZN67VC61Y5
Run Details
Model: Claude Opus 4.8 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 20.5s
Cost: $0.028340
Provider: Anthropic
Evaluation summary
Matches word count: 99.38%
Dialogue to Total Word Ratio: 3.24%
Overall: 51.3095%
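The Overall figure looks consistent with an unweighted mean of the two evaluator scores. That is an inference from the numbers shown on this page, not a formula the page documents, so the sketch below treats it as an assumption and only checks that the displayed values agree to within rounding.

```python
# Evaluator scores as displayed on the page (already rounded to two decimals).
scores = {
    "Matches word count": 99.38,
    "Dialogue to Total Word Ratio": 3.24,
}

# Assumption: Overall is the unweighted mean of the evaluator scores.
overall = sum(scores.values()) / len(scores)

# The page reports 51.3095%; a small gap is expected because the inputs are rounded.
reported_overall = 51.3095
print(f"computed {overall:.2f}% vs reported {reported_overall}%")
```

If the benchmark weights its evaluators differently, only the line computing `overall` would need to change.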
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly spread the crayon-drawn map across the hood of their getaway sedan, a 1987 Buick that smelled perpetually of regret and stale french fries. She'd lifted the map from a recycling bin outside the Hendersons' house, and it featured a remarkably detailed rendering of where the bouncy castle would be. "Tell me again why we're robbing a six-year-old," Bill said, adjusting his ski mask. It was bright yellow, the only one left at the dollar store, and made him look less like a criminal and more like a sad banana. "Because nobody suspects the children's party circuit," Molly replied, tracing a finger along the map. "Think about it. Cops are watching banks, jewelry stores, art galleries. Meanwhile, there's a goldmine of unopened gifts sitting on a folding table next to the juice boxes." Bill frowned, which was hard to see beneath the banana mask. "What kind of loot are we talking? Friendship bracelets? A stuffed giraffe?" "This is the Patterson kid's party. His grandfather invented a kind of screw. They're loaded." Molly pulled a clipboard from the backseat and clicked her pen with professional menace. "I'm talking premium gifts. Remote control cars. Maybe a tablet. There's serious resale value in lightly-used educational toys." They crouched behind a hedge as the party raged twenty feet away. A clown named Mr. Giggles was failing to entertain a hostile crowd of first-graders, his balloon animals deflating under the weight of their judgment. "Okay, here's the plan," Molly whispered. "We wait for the piñata. That's peak chaos. Every kid will be swarming for candy, the parents will be filming on their phones, and we walk straight to the gift table." "And if someone stops us?" "We're entertainers. I'll juggle. You distract them with that face." Bill touched his mask defensively. "It was the last one." The piñata moment arrived. A papier-mâché unicorn swung from the oak tree, and a blindfolded child swung a plastic bat with the fury of a tiny berserker. 
Candy exploded across the lawn. The crowd descended. "Now," Molly hissed. They strode across the grass with the confidence of people who absolutely belonged there. The gift table came into view, gloriously stacked with shiny packages. That's when a small girl in a tutu planted herself directly in their path, arms crossed, eyes narrowed like a tiny IRS auditor. "Who are you?" she demanded. Bill froze. Molly improvised. "We're the present inspectors. Standard procedure. Gotta check the gifts for, uh, safety hazards." The girl studied them for an agonizing moment. Then she pointed at Bill. "Why is he a banana?" "He's the banana inspector," Molly said smoothly. This satisfied no one, least of all the girl, who turned and screamed, "MOM! BANANA MAN IS STEALING TIMMY'S PRESENTS!" What followed was the single most undignified retreat of Molly and Bill's combined criminal careers. They fled across the lawn pursued by a horde of sugar-fueled children, one furious mother, and a clown who, frankly, seemed grateful for the distraction. They escaped with exactly one gift: a pair of socks.
created_at: "2026-06-01T11:46:30.669Z" (6/1/2026, 11:46:30 AM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1041
moderation_latency: 174
generation_time: 20242
tokens_prompt: 77
tokens_completion: 806
native_tokens_prompt: 93
native_tokens_completion: 1115
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.02834
router: (null)
provider_responses:
  0:
    endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
    id: "msg_0187vuq2xZUc991oHogZ7xPN"
    is_byok: false
    latency: 1041
    model_permaslug: "anthropic/claude-4.8-opus-20260528"
    provider_name: "Anthropic"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780314390-s2VpE8BGirUpF4CGoxjK"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1780314390-byp2ZL3RXGlQJVdhWS85"
upstream_id: "msg_0187vuq2xZUc991oHogZ7xPN"
total_cost: 0.02834
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
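A few derived figures can be read off this metadata. The sketch below assumes that generation_time is in milliseconds (which fits the 20.5s Duration shown under Run Details) and uses hypothetical per-million-token rates of $5 for input and $25 for output. Those rates are not stated anywhere on this page; they are included only because they happen to reproduce the reported total_cost from the native token counts.

```python
# Values copied from the metadata above.
native_tokens_prompt = 93
native_tokens_completion = 1115
generation_time_ms = 20242      # assumption: the unit is milliseconds
reported_total_cost = 0.02834   # USD

# Output throughput over the generation window.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Hypothetical rates in USD per million tokens (NOT taken from this page).
INPUT_RATE_PER_MTOK = 5
OUTPUT_RATE_PER_MTOK = 25

# Sum in integer micro-dollars first to avoid floating-point drift.
cost_micro_usd = (native_tokens_prompt * INPUT_RATE_PER_MTOK
                  + native_tokens_completion * OUTPUT_RATE_PER_MTOK)
estimated_cost = cost_micro_usd / 1_000_000

print(f"throughput: {tokens_per_second:.1f} tokens/s")
print(f"estimated cost: ${estimated_cost:.6f} (reported ${reported_total_cost})")
```

The match on cost suggests billing follows the native token counts rather than tokens_prompt and tokens_completion, but that reading depends entirely on the assumed rates.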
Evaluation details

Evaluator: Matches word count
Result: 99.38%
Details: n/a
Meta Data: neededClean: false, words: 505

Evaluator: Dialogue to Total Word Ratio
Result: 3.24%
Details: Ratio: 38.28%, Deviation: 8.28%
Meta Data: neededClean: false, wordsTotal: 512, wordsDialogue: 196

Overall: 51.3095%
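The reported ratio and deviation follow directly from the word counts in the evaluator's meta data and the 30% target in the scenario. The sketch below reproduces that arithmetic. It also includes a hypothetical `count_dialogue_words` helper that treats text inside straight double quotes as dialogue; the evaluator's real counting rules are not shown on this page, so the helper is an assumption and is not expected to return exactly 196 and 512 for the response above. How the 8.28% deviation is mapped to the 3.24% score is likewise not documented here and is not modeled.

```python
import re

TARGET_RATIO = 0.30  # from the scenario: "Write 500 words with 30% dialogue"


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Hypothetical counter: words inside straight double quotes vs. all words."""
    spoken = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(segment.split()) for segment in spoken)
    words_total = len(text.split())
    return words_dialogue, words_total


# Counts reported by the evaluator for this run.
words_dialogue, words_total = 196, 512

ratio = words_dialogue / words_total
deviation = abs(ratio - TARGET_RATIO)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Small illustration of the hypothetical counter on one line of the scene.
sample = 'Bill froze. "Who are you?" she demanded.'
sample_counts = count_dialogue_words(sample)
```

The two evaluators report different totals (505 words and 512 words) for the same response, so they presumably tokenize words differently; a whitespace split like the one in the helper is only one of several plausible conventions.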