NC Bench
Run: 2026-06-01-01KT1GJB5C71BP7SRKR238E52Z
Run Details
Model: Claude Opus 4.8 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 19.9s
Cost: $0.028040
Provider: Anthropic
Evaluation summary
Score | Evaluator
97.63% | Matches word count
14.45% | Dialogue to Total Word Ratio
56.0371% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint across the kitchen table, smoothing out the crayon-stained paper with the reverence of a general surveying a battlefield. Bill leaned over her shoulder, chewing the end of a licorice rope he'd liberated from the corner store that morning. "You're sure this is the layout?" he asked, squinting at the wobbly lines. "Drew it myself. Cased the place yesterday during the kid's piano lesson." She tapped a rectangle near the bottom. "Gift table. Right here, by the bouncy castle." Bill frowned. The blueprint was, by any reasonable standard, terrible. The proportions suggested the house was the size of a small nation, and someone—Molly, presumably—had labeled the bathroom "DANGER ZONE" with three exclamation points. "What's the danger zone about?" "Clowns regroup there." He nodded as though this made perfect sense, because in their line of work, it unfortunately did. Last summer's birthday party debacle in Hoboken had taught them both that an angry man in face paint could move with terrifying speed when motivated. Molly produced a second sheet, this one a meticulously timed schedule. "Cake comes out at two. That's our window. Everyone's distracted, singing, eating. Sugar makes children slow and stupid." "That's beautiful," Bill said, genuinely moved. She ignored him. "You handle the loot. I'll create a diversion." "What kind of diversion?" Molly's expression flickered with something he didn't like. He'd seen that look before, right before the incident with the petting zoo. "I'm going to be the magician." Bill set down his licorice. This felt important enough to warrant full attention. "You don't know any magic." "I know how to make presents disappear. That's the only trick that matters." She stood, miming an elaborate flourish with an invisible cape. "Children love a magician. They gather round. They watch the hands. Meanwhile, you're backing the van up to the patio doors and loading every wrapped box you can carry." 
He had to admit, beneath the absurdity, there was a certain elegant logic to it. There usually was, with Molly. Her plans were ninety percent madness and ten percent inexplicable competence, and somehow that ratio kept them out of prison. "What if a parent recognizes me?" Bill asked. "Wear the panda costume." "We agreed never to discuss the panda costume." "The panda costume," Molly repeated firmly, "is an asset. Nobody questions a panda. A panda has unlimited social credit." Bill rubbed his temples. Somewhere in the apartment, the panda costume waited in a garbage bag, smelling faintly of regret and old cheese puffs. He'd hoped that chapter of his life was closed. "And the parents won't notice their kids' presents getting hauled into a getaway vehicle?" "That's why I'm a magician, Bill." She smiled the smile of a woman who had thought of everything, or at least convinced herself she had. "Misdirection. The greatest weapon known to man." He looked at the blueprint, at the bouncy castle, at the danger zone full of regrouping clowns, and felt the familiar pull of a terrible idea gaining momentum. "Saturday," he sighed. "Saturday," she agreed.
created_at: "2026-06-01T11:54:37.105Z" (6/1/2026, 11:54:37 AM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1148
moderation_latency: (null)
generation_time: 19884
tokens_prompt: 77
tokens_completion: 827
native_tokens_prompt: 93
native_tokens_completion: 1103
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.02804
router: (null)
provider_responses: 0
endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id: "msg_011qyaqfuoJo4TUgpQKG49Do"
is_byok: false
latency: 1148
model_permaslug: "anthropic/claude-4.8-opus-20260528"
provider_name: "Anthropic"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780314877-jI5DjrUJgyB5zlSz1RQ2"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1780314877-8ifsTJEDUHi0KIZ8BcqK"
upstream_id: "msg_011qyaqfuoJo4TUgpQKG49Do"
total_cost: 0.02804
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
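The timing and token fields in the metadata can be cross-checked against the run details. The sketch below is only an illustration: it assumes generation_time and latency are reported in milliseconds (consistent with the 19.9s duration shown in the run details) and that throughput is best measured against native_tokens_completion. The variable names are hypothetical and not part of any NC Bench or provider API.

```python
# Values copied from the run metadata; units are assumed to be milliseconds.
generation_time_ms = 19884
latency_ms = 1148
native_tokens_completion = 1103

# Convert to seconds; this should round to the 19.9s shown under Run Details.
duration_s = generation_time_ms / 1000

# Rough output throughput over the whole generation window (an assumption:
# the metadata does not say whether generation_time includes the latency).
tokens_per_second = native_tokens_completion / duration_s

print(f"duration: {duration_s:.1f}s")
print(f"throughput: {tokens_per_second:.2f} native completion tokens/s")
print(f"time to first token (assumed meaning of latency): {latency_ms / 1000:.3f}s")
```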
Evaluation details
Result | Evaluator | Details | Meta Data
97.63% | Matches word count | n/a | neededClean: false; words: 507
14.45% | Dialogue to Total Word Ratio | Ratio: 36.22%, Deviation: 6.22% | neededClean: false; wordsTotal: 508; wordsDialogue: 184
56.0371% | Overall
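The ratio and deviation reported in the evaluation details follow directly from the word counts, and can be reproduced with a few lines of arithmetic. The sketch below is a minimal illustration, not the benchmark's actual evaluator: the dialogue_ratio helper is hypothetical and simply counts whitespace-separated words inside straight double quotes, which may differ from the tokenization NC Bench uses. How the 14.45% score is derived from the deviation is not stated on the page, so it is not modeled here. The overall figure appears to be close to an unweighted mean of the two evaluator scores, but that is an inference from the displayed numbers.

```python
import re


def dialogue_ratio(text: str) -> float:
    """Hypothetical helper: fraction of words that sit inside double quotes."""
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return dialogue / total if total else 0.0


# Counts reported in the evaluation details.
words_total = 508
words_dialogue = 184
target_ratio = 0.30  # the scenario asks for 30% dialogue

ratio = words_dialogue / words_total
deviation = abs(ratio - target_ratio)

# Assumption: overall is the plain mean of the two displayed scores.
overall_estimate = (97.63 + 14.45) / 2

print(f"ratio: {ratio:.2%}, deviation: {deviation:.2%}")
print(f"estimated overall: {overall_estimate:.2f}%")
```

The estimated overall lands near the displayed 56.0371%; the small gap is consistent with the page rounding the individual scores before display.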