NC Bench
Run: 2026-05-29-01KSSJVMXECTQQDV0W3YR2QA52
Run Details
Model: Claude Opus 4.8 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 22.2s
Cost: $0.028315
Provider: Anthropic
Evaluation summary
Score      Evaluator
99.74%     Matches word count
0.00%      Dialogue to Total Word Ratio
49.8722%   Overall
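The Overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state this; it is an inference from the numbers shown. A minimal Python sketch under that assumption (variable names are hypothetical):

```python
# Assumption: Overall is the unweighted mean of the evaluator scores.
# The page does not document the aggregation rule.
scores = {
    "Matches word count": 99.74,           # as displayed (rounded)
    "Dialogue to Total Word Ratio": 0.00,  # as displayed
}

overall_from_displayed = sum(scores.values()) / len(scores)

# Working backwards from the reported Overall under the same assumption
# gives the unrounded word count score the page would have used.
reported_overall = 49.8722
implied_word_count_score = (
    reported_overall * len(scores) - scores["Dialogue to Total Word Ratio"]
)

print(f"Mean of displayed scores: {overall_from_displayed:.2f}%")
print(f"Implied unrounded word count score: {implied_word_count_score:.4f}%")
```

The mean of the displayed scores agrees with the reported Overall to two decimal places, and the implied word count score rounds to the displayed 99.74%.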
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly pressed her face against the chain-link fence, surveying the suburban battlefield. Streamers sagged from a carport. A bounce house wobbled ominously in the breeze, half-inflated, like a deflating ambition. "This is the place," she whispered, adjusting her ski mask, which she'd accessorized with a small party hat for reasons Bill had stopped questioning years ago. Bill consulted a crumpled flyer he'd stolen from a community bulletin board. "Timmy's seventh birthday. Magician at two, cake at three, present opening at four." He tapped the paper. "We strike at three-forty-five. Sugar coma window." "The what?" "Every kid will have just eaten cake. They'll be sluggish. Disoriented. Vulnerable." He pulled out a pair of binoculars and trained them on the backyard, where a clown was assembling balloon animals with the grim efficiency of a man who hated his life. "That's our biggest threat right there. Clowns are unpredictable." Molly frowned at the pile of gift bags accumulating on a folding table. "Those presents are tiny, Bill. We drove forty minutes for what, some action figures and a coloring book?" "It's the principle." Bill lowered the binoculars. "Nobody robs a children's party. Nobody. We'll be legends. The Birthday Bandits." "That's a terrible name." "It's a working title." A piñata shaped like a unicorn dangled from the oak tree, swaying gently. Bill stared at it with the reverence of a man beholding the Mona Lisa. "Forget the presents," he breathed. "We take the piñata." Molly slowly turned her head. "Are you having a stroke?" "Think about it. That thing is *stuffed* with candy. Premium candy. The good stuff parents buy when they feel guilty." He gripped the fence. "We grab the piñata, we eat like kings for a month." "We are grown adults." "Grown adults with refined palates and empty pantries." A child shrieked somewhere in the chaos, and both thieves ducked instinctively behind a recycling bin. After a tense moment, Molly peeked over the rim. 
"New plan," she said. "We wait until they blindfold the kid for the piñata. Confusion reigns. We swoop in, grab everything that isn't bolted down, and exit through the side gate." Bill nodded approvingly. "Now *that's* tactical." "Obviously. One of us has a brain." "And the other has snacks." He produced a juice box from his cargo pants and stabbed it triumphantly with the tiny straw. "Reconnaissance fuel." "Where did you even get that?" "Lifted it from the cooler when you weren't looking. I'm a professional." Molly stared at him, then snatched the juice box and drained it in one long, defeated pull. "Fine. We do it. But if I get tackled by a six-year-old, you're explaining it to my mother." "Your mother thinks you work at a bank." "Exactly." She tossed the empty box over her shoulder. "Let's go ruin Timmy's special day." They crept toward the side gate, two seasoned criminals, hearts pounding, dreams enormous, dignity nowhere to be found. Somewhere ahead, the clown began to sing, and the operation officially began.
created_at: "2026-05-29T10:00:43.704Z" (5/29/2026, 10:00:43 AM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1304
moderation_latency: (null)
generation_time: 21904
tokens_prompt: 77
tokens_completion: 801
native_tokens_prompt: 93
native_tokens_completion: 1114
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.028315
router: (null)
provider_responses:
  0:
    endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
    is_byok: false
    latency: 246
    model_permaslug: "anthropic/claude-4.8-opus-20260528"
    provider_name: "Anthropic"
    status: 429
  1:
    endpoint_id: "dfc0e5bd-d703-4fe2-a7bb-655eb95d5441"
    id: "msg_01Ke3SMW4SNEaYfVjtrzsPYt"
    is_byok: false
    latency: 1304
    model_permaslug: "anthropic/claude-4.8-opus-20260528"
    provider_name: "Anthropic"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780048843-pDSlJQO1lfRBVcdGLs8h"
session_id: (null)
api_type: "completions"
id: "gen-1780048843-6SvZsfRYUHy9D4XYcEhE"
upstream_id: "msg_01Ke3SMW4SNEaYfVjtrzsPYt"
total_cost: 0.028315
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
Evaluation details
Result     Evaluator                      Details                             Meta Data
99.74%     Matches word count             n/a                                 neededClean: false, words: 496
0.00%      Dialogue to Total Word Ratio   Ratio: 48.61%, Deviation: 18.61%    neededClean: false, wordsTotal: 502, wordsDialogue: 244
49.8722%   Overall
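The Ratio and Deviation reported for the dialogue evaluator can be reproduced from its metadata with simple arithmetic. The sketch below assumes the ratio is wordsDialogue divided by wordsTotal and that the deviation is the absolute difference from the 30% target named in the scenario. The function names are hypothetical, and the rule that maps the deviation to the 0.00% score is not shown on the page, so it is not modeled here.

```python
def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the whole text, as a percentage."""
    return 100.0 * words_dialogue / words_total


def deviation_pct(ratio_pct: float, target_pct: float) -> float:
    """Absolute distance between measured ratio and target, in percentage points."""
    return abs(ratio_pct - target_pct)


# Values taken from the evaluator metadata of this run.
WORDS_DIALOGUE = 244
WORDS_TOTAL = 502
TARGET_PCT = 30.0  # from the scenario: "Write 500 words with 30% dialogue"

ratio = dialogue_ratio_pct(WORDS_DIALOGUE, WORDS_TOTAL)
deviation = deviation_pct(ratio, TARGET_PCT)

summary = f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%"
print(summary)  # Ratio: 48.61%, Deviation: 18.61%
```

Note that the two evaluators report different totals (496 words versus 502), which suggests they tokenize the response differently; the page does not explain the difference.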