NC Bench
Run: 2026-05-31-01KSYXDXMZC93P14CF41NBWZFT
Run Details
Model:
Claude Opus 4.8 (Reasoning, High)
Test:
Dialogue tags
Scenario:
Write 500 words with 50% dialogue
Duration:
21.2s
Cost:
$0.029290
Provider:
Anthropic
Evaluation summary
Matches word count:
93.65%
Dialogue to Total Word Ratio:
0.00%
Overall:
46.8248%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint across the table, smoothing it flat with the reverence of a surgeon. Bill leaned in, squinting at the crayon drawing. "This is a map of a McDonald's PlayPlace," he said. "It's the layout of the party venue, Bill. I had to draw it from memory." "You drew a clown with three eyes." "That's Pickles the Magician. He's a complication." Molly tapped the paper. "He does balloon animals at the two o'clock mark. That's our window." Bill folded his arms. "Let me get this straight. We're casing a seven-year-old's birthday party." "Theodore's eighth, actually. And do you know what an eight-year-old gets these days? Tablets, gaming consoles, a remote-control drone worth four hundred dollars. We're talking premium electronics, all gift-wrapped and stacked in one room." "And guarded by?" "Approximately twelve children and a bouncy castle." Bill rubbed his temples. "I have a bad feeling about the bouncy castle." "The bouncy castle is the least of your worries." Molly produced a second sheet. "Here's the schedule. Cake at one-thirty. Presents opened at three. We strike during the piñata at two-fifteen, when chaos is at maximum and every adult is filming." "Why the piñata specifically?" "Because, Bill, when twelve sugar-crazed children are swinging a bat blindfolded, no parent is looking at the gift table. They're looking at their kids, praying nobody loses a tooth." He nodded slowly. "That's actually disturbingly smart." "I have my moments." She slid a duffel bag across the floor. "Costumes. You're going as the party entertainer's assistant." Bill unzipped it and recoiled. "This is a hot dog suit." "It's a costume. Nobody questions a man in a hot dog suit at a children's party. You're practically furniture." "I can't bend my arms in this." "You don't need arms. You need to be a distraction. While Pickles fumbles his rabbit and I load the gifts, you run around being a hot dog. Children love that." Bill held the suit at arm's length like it was radioactive. 
"And if something goes wrong?" "Nothing goes wrong. We've accounted for every variable." Molly began counting on her fingers. "The magician, the piñata, the parents, the snack table—" "What about the kids?" She paused. "What about them?" "Molly, children are tiny and feral and they outnumber us six to one. They don't follow plans. They bite. My nephew once headbutted a police horse." Molly waved a dismissive hand. "Children are predictable. Offer them sugar, point at something shiny, done." "You said that about the Doberman last month. I still limp when it rains." "This is completely different. The Doberman didn't believe in birthday wishes." She zipped the blueprint into a tube. "Trust me. Forty-five seconds, in and out, and we're sitting on ten grand in toys by sundown." Bill sighed and stepped one leg into the hot dog suit. "If I die in this thing, I'm haunting you." "If you die in that thing," Molly said, grinning, "I'm selling the footage."
created_at
"2026-05-31T11:41:40.395Z"
5/31/2026, 11:41:40 AM
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
3063
moderation_latency
(null)
generation_time
21127
tokens_prompt
77
tokens_completion
851
native_tokens_prompt
93
native_tokens_completion
1153
native_tokens_completion_images
(null)
native_tokens_reasoning
59
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.02929
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_0197xcqWSzrv2hS581ZKGBV2"
is_byok
false
latency
1349
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780227700-7XqaHPfQrHMolm34lWwa"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1780227700-61mKo85Qox5S0E2nWsOl"
upstream_id
"msg_0197xcqWSzrv2hS581ZKGBV2"
total_cost
0.02929
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
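The token counts and total cost in the metadata above can be cross-checked with a little arithmetic. The sketch below is only illustrative: the per-million-token rates are an assumption (they are not stated anywhere on this page) and were picked because they happen to reproduce the reported `total_cost`. The function name `estimate_cost` is hypothetical.

```python
# Values copied from the run metadata above.
NATIVE_TOKENS_PROMPT = 93
NATIVE_TOKENS_COMPLETION = 1153
GENERATION_TIME_MS = 21127
REPORTED_TOTAL_COST = 0.02929

# ASSUMPTION: USD per million tokens. These rates are not given on the page.
ASSUMED_INPUT_RATE = 5.00
ASSUMED_OUTPUT_RATE = 25.00


def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  input_rate: float, output_rate: float) -> float:
    """Return the estimated cost in USD for per-million-token rates."""
    return (prompt_tokens * input_rate
            + completion_tokens * output_rate) / 1_000_000


estimated = estimate_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION,
                          ASSUMED_INPUT_RATE, ASSUMED_OUTPUT_RATE)

# Completion tokens per second over the reported generation time.
tokens_per_second = NATIVE_TOKENS_COMPLETION / (GENERATION_TIME_MS / 1000)

print(f"estimated ${estimated:.6f} vs reported ${REPORTED_TOTAL_COST:.6f}")
print(f"throughput: {tokens_per_second:.1f} completion tokens/s")
```

If the assumed rates differ from the provider's actual pricing, the estimate will not match; the comparison is a sanity check, not a statement of how the cost was billed.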
Evaluation details
Result
Evaluator
Details
Meta Data
93.65%
Matches word count
n/a
neededClean
false
words
491
0.00%
Dialogue to Total Word Ratio
Ratio: 79.24%, Deviation: 29.24%
neededClean
false
wordsTotal
501
wordsDialogue
397
46.8248%
Overall
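The figures in the evaluation details can be reproduced from the reported counts. The sketch below is a rough illustration, not the benchmark's actual evaluator: the quote-matching rule in `dialogue_stats` (a hypothetical helper) is an assumption about how dialogue words might be counted, and treating the overall score as the unweighted mean of the two evaluator scores is an inference from the numbers shown.

```python
import re


def dialogue_stats(text: str, target_ratio: float = 0.5):
    """Count words inside straight double quotes versus all words.

    ASSUMPTION: whitespace tokenization and straight quotes only; the
    benchmark's real counting rules are not documented on this page.
    """
    total = len(text.split())
    spoken = sum(len(m.split()) for m in re.findall(r'"([^"]*)"', text))
    ratio = spoken / total if total else 0.0
    return total, spoken, ratio, abs(ratio - target_ratio)


# Counts reported in the evaluation details above.
words_total, words_dialogue = 501, 397
ratio = words_dialogue / words_total
deviation = abs(ratio - 0.5)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Reported evaluator scores; their unweighted mean is close to the
# reported overall of 46.8248% (the displayed 93.65% is itself rounded).
scores = [93.65, 0.00]
overall = sum(scores) / len(scores)
print(f"Overall (mean of evaluator scores): {overall:.4f}%")

# Small usage example on a made-up sentence.
sample = 'Bill sighed. "I hate this suit," he said.'
print(dialogue_stats(sample))
```

How a 29.24% deviation maps to a 0.00% score, and how 491 words maps to 93.65%, is not stated on the page, so the sketch does not attempt to reproduce those scoring curves.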