NC Bench
Run: 2026-05-29-01KSSJW98SEGBNQ349AA48D8W4
Run Details
Model: Claude Opus 4.8 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 19.6s
Cost: $0.028665
Provider: Anthropic
Evaluation summary
Score      Evaluator
99.92%     Matches word count
0.00%      Dialogue to Total Word Ratio
49.9595%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crayon-drawn map across the kitchen table and smoothed it flat with surgical precision. "Okay," she said. "The objective is simple. We get in, we grab the presents, we get out. Estimated haul: forty gift bags and one suspiciously large box." Bill peered at the map, frowning. "Why is there a clown drawn on this?" "That's not a clown. That's the security threat. His name is Giggles and he's working the party from two to four." "You scouted a clown?" "I scout everything, Bill. That's why I'm the brains of this operation." Bill sat back and crossed his arms. "I thought I was the brains." "You're the muscle." "I weigh a hundred and forty pounds." "You're the *aspirational* muscle." Molly tapped the map. "Now focus. The party's at the Hendersons'. Backyard bouncy castle, a piñata, and approximately fifteen children hopped up on sugar and rage." Bill swallowed. "Fifteen?" "Children are unpredictable. They travel in packs. They have no fear and tiny, grabbing hands." She produced a second drawing. "This is our entry point. We come in disguised as the entertainment." "As clowns?" "As a magician and his assistant." Bill brightened. "Do I get a cape?" "You get a cape." "I'm in." Molly rolled up the map. "Here's how it plays. I distract the parents with a card trick. You slip into the gift pile, load everything into the magic trunk, and we wheel it out the front. Nobody suspects the magic trunk. That's the beauty of the magic trunk." "And if a kid sees me?" "You say it's part of the show. Children believe anything if you say it loud and confidently." Bill nodded slowly, absorbing this wisdom. "What about the suspiciously large box?" "That," Molly said gravely, "is the prize. Probably a bike. Could be a pony." "Nobody gives a kid a pony at a birthday party." "The Hendersons are show-offs, Bill. Keep up." He leaned in. "And the cake? Do we take the cake?" "We do not take the cake." "Why not?" "Because we're thieves, not animals." She paused.
"Also it's hard to fence a sheet cake." Bill considered this. "Fair." Molly stood and began pacing, the way she always did when the plan was nearing completion. "Two o'clock, we arrive. Two-fifteen, the card trick. Two-twenty, you load the trunk. Two-thirty, we're gone, and the only thing left is fifteen crying children and a clown with no audience." "It's almost beautiful," Bill said. "It's a masterpiece of crime." "One question." "Shoot." He raised a tentative hand. "Do we actually know any magic tricks?" Molly froze. Her pacing stopped. A long, terrible silence filled the kitchen. "I can do the one where the coin disappears," she said finally. "Can you make it come back?" "...No." Bill exhaled. "So we're stealing presents *and* a coin." "It's a victimless crime if you don't count the children." "And the coin's original owner." "And the coin's original owner," she agreed. She extended her hand. "Partner?" Bill shook it. "Partner. I really want that cape."
created_at: "2026-05-29T10:01:04.544Z" (5/29/2026, 10:01:04 AM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 972
moderation_latency: (null)
generation_time: 19426
tokens_prompt: 77
tokens_completion: 814
native_tokens_prompt: 93
native_tokens_completion: 1128
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.028665
router: (null)
provider_responses:
  0:
    endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
    is_byok: false
    latency: 181
    model_permaslug: "anthropic/claude-4.8-opus-20260528"
    provider_name: "Anthropic"
    status: 429
  1:
    endpoint_id: "dfc0e5bd-d703-4fe2-a7bb-655eb95d5441"
    id: "msg_01Jm99VtrsmNhqhBnk8vKYip"
    is_byok: false
    latency: 972
    model_permaslug: "anthropic/claude-4.8-opus-20260528"
    provider_name: "Anthropic"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780048864-wweFWZ6nDjYEWsICTl0X"
session_id: (null)
api_type: "completions"
id: "gen-1780048864-GBoaztUDGHcw0fH9zcUk"
upstream_id: "msg_01Jm99VtrsmNhqhBnk8vKYip"
total_cost: 0.028665
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
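The reported total_cost can be cross-checked against the native token counts in the metadata. The sketch below is only a sanity check: the per-token rates ($5 per million input tokens, $25 per million output tokens) do not appear anywhere in this run record and are an assumption that happens to reproduce the reported figure. It also derives output throughput, assuming generation_time is expressed in milliseconds. The function names are hypothetical.

```python
from decimal import Decimal

# Values copied from the run metadata.
NATIVE_TOKENS_PROMPT = 93
NATIVE_TOKENS_COMPLETION = 1128
GENERATION_TIME_MS = 19426  # ASSUMPTION: the unit is milliseconds
REPORTED_TOTAL_COST = Decimal("0.028665")

# ASSUMPTION: these rates are not stated in the run record.
INPUT_USD_PER_MTOK = Decimal("5")
OUTPUT_USD_PER_MTOK = Decimal("25")


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the request cost in USD from native token counts."""
    total = (prompt_tokens * INPUT_USD_PER_MTOK
             + completion_tokens * OUTPUT_USD_PER_MTOK)
    return total / Decimal(1_000_000)


def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Output throughput, given a generation time in milliseconds."""
    return completion_tokens / (generation_time_ms / 1000)


if __name__ == "__main__":
    cost = estimate_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION)
    print("estimated cost:", cost)
    print("matches reported total_cost:", cost == REPORTED_TOTAL_COST)
    rate = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
    print("tokens per second:", round(rate, 1))
```

Decimal is used instead of float so the comparison with the reported cost is exact rather than subject to binary rounding.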
Evaluation details
Result     Evaluator                      Details                            Meta Data
99.92%     Matches word count             n/a                                neededClean: false; words: 503
0.00%      Dialogue to Total Word Ratio   Ratio: 77.17%, Deviation: 27.17%   neededClean: false; wordsTotal: 508; wordsDialogue: 392
49.9595%   Overall
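The Ratio and Deviation figures can be reproduced from the wordsDialogue and wordsTotal counts. This is a minimal sketch under three assumptions: the ratio is wordsDialogue divided by wordsTotal, the deviation is the absolute distance from the 50% target named in the scenario, and the overall score is the unweighted mean of the two evaluator scores. How the deviation is mapped to the 0.00% score is not shown in the run record, so that step is not modeled. The function names are hypothetical.

```python
# Values copied from the evaluation details.
WORDS_TOTAL = 508
WORDS_DIALOGUE = 392
TARGET_RATIO = 0.50  # from the scenario: 50% dialogue


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that fall inside dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation_from_target(ratio: float, target: float = TARGET_RATIO) -> float:
    """Absolute distance between the measured ratio and the target."""
    return abs(ratio - target)


def mean_score(scores: list[float]) -> float:
    """ASSUMPTION: the overall score is an unweighted mean."""
    return sum(scores) / len(scores)


if __name__ == "__main__":
    ratio = dialogue_ratio(WORDS_DIALOGUE, WORDS_TOTAL)
    deviation = deviation_from_target(ratio)
    print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
    print(f"Mean of displayed scores: {mean_score([99.92, 0.0]):.2f}%")
```

The mean of the displayed scores is 49.96%, which is consistent with the reported 49.9595% if the word-count score is closer to 99.919% before rounding for display.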