NC Bench
Run: 2026-05-31-01KSYXJ8M2AJWYQVV4V3D5GQHQ
Run Details
Model: Claude Opus 4.8 (Reasoning, High)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 21.0s
Cost: $0.028465
Provider: Anthropic
Evaluation summary
Score | Evaluator | Details Link
100.00% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
50.0000% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Bill spread the crinkled blueprint across the table, smoothing it with the reverence of a man unveiling the Mona Lisa. "Behold," he said. "The Henderson residence. Site of one Timmy Henderson's eighth birthday party." Molly leaned in, squinting. "Bill, this is a drawing of a house. A child drew this." "I commissioned it from a reliable source." "The reliable source used a crayon. There's a sun with sunglasses in the corner." "Aurora," Bill said with dignity, "is seven and the best inside operative we've got. Now focus. The presents accumulate here, on the gift table, between two and four PM." Molly crossed her arms. "And how exactly do two grown adults blend into a children's birthday party?" "Disguises." Bill produced a paper bag and dumped its contents triumphantly. A clown wig. A unicorn onesie. A mustache. "That's a mustache." "For me." "You'll be a mustached man lurking at a child's party. They'll arrest you before you reach the cake." Bill waved this away. "Details. The plan is foolproof. You distract with balloon animals while I commandeer the gift table." "I can't make balloon animals." "You make a balloon. Call it a snake. Children are not critics." Molly picked up the unicorn onesie, holding it at arm's length like it might bite. "And I'm assuming this monstrosity is mine?" "It conveys whimsy. Trust. You'll be one of them." "I'm five foot nine, Bill. I will be a horrifying giant unicorn. The children will scream." "Screaming is chaos. Chaos is cover." He tapped the blueprint. "We're in, we're out, we're rich. Do you know what people spend on eight-year-olds now? iPads. Drones. There's a kid getting a *pony*, Molly." "We can't fit a pony in the getaway car." "The pony is symbolic of the wealth involved." Molly sighed and sat down. "Walk me through the exit." "Simple. Once the presents are bagged, I signal you with the secret word." "Which is?" "'Piñata.'" "Bill, there will be an *actual* piñata. 
You'll say 'piñata' and forty children will swarm a papier-mâché donkey while I stand there in a unicorn costume holding a bag of stolen Legos." A long silence settled over the table. Bill stroked his chin thoughtfully. "The contingency plan," he said slowly, "accounts for that." "There's a contingency plan?" "There is now. New signal word. 'Asparagus.' No child has ever gotten excited about asparagus." Molly stared at him for a long moment, then, against every instinct she possessed, felt the corner of her mouth twitch. "This is the dumbest job we've ever planned." "And yet," Bill said, sliding the clown wig toward her, "you haven't left." She picked up the wig, turning it over in her hands. "If I do this—*if*—I want fifty percent and you wear the unicorn." "Sixty percent and you wear the unicorn." "Fifty-five, and we never speak of the mustache again." Bill extended a hand across the table, beaming. "Molly, my dear, we're going to ruin a child's birthday." "It's the only thing we've ever been good at."
created_at
"2026-05-31T11:44:02.704Z"
5/31/2026, 11:44:02 AM
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1032
moderation_latency
(null)
generation_time
20943
tokens_prompt
77
tokens_completion
804
native_tokens_prompt
93
native_tokens_completion
1120
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.028465
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_01UaZG22ZiLq75sKE3ogtJqF"
is_byok
false
latency
1032
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780227842-NN8MZO9NZk4XuHTASaX0"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1780227842-QSoamYZ3cqJJNNTj5QLN"
upstream_id
"msg_01UaZG22ZiLq75sKE3ogtJqF"
total_cost
0.028465
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
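The reported cost can be cross-checked against the native token counts in the metadata above. The sketch below is a guess at the arithmetic, not the provider's billing logic: the per-million-token rates (5 USD for input, 25 USD for output) are assumptions that do not appear anywhere in this run record. They are used only because they happen to reproduce the reported total_cost, and the function name is hypothetical.

```python
from decimal import Decimal

# Values copied from the run metadata.
NATIVE_TOKENS_PROMPT = 93
NATIVE_TOKENS_COMPLETION = 1120
REPORTED_TOTAL_COST = Decimal("0.028465")

# ASSUMED rates in USD per million tokens; not stated in the run record.
ASSUMED_INPUT_RATE = Decimal("5")
ASSUMED_OUTPUT_RATE = Decimal("25")


def estimate_cost(prompt_tokens, completion_tokens, input_rate, output_rate):
    """Estimate cost in USD from token counts and per-million-token rates."""
    million = Decimal(1_000_000)
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / million


estimated = estimate_cost(
    NATIVE_TOKENS_PROMPT,
    NATIVE_TOKENS_COMPLETION,
    ASSUMED_INPUT_RATE,
    ASSUMED_OUTPUT_RATE,
)
print(estimated, estimated == REPORTED_TOTAL_COST)
```

Under these assumed rates the estimate is driven by the native counts (93 and 1120) rather than the normalized tokens_prompt and tokens_completion values (77 and 804), which would give a lower figure.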
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false, words: 500
0.00% | Dialogue to Total Word Ratio | Ratio: 71.63%, Deviation: 21.63% | neededClean: false, wordsTotal: 504, wordsDialogue: 361
50.0000% | Overall | |
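The ratio and deviation figures follow directly from the word counts in the evaluator metadata. The sketch below reproduces that arithmetic; it is not the evaluator's actual code. The function names are hypothetical, the quote-based dialogue counter is only one plausible way to count dialogue words, and treating the overall score as an unweighted mean of the two evaluator scores is an assumption that happens to match the reported 50%.

```python
import re

TARGET_RATIO = 0.50  # from the scenario: "50% dialogue"


def dialogue_ratio(words_dialogue, words_total):
    """Share of all words that are dialogue, as a fraction."""
    return words_dialogue / words_total


def deviation_points(ratio, target=TARGET_RATIO):
    """Absolute distance from the target, in percentage points."""
    return abs(ratio - target) * 100


def count_dialogue_words(text):
    """Hypothetical counter: words inside straight double quotes."""
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in quoted_spans)


# Counts taken from the evaluator metadata.
ratio = dialogue_ratio(361, 504)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation_points(ratio):.2f}%")
# prints "Ratio: 71.63%, Deviation: 21.63%"

# ASSUMPTION: overall score is the unweighted mean of the evaluator scores.
scores = [100.0, 0.0]
overall = sum(scores) / len(scores)

# Small demonstration of the hypothetical counter on the opening of the response.
sample = '"Behold," he said. "The Henderson residence."'
sample_dialogue = count_dialogue_words(sample)
```

The two evaluators report different totals for the same text (500 words versus 504), which suggests they tokenize differently, so a counter like the one above should not be expected to reproduce the 361 figure exactly.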