NC Bench
Run: 2026-05-29-01KSSKB67VP6PKSGVPBM29J4M0
Run Details
Model: Claude Opus 4.8 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 20.6s
Cost: $0.027165
Provider: Anthropic
Evaluation summary
Score | Evaluator | Details Link
99.98% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
49.9920% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The blueprints spread across the kitchen table weren't technically blueprints. They were a hand-drawn map of the Hendersons' backyard, complete with stick figures and a suspiciously detailed rendering of a bounce castle. Molly tapped the paper with a chewed pencil. "The bounce castle is the key. Nobody watches their kid when there's a bounce castle." Bill leaned over her shoulder, squinting. He smelled faintly of the bologna sandwich he'd eaten twenty minutes ago and refused to throw away the wrapper of. "And the presents?" "Gift table. Northeast corner. Right next to the juice boxes." "How many presents we talking?" "It's a seventh birthday party, Bill. Could be thirty kids. That's thirty presents, minimum, plus whatever the grandparents brought. Those old people go overboard. Last year my nephew got a savings bond and a kayak." Bill straightened up, his face a portrait of moral struggle. "A kayak. For a kid." "Focus." He paced the length of the tiny kitchen, which took exactly three steps. "I just don't love stealing from children, Mol. They're so small. And honest. A kid looks at you with those big eyes and you just *know* things about yourself." Molly set down her pencil with the patience of a saint who'd been pushed slightly too far. "We are not stealing from children. We're stealing from the *concept* of children's parties, which is a corrupt institution designed to make adults buy plastic garbage." "That's a stretch." "It's a philosophy." Bill grabbed an apple from the counter and pointed it at her like a weapon. "What's our cover? We can't just walk in. We're forty-three years old. We don't have a child." "We rent one." The apple froze halfway to his mouth. "We *what?*" "Relax, I don't mean literally. My sister owes me a favor. We borrow little Tyler, we waltz in like normal parents, we mingle, we eat some cake, and during the piñata frenzy, we load up the gift bags and stroll out the back gate." 
Bill chewed his apple thoughtfully, which was unusual for him because he typically swallowed food whole like a snake. "The piñata frenzy." "It's chaos, Bill. Pure, candy-fueled chaos. Twenty kids swinging a bat at a cardboard donkey while sugar courses through their veins. No adult is paying attention to anything except keeping their offspring alive." "And Tyler? He just plays along?" "Tyler's six. I'll give him a lollipop. He'd help us bury a body for a lollipop." Bill set the apple core down on her map, directly over the bounce castle. Molly stared at it, then at him, then at the universe in general. "There's just one problem," he said. "What." "I already RSVP'd to this party. For real. The Hendersons are my cousins." Molly closed her eyes and breathed in slowly through her nose, the way her therapist had taught her before she stopped going to her therapist to pursue crime full-time. "So we're not stealing from strangers," she said. "We're stealing from family." Bill grinned. "Now *that* I can get behind."
created_at: "2026-05-29T10:09:12.967Z" (5/29/2026, 10:09:12 AM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1303
moderation_latency: (null)
generation_time: 20545
tokens_prompt: 77
tokens_completion: 788
native_tokens_prompt: 93
native_tokens_completion: 1068
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.027165
router: (null)
provider_responses:
  0:
    endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
    id: "msg_017FdnDUBJhHujERkZpXNEKw"
    is_byok: false
    latency: 1303
    model_permaslug: "anthropic/claude-4.8-opus-20260528"
    provider_name: "Anthropic"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780049352-puENASzNzOkW5Bycjufc"
session_id: (null)
api_type: "completions"
id: "gen-1780049352-QP0LtDjSUOeIDzBxfrrq"
upstream_id: "msg_017FdnDUBJhHujERkZpXNEKw"
total_cost: 0.027165
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
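The logged cost can be sanity-checked against the native token counts. The sketch below is only an illustration: the per-million-token rates (5 USD for input, 25 USD for output) are an assumption and are not stated anywhere on this page. They are used here because they happen to reproduce the logged total_cost from native_tokens_prompt and native_tokens_completion.

```python
# Hypothetical rates in USD per million tokens. These are NOT given in the
# run metadata; they are assumed because they reproduce the logged cost.
ASSUMED_INPUT_USD_PER_MTOK = 5
ASSUMED_OUTPUT_USD_PER_MTOK = 25

# Values copied from the run metadata.
native_tokens_prompt = 93
native_tokens_completion = 1068
logged_total_cost = 0.027165


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate cost from token counts under the assumed rates."""
    # Work in integer micro-dollars first to avoid floating-point drift.
    micro_usd = (prompt_tokens * ASSUMED_INPUT_USD_PER_MTOK
                 + completion_tokens * ASSUMED_OUTPUT_USD_PER_MTOK)
    return micro_usd / 1_000_000


estimated = estimate_cost_usd(native_tokens_prompt, native_tokens_completion)
matches_log = abs(estimated - logged_total_cost) < 1e-9
print(f"estimated={estimated:.6f} logged={logged_total_cost:.6f} match={matches_log}")
```

Under these assumed rates, the estimate is driven by the native token counts (93 and 1068) rather than the normalized tokens_prompt and tokens_completion values (77 and 788).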
Evaluation details
Result | Evaluator | Details | Meta Data
99.98% | Matches word count | n/a | neededClean: false, words: 502
0.00% | Dialogue to Total Word Ratio | Ratio: 61.26%, Deviation: 31.26% | neededClean: false, wordsTotal: 506, wordsDialogue: 310
49.9920% | Overall | |
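The reported ratio and deviation can be reproduced from the evaluator metadata. The sketch below assumes that the ratio is wordsDialogue divided by wordsTotal, that the deviation is the absolute distance from the 30% target in the scenario, and that the overall score is the unweighted mean of the two evaluator scores. None of these formulas is stated on the page; they are inferred because they match the displayed numbers. The function names are hypothetical, and the mapping from a 31.26% deviation to a 0.00% score is not reconstructed here.

```python
TARGET_DIALOGUE_PCT = 30.0  # from the scenario: "30% dialogue"


def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Dialogue words as a percentage of all words."""
    return 100.0 * words_dialogue / words_total


def deviation_pct(ratio_pct: float, target_pct: float = TARGET_DIALOGUE_PCT) -> float:
    """Absolute distance between the measured ratio and the target."""
    return abs(ratio_pct - target_pct)


def overall_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores (an assumption, not confirmed)."""
    return sum(scores) / len(scores)


# Values copied from the evaluation details.
ratio = dialogue_ratio_pct(words_dialogue=310, words_total=506)
deviation = deviation_pct(ratio)

# The page shows 99.98% for word count; an overall of 49.9920% implies an
# unrounded value near 99.984% if the overall is a plain mean.
overall = overall_score([99.984, 0.0])

print(f"ratio={ratio:.2f}% deviation={deviation:.2f}% overall={overall:.4f}%")
```

Note that the two evaluators report different word counts for the same response (502 versus 506), so they presumably tokenize the text differently. The ratio above uses the 506 figure from the dialogue evaluator.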