NC Bench
Run: 2026-05-31-01KSYXM3EV5XTCY5QNZHCNAPM3
Run Details
Model:
Claude Opus 4.8 (Reasoning, High)
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
20.6s
Cost:
$0.028515
Provider:
Anthropic
Evaluation summary
Score      Evaluator
99.92%     Matches word count
0.12%      Dialogue to Total Word Ratio
50.0220%   Overall
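The Overall figure is consistent with an unweighted mean of the two evaluator scores. This is an inference from the displayed numbers, not a documented formula, so the sketch below treats the aggregation rule as an assumption; `overall_score` is a hypothetical helper name.

```python
# Evaluator scores as displayed in the summary (percentages, already rounded).
scores = {
    "Matches word count": 99.92,
    "Dialogue to Total Word Ratio": 0.12,
}


def overall_score(per_evaluator: dict[str, float]) -> float:
    """Unweighted mean of the evaluator scores (ASSUMED aggregation rule)."""
    return sum(per_evaluator.values()) / len(per_evaluator)


overall = overall_score(scores)
print(f"{overall:.2f}%")
```

The result agrees with the reported 50.0220% to two decimal places. The small difference in the later digits would be explained if the page averages the unrounded scores, which are not shown.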
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crayon-drawn map across the table and tapped it with authority. "Okay, Bill, focus. The target is a seventh birthday party. Theme is, and I quote the invitation we intercepted, 'Dinosaurs and Princesses.'" "That's two themes," Bill said. "That's chaos. No commitment." "Stay on mission. The presents will be stacked on a table here, by the bouncy castle." "A bouncy castle." Bill leaned back, crossing his arms. "Molly, I don't like bouncy castles. They're unstable. Untrustworthy. My cousin lost a tooth in one." "Nobody is asking you to enter the bouncy castle." "I'm just establishing my boundaries." "Noted. Now, the way I see it, we go in dressed as entertainers. You'll be the magician, I'll be the face painter." Bill's eyebrows shot up. "Why do you get to be the face painter?" "Because you can't draw, Bill. You tried to sketch the getaway route and it looked like a sad worm." "That worm had ambition." "The point is, while you distract the children with card tricks, I grab the presents and load them into the van disguised as the ice cream truck." "Wait." Bill held up a hand. "If we're an ice cream truck, the kids are going to want ice cream." Molly paused. "We give them ice cream. It enhances the cover." "With what money? We're stealing presents because we're broke!" "It's an investment, Bill." "You can't run a heist and a small business simultaneously!" "Lower your voice." Molly glanced at the window. "Now, the only real obstacle is the mother. Her name is Karen—" Bill snorted. "Of course it is." "—and she runs that household like a military operation. Snacks rotated every fifteen minutes. Goody bags inventoried. She will notice if presents go missing." "So what's the plan?" "We replace them. We bring decoy presents. Empty boxes wrapped beautifully. By the time anyone opens them, we're three counties away, eating gas-station nachos like kings." Bill considered this. "That's actually... not terrible." "Don't sound so surprised." 
"I'm just saying, usually your plans involve more crying. Mine or yours." "This plan is foolproof." Molly straightened. "There is only one variable I can't control." "Which is?" "The clown." Bill froze. "There's a clown?" "Karen hired a professional. Goes by 'Mr. Giggles.' Allegedly very territorial about his audience." "Molly, I am not getting into a turf war with a clown." "You won't have to, as long as you stick to balloon animals and don't oversell the magic." "What if he challenges me?" "Then you challenge him back. You're a professional. Sort of." Bill stood, suddenly resolute. "You know what? Fine. Let's do it. For the presents. For the nachos." "For the nachos," Molly agreed. "One question." "Yes?" "Can I keep one present? Just one. The big shiny one." Molly stared at him for a long moment. "You want a child's toy." "I want to feel something, Molly." She sighed, rolling up the crayon map. "Fine. One present. But if it's a doll, I'm telling everyone." "Deal."
created_at
"2026-05-31T11:45:02.954Z"
5/31/2026, 11:45:02 AM
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1125
moderation_latency
(null)
generation_time
20598
tokens_prompt
77
tokens_completion
818
native_tokens_prompt
93
native_tokens_completion
1122
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.028515
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_01YNCqc1EfGS45oDUiSnRFLy"
is_byok
false
latency
1122
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780227902-lbQvejo7C88tk5cqEbGk"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1780227902-b8zRWOu8bQujAPo2e820"
upstream_id
"msg_01YNCqc1EfGS45oDUiSnRFLy"
total_cost
0.028515
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
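The reported cost can be cross-checked against the native token counts in the metadata above. The per-token prices in this sketch are assumptions chosen because they reproduce the reported total; the page does not state them. Treating `generation_time` as milliseconds is also an inference, based on its agreement with the 20.6s duration.

```python
# Values copied from the run metadata.
native_tokens_prompt = 93
native_tokens_completion = 1122
generation_time_ms = 20598  # ASSUMED unit: milliseconds (matches "Duration: 20.6s")
total_cost_usd = 0.028515

# ASSUMED prices in USD per million tokens; inferred, not stated on the page.
PRICE_IN_PER_MTOK = 5.00
PRICE_OUT_PER_MTOK = 25.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Cost in USD from token counts and the assumed per-million-token prices."""
    micro_usd = (prompt_tokens * PRICE_IN_PER_MTOK
                 + completion_tokens * PRICE_OUT_PER_MTOK)
    return micro_usd / 1_000_000


estimated = estimate_cost(native_tokens_prompt, native_tokens_completion)
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"estimated cost: ${estimated:.6f} (reported: ${total_cost_usd:.6f})")
print(f"throughput: {tokens_per_second:.1f} completion tokens/s")
```

Under these assumptions the estimate matches `total_cost`, which suggests billing is based on the native token counts (93 and 1122) rather than the normalized `tokens_prompt` and `tokens_completion` values (77 and 818).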
Evaluation details
Result     Evaluator                      Details                            Meta Data
99.92%     Matches word count             n/a                                neededClean: false, words: 497
0.12%      Dialogue to Total Word Ratio   Ratio: 81.56%, Deviation: 11.56%   neededClean: false, wordsTotal: 499, wordsDialogue: 407
50.0220%   Overall
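The ratio and deviation in the details follow from the `wordsDialogue` and `wordsTotal` counts and the 70% target in the scenario. A minimal sketch of that arithmetic is below. The quote-based counter `dialogue_words` is a hypothetical stand-in for the evaluator's counting rule, which the page does not describe, and the mapping from deviation to the 0.12% score is not reproduced because it is not shown.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: "70% dialogue"


def count_words(text: str) -> int:
    """Whitespace-delimited word count."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (HYPOTHETICAL counting rule)."""
    return sum(count_words(span) for span in re.findall(r'"([^"]*)"', text))


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_RATIO) -> tuple[float, float]:
    """Dialogue share of all words and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts reported in the evaluator metadata.
ratio, deviation = ratio_and_deviation(407, 499)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 81.56%, Deviation: 11.56%"

# The counter applied to a short sample line.
sample = '"Stay on mission," Molly said. "Now."'
print(dialogue_words(sample), "of", count_words(sample), "words are dialogue")
```

The response overshoots the target: roughly 82% of its words are dialogue against a requested 70%, which is why this evaluator scores near zero even though the word count is almost exact.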