NC Bench
Run: 2026-06-01-01KT1GBC9J3DZMYTHXJ9X97VZ8
Run Details
Model:
Claude Opus 4.8 (Reasoning, Low)
Test:
Dialogue tags
Scenario:
Write 500 words with 50% dialogue
Duration:
20.7s
Cost:
$0.029940
Provider:
Anthropic
Evaluation summary
Score	Evaluator
99.38%	Matches word count
0.00%	Dialogue to Total Word Ratio
49.6885%	Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crayon-drawn map across the table, smoothing out the wrinkles with surgical precision. "Okay, Bill, focus. The target is here." She jabbed a finger at a lopsided rectangle. "The gift table. Northwest corner of the living room." Bill squinted at the paper. "Why is the dog drawn so big? It's the size of a horse." "Because Timmy's seven and has no concept of scale. The dog is a Chihuahua named Sprinkles." "Sprinkles," Bill repeated, nodding gravely. "We can take Sprinkles." "We are not 'taking' Sprinkles. We're professionals." Molly leaned back, crossing her arms. "Now, the party starts at two. Magician arrives at two-thirty. That's our window." "Why the magician?" "Because every parent in that room will be watching a man pull a rabbit out of a hat. Nobody watches the gifts when there's a rabbit involved." Bill scratched his chin. "I could be the magician." "You cannot be the magician." "I know a card trick." "You know half of a card trick, and you always reveal the wrong card." "That's the twist," Bill said defensively. "It's avant-garde." Molly pinched the bridge of her nose. "Stay on mission. You'll be dressed as a clown. Clowns can wander freely. Nobody questions a clown carrying a giant sack." "Clowns terrify me." "That's the point. Fear creates distance. Distance creates opportunity." Bill considered this, then brightened. "What do you go as?" "The aunt nobody recognizes. Every family has one. I'll blend in, supervise the operation, and signal you when the cake comes out." "Why the cake?" "Because," Molly said slowly, as though explaining gravity, "when the candles are lit, the lights go off. Eight seconds of darkness while thirty people sing off-key. In that window, you sweep the table. Everything goes in the sack." Bill grinned. "And we walk out the back." "We walk out the back. Casual. Confident. A clown and an aunt. Beautiful." "What if a kid sees me?" "Honk your nose and run. Children respect chaos." Bill nodded, then frowned again. 
"Molly, I have to ask. Are we sure about this? They're, you know. Kids' presents. Coloring books. Maybe a scooter." Molly's eyes gleamed. "Bill. My sources indicate the centerpiece gift is a limited-edition gaming console. Three of them, actually. Grandparents are competing." Bill's jaw dropped. "Three?" "Three. We're not stealing childhood, Bill. We're redistributing electronics." "That's beautiful when you say it like that." "I know." Molly rolled up the map. "Synchronize watches." "I don't have a watch." "Then synchronize your phone." "It's dead." Molly stared at him for a long, painful moment. "How are you the best partner I could find?" "You said I was the only partner you could afford." "Same thing." She stood, slinging a duffel bell over her shoulder. "Get the clown suit. We strike at two. And Bill?" "Yeah?" "Whatever happens in there—the games, the juice boxes, the inflatable castle—do not get distracted." Bill saluted solemnly. "I will not bounce in the castle." "Swear it." "I swear." He paused. "Probably." Molly sighed and headed for the door.
created_at
"2026-06-01T11:50:48.893Z"
6/1/2026, 11:50:48 AM
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1141
moderation_latency
(null)
generation_time
20656
tokens_prompt
77
tokens_completion
837
native_tokens_prompt
93
native_tokens_completion
1179
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.02994
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_014FgbD7MYpSvDLXLcz9MBaG"
is_byok
false
latency
1141
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780314648-aNqk6egbaFZVsvtrgr8b"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1780314648-GBviQy4Vu277C7aC1dTX"
upstream_id
"msg_014FgbD7MYpSvDLXLcz9MBaG"
total_cost
0.02994
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
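The usage fields above can be cross-checked against each other. The sketch below is only illustrative: the per-token rates are assumptions chosen because they reproduce the listed total_cost, and they are not stated anywhere on this page. The token counts, generation time, and cost are copied from the metadata.

```python
# Values copied from the run metadata above.
native_tokens_prompt = 93
native_tokens_completion = 1179
generation_time_ms = 20656
total_cost = 0.02994

# ASSUMPTION: USD per million tokens. These rates are not given on the page;
# they are hypothetical values that happen to reproduce the listed cost.
ASSUMED_INPUT_RATE = 5.0
ASSUMED_OUTPUT_RATE = 25.0


def estimate_cost(prompt_tokens, completion_tokens, in_rate, out_rate):
    """Return the estimated cost in USD for per-million-token rates."""
    return (prompt_tokens * in_rate + completion_tokens * out_rate) / 1_000_000


estimated = estimate_cost(
    native_tokens_prompt,
    native_tokens_completion,
    ASSUMED_INPUT_RATE,
    ASSUMED_OUTPUT_RATE,
)

# Completion throughput over the reported generation time (about 57 tokens/s).
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"estimated cost: ${estimated:.6f} (listed: ${total_cost:.6f})")
print(f"throughput: {tokens_per_second:.1f} completion tokens/s")
```

Under these assumed rates the estimate agrees with the listed cost, which suggests billing follows the native token counts (93 and 1179) rather than the normalized tokens_prompt and tokens_completion values.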
Evaluation details
Result: 99.38%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 505

Result: 0.00%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 77.84%, Deviation: 27.84%
Meta Data: neededClean: false, wordsTotal: 510, wordsDialogue: 397

Overall: 49.6885%
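The reported ratio and deviation can be reproduced from the metadata, and the overall figure looks like the unweighted mean of the two evaluator scores. The sketch below makes both points under stated assumptions: the evaluator's real tokenizer and scoring curve are not shown on this page, so `dialogue_ratio` is a hypothetical helper that treats text inside straight double quotes as dialogue, and the mean-of-scores rule is inferred from the numbers rather than documented.

```python
import re

# Values copied from the evaluation details above.
words_total = 510
words_dialogue = 397
target_ratio = 0.50  # the prompt asks for 50% dialogue

ratio = words_dialogue / words_total
deviation = abs(ratio - target_ratio)

# ASSUMPTION: the overall score is the plain average of the evaluator scores.
scores = [99.38, 0.00]
overall = sum(scores) / len(scores)

print(f"ratio: {ratio:.2%}, deviation: {deviation:.2%}, overall: {overall:.2f}%")


def dialogue_ratio(text):
    """Hypothetical counter: share of whitespace-separated words in double quotes."""
    total = len(text.split())
    quoted = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(segment.split()) for segment in quoted)
    return dialogue, total


# Excerpt taken from the model response.
excerpt = 'Bill scratched his chin. "I could be the magician."'
dialogue, total = dialogue_ratio(excerpt)
print(dialogue, total)
```

The averaged value (49.69%) agrees with the listed 49.6885% to within the rounding of the displayed word-count score. A deviation of about 28 points from the 50% target is what drives the dialogue evaluator to 0.00%, even though the word count itself is nearly exact.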