NC Bench
Run: 2026-06-01-01KT1G1JEV55B3T1P3TB5DZFBV
Run Details
Model: Claude Opus 4.8 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 20.6s
Cost: $0.029790
Provider: Anthropic
Evaluation summary
Score | Evaluator | Details Link
99.98% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
49.9921% | Overall |
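The page does not show how the Dialogue to Total Word Ratio evaluator counts words. The following is a minimal sketch of one way such a ratio could be measured, assuming dialogue means any text between double quotes and words are whitespace-separated tokens. The function name `dialogue_ratio` and the sample text are hypothetical, and the real evaluator may tokenize differently.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a prose scene.

    Assumption: dialogue is whatever sits between a pair of double quotes
    (straight or curly). This is an illustrative rule, not the benchmark's.
    """
    quoted_spans = re.findall(r'["\u201c]([^"\u201d]*)["\u201d]', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    total_words = len(text.split())
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# Hypothetical sample built from two short lines of the scene.
sample = 'Molly rolled up the map. "Any questions?" Bill hesitated. "Just one."'
d, t, r = dialogue_ratio(sample)
print(d, t, round(r, 4))
```

Splitting on whitespace keeps the sketch short, but it treats hyphenated and dash-joined words as single tokens, which is one reason two evaluators can report different totals for the same text.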
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crayon-drawn map across the table and tapped it with her pen. "Okay, Bill, focus. The target is the Henderson residence. Tommy Henderson turns seven on Saturday." "Seven," Bill repeated. "What kind of monster steals from a seven-year-old?" "A professional. Now, intel suggests we're looking at twenty, maybe thirty wrapped presents." "That's a lot of presents." "That's a *score*, Bill. You think those kids are unwrapping socks? We're talking premium merchandise. Action figures. That dancing robot dog everyone's losing their minds over." Bill leaned forward. "The robot dog? The one with the laser eyes?" "The very same." "I'm in. But how do we get past security?" Molly raised an eyebrow. "Security is one mom named Karen and a clown she hired off the internet." "Clowns are unpredictable, Molly. I knew a guy, Tickles, did three years for assault with a seltzer bottle." "We'll handle the clown. Here's the plan. We arrive disguised as guests. You're an uncle. I'm a children's entertainer." "What kind of entertainer?" "Balloon animals." Bill folded his arms. "I can't make balloon animals." "You don't have to *make* them, you just have to look like you *might*. Confidence, Bill. Nobody questions a person holding balloons." "And then?" "At precisely two o'clock, they cut the cake. Everyone gathers around. That's our window. We slip into the den, load the presents into garbage bags, and walk out the back like we own the place." "Garbage bags? Classy." "You want monogrammed luggage at a kid's party? We blend in. People throw out wrapping paper constantly." Bill scratched his chin. "What if a kid catches us?" "Then you improvise. You're the fun uncle, remember? Distract them." "How?" "I don't know, Bill! Pull a quarter out of their ear. Tell them a story. Pretend the floor is lava." "The floor is lava," Bill murmured. "Classic." "Now, the extraction vehicle. The minivan's parked out front, side door open, engine running." "We stole a minivan?" 
"We *rented* a minivan. We have standards. Stickers on the back. Little stick-figure family. Two parents, three kids, a dog." "Nice touch." "I think of everything." Molly rolled up the map. "Any questions?" Bill hesitated. "Just one. The cake. Is there gonna be cake?" "Bill." "Because if there's cake, and we're already there—" "We are not eating the cake." "A small piece. For cover. So we look natural." Molly pinched the bridge of her nose. "Fine. One piece. *After* the presents are secured." "Yes!" "But if you get frosting on the merchandise, I swear—" "I'll be a professional, Molly. Scout's honor." "You were never a scout." "I stole a scout's uniform once. Counts for something." Molly stood, slinging a bag over her shoulder. "Saturday. Two o'clock. Be there, be disguised, and for the love of everything, don't actually pop the balloons." "What if a kid asks for a giraffe?" "Then you tell them," Molly said, opening the door, "that the giraffe is on backorder." She walked out. Bill stared at the balloons. "I really hope there's cake," he whispered.
created_at
"2026-06-01T11:45:27.52Z"
6/1/2026, 11:45:27 AM
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1098
moderation_latency
(null)
generation_time
20535
tokens_prompt
77
tokens_completion
842
native_tokens_prompt
93
native_tokens_completion
1173
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.02979
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_01Rifj62xCRnc2iosVL4rr72"
is_byok
false
latency
1098
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780314327-hvpJltCTek3EAt9Tbhj3"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1780314327-Q4xl0C2LKIZocGm3Wa4K"
upstream_id
"msg_01Rifj62xCRnc2iosVL4rr72"
total_cost
0.02979
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
Evaluation details
Result | Evaluator | Details | Meta Data
99.98% | Matches word count | n/a | neededClean: false, words: 502
0.00% | Dialogue to Total Word Ratio | Ratio: 85.97%, Deviation: 15.97% | neededClean: false, wordsTotal: 506, wordsDialogue: 435
49.9921% | Overall | |
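The reported ratio and deviation can be reproduced from the metadata above. The sketch below uses the `wordsDialogue` and `wordsTotal` values from this run and the 70% target from the scenario. Treating the overall score as an unweighted mean of the two evaluator scores is an assumption, not something the page states. Under that assumption, the displayed 49.9921% implies an unrounded word count score of about 99.9842%.

```python
TARGET_RATIO = 0.70      # from the scenario: 70% dialogue
words_dialogue = 435     # wordsDialogue reported by the evaluator
words_total = 506        # wordsTotal reported by the evaluator

ratio = words_dialogue / words_total
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Assumption: overall = (word_count_score + dialogue_score) / 2.
# Solving for the unrounded word count score from the displayed overall.
overall_pct = 49.9921
dialogue_score_pct = 0.0
implied_word_count_score_pct = 2 * overall_pct - dialogue_score_pct
print(f"Implied word count score: {implied_word_count_score_pct:.4f}%")
```

The two evaluators report different totals for the same response (502 words versus 506), so they appear to count words differently. The arithmetic above uses each evaluator's own figures.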