NC Bench
Run: 2026-06-01-01KT1HGR37BNSEQ8P2A5CBHZQY
Run Details
Model: Claude Opus 4.8 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 21.2s
Cost: $0.030315
Provider: Anthropic
Evaluation summary
| Score | Evaluator |
|---|---|
| 100.00% | Matches word count |
| 0.00% | Dialogue to Total Word Ratio |
| 49.9995% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprints across the kitchen table, smoothing out the wrinkles with the reverence of a surgeon preparing for a transplant. Bill leaned over her shoulder, chewing a toothpick. "So this is the layout?" he asked. "This is it. The Hendersons' backyard. Birthday party for little Timmy, who turns seven on Saturday." Bill squinted at the paper. "Molly, this is a child's drawing of a unicorn." "It's a *map* of a unicorn. The presents are stacked right here, near the horn." She tapped the glittery paper. "Premium real estate." "And how exactly did you obtain this intel?" "I bribed Timmy's older sister with a candy bar. Kid drives a hard bargain. Wanted two." Bill straightened up, pacing the small kitchen. "Let me get this straight. We're professional thieves. We've cracked vaults. We robbed a casino in Monaco. And now we're stealing presents from a seven-year-old?" "Think about it," Molly said, eyes gleaming. "Cash, gift cards, the latest gaming consoles. Parents go all out these days. There's gotta be three grand in loot sitting on that lawn." "Three grand," Bill repeated, scratching his chin. "And the security?" "A clown named Bubbles, two dozen sugar-high kids, and Mr. Henderson, who I'm told falls asleep in his lawn chair by two." "So our biggest threat is a clown." "Don't underestimate Bubbles. I've seen his act. Man can juggle chainsaws." Bill paled slightly. "Chainsaws?" "Balloon chainsaws. But the commitment is terrifying." He sat down across from her, finally intrigued. "Alright. What's the plan? How do we get in?" Molly grinned and pulled a paper bag onto the table. From it she produced two enormous, polyester costumes—one a sad-looking dinosaur, the other a moth-eaten bunny rabbit. "Absolutely not," Bill said immediately. "You'll be the bunny. Kids love bunnies. You blend in, work the crowd, and when the moment's right—" "When the moment's right, I'm having a nervous breakdown inside a rabbit suit." "Bill." 
She placed a hand on his furry-future shoulder. "We waltz in as entertainment. Nobody questions the entertainment. We dance, we wave, we slowly back the van up to the unicorn's horn, and we load up while Bubbles distracts everyone with his act." "And if someone asks who hired us?" "You're allergic to questions. You're a method performer. You only speak Dinosaur." "I thought I was the bunny." "You're a bunny who *identifies* as a dinosaur. It's very progressive. Confuses everyone." Bill stared at the costume, then at the unicorn map, then at Molly's hopeful, criminal face. He sighed the sigh of a man who knew his dignity was already lost. "Three grand?" "Minimum." "And we split it fifty-fifty?" "Sixty-forty. I drew up the unicorn." "You bought a child's drawing for a candy bar." "*Two* candy bars, Bill. I made sacrifices." He picked up the bunny costume, examining a suspicious stain near the collar. "Fine. But if Bubbles comes at me with those balloon chainsaws, I'm using a toddler as a shield." Molly beamed. "Now you're thinking like a professional."
created_at: "2026-06-01T12:11:13.387Z"
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1441
moderation_latency: (null)
generation_time: 21167
tokens_prompt: 77
tokens_completion: 816
native_tokens_prompt: 93
native_tokens_completion: 1194
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.030315
router: (null)
provider_responses:
  0:
    endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
    id: "msg_01UDpjZ9eEYpKf4D3UqoQ2R5"
    is_byok: false
    latency: 1432
    model_permaslug: "anthropic/claude-4.8-opus-20260528"
    provider_name: "Anthropic"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780315873-ZMC5d1WDD2EMp4QcbWGM"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1780315873-P1m6CHUwZWYDD3gXhzOU"
upstream_id: "msg_01UDpjZ9eEYpKf4D3UqoQ2R5"
total_cost: 0.030315
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
Evaluation details
| Result | Evaluator | Details | Meta Data |
|---|---|---|---|
| 100.00% | Matches word count | n/a | neededClean: false; words: 499 |
| 0.00% | Dialogue to Total Word Ratio | Ratio: 69.23%, Deviation: 19.23% | neededClean: false; wordsTotal: 507; wordsDialogue: 351 |
| 49.9995% | Overall | | |
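The ratio and deviation reported for the dialogue evaluator follow directly from the two word counts in its metadata (351 dialogue words out of 507 total, against a 50% target). The sketch below is only an illustration of that arithmetic, not the benchmark's actual evaluator: the function names `dialogue_ratio` and `summarize` are hypothetical, and the rules "dialogue is anything between double quotes" and "a word is a whitespace-separated token" are assumptions. How the deviation is mapped to the 0.00% score is not shown on the page, so it is left out.

```python
import re


def summarize(words_dialogue: int, words_total: int, target: float = 0.5) -> dict:
    """Turn raw word counts into a ratio and its absolute deviation from target."""
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio_pct": round(ratio * 100, 2),
        "deviation_pct": round(abs(ratio - target) * 100, 2),
    }


def dialogue_ratio(text: str, target: float = 0.5) -> dict:
    """Estimate the share of words that fall inside double-quoted spans.

    Assumption: dialogue is whatever sits between a pair of double quotes,
    and a word is any whitespace-separated token. The benchmark's own
    counting rules are not shown on the run page and may differ.
    """
    # Treat curly quotes the same as straight ones.
    normalized = text.replace("\u201c", '"').replace("\u201d", '"')
    words_total = len(normalized.split())
    quoted_spans = re.findall(r'"([^"]*)"', normalized)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    return summarize(words_dialogue, words_total, target)


# Reproduce the reported figures from the counts in the evaluator metadata.
reported = summarize(351, 507)
print(reported["ratio_pct"], reported["deviation_pct"])  # prints: 69.23 19.23
```

Calling `dialogue_ratio` on the full response text may not return exactly 351 and 507, since the page does not document how the evaluator tokenizes words or detects dialogue.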