NC Bench
Run: 2026-06-01-01KT1HXC49S9CCV3HH2KBJ89E2
Run Details
Model: Claude Opus 4.8 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 19.9s
Cost: $0.028740
Provider: Anthropic
Evaluation summary
Score | Evaluator
81.27% | Matches word count
0.00% | Dialogue to Total Word Ratio
40.6364% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint across the table, smoothing out the wrinkles with the reverence of a surgeon preparing for an operation. The paper was actually a child's birthday invitation, blown up at the copy shop to poster size. "Okay, here's the plan," she said, tapping the glossy cartoon dinosaur in the corner. "The party starts at two. We hit it at two-thirty, peak chaos. Sugar highs, screaming, parents losing their minds." Bill leaned in, squinting. "Why is there a dinosaur on our heist map?" "Because that's the theme, Bill. Keep up." "And the X marks the presents?" "The X marks the *snack table*. The presents are over here." She slid her finger to a hand-drawn rectangle labeled *LOOT*. "Stacked by the gift table. We grab everything, load it into the bouncy castle, and deflate." Bill scratched his head. "We're escaping in a bouncy castle?" "We're escaping *with* the bouncy castle. It folds down to the size of a suitcase. Nobody suspects a man carrying a deflated castle." "That's actually genius," Bill admitted. "But what about the kids? They'll notice their presents vanishing." Molly waved a dismissive hand. "Children have the attention span of goldfish. We introduce a distraction. You release the piñata." "What piñata?" "The piñata I've stuffed with thirty dollars in loose change. The moment that thing bursts, it's a feeding frenzy. Total anarchy. We waltz out the back with our arms full of wrapped boxes." Bill nodded slowly, then frowned. "Wait. What if there's a clown? I hate clowns." "Bill. We are professional thieves. We've cracked vaults, dodged lasers, outrun the police on three continents. You're scared of a clown?" "Clowns are unpredictable," he said defensively. "Vaults don't make balloon animals at you." Molly pinched the bridge of her nose. "Just focus. Your job is the bounce house and the piñata. My job is the presents. We're in and out in ninety seconds." "And the cake?" "Leave the cake." "But it's a dinosaur cake." "Bill." 
"I just think we could grab the cake. As a bonus." Molly stared at him for a long, heavy moment. "We are not stealing a child's birthday cake. We have *some* standards." "We're literally stealing all of their presents." "That's *different*," she snapped. "Presents are reusable. We can fence those. A half-eaten cake is just sad." She rolled up the invitation-blueprint. "Now, do you have the disguises?" Bill reached into a duffel bag and pulled out two enormous foam costumes. "One Triceratops. One Stegosaurus." "Perfect. We blend in with the entertainment, nobody looks twice." She held up the Triceratops head, inspecting it. "Wait, why does mine smell like ranch dressing?" "I wore it to a thing last week. Don't ask." Molly grimaced but pulled it on anyway. Her voice came out muffled. "Synchronize watches. We move at fourteen-thirty hundred hours." Bill checked his wrist. "I don't have a watch." "Then synchronize your *vibe*, Bill. Let's go ruin a child's special day."
created_at: "2026-06-01T12:18:07.12Z" (6/1/2026, 12:18:07 PM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1093
moderation_latency: (null)
generation_time: 19859
tokens_prompt: 77
tokens_completion: 793
native_tokens_prompt: 93
native_tokens_completion: 1131
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.02874
router: (null)
provider_responses:
  0:
    endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
    id: "msg_01W1SPqk9MdMgmVjZXbaPeS4"
    is_byok: false
    latency: 1093
    model_permaslug: "anthropic/claude-4.8-opus-20260528"
    provider_name: "Anthropic"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780316287-qRHouEeiUqbGdAzdn5BP"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1780316287-ulsi8vg4HAulDbawQsdn"
upstream_id: "msg_01W1SPqk9MdMgmVjZXbaPeS4"
total_cost: 0.02874
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
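The cost and duration in the run details can be cross-checked against the token counts in the metadata. The sketch below is only a sanity check, not the site's billing logic: the per-token rates (5 USD per million prompt tokens, 25 USD per million completion tokens) do not appear on this page and are assumed here because they reproduce the reported total, and generation_time is assumed to be in milliseconds because 19859 matches the 19.9s duration.

```python
# Values copied from the run metadata on this page.
NATIVE_TOKENS_PROMPT = 93
NATIVE_TOKENS_COMPLETION = 1131
GENERATION_TIME_MS = 19859  # assumed to be milliseconds
REPORTED_TOTAL_COST = 0.02874

# ASSUMPTION: these rates are not stated on the page. They are chosen
# because they reproduce the reported total for this run.
USD_PER_MILLION_PROMPT = 5
USD_PER_MILLION_COMPLETION = 25


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost in USD from native token counts and the assumed rates."""
    micro_usd = (
        prompt_tokens * USD_PER_MILLION_PROMPT
        + completion_tokens * USD_PER_MILLION_COMPLETION
    )
    return micro_usd / 1_000_000


def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Completion throughput over the reported generation time."""
    return completion_tokens / (generation_time_ms / 1000)


estimated_cost = estimate_cost_usd(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION)
throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)

print(f"estimated cost: ${estimated_cost:.6f}")
print(f"reported cost:  ${REPORTED_TOTAL_COST:.6f}")
print(f"throughput:     {throughput:.1f} native completion tokens/s")
```

Under these assumptions the estimate agrees with the reported total_cost, which suggests billing follows the native token counts rather than the smaller tokens_prompt and tokens_completion figures.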
Evaluation details
Result | Evaluator | Details | Meta Data
81.27% | Matches word count | n/a | neededClean: false, words: 488
0.00% | Dialogue to Total Word Ratio | Ratio: 70.99%, Deviation: 20.99% | neededClean: false, wordsTotal: 493, wordsDialogue: 350
40.6364% | Overall
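The ratio and deviation reported for the dialogue evaluator follow directly from its word counts, and the overall figure is consistent with an unweighted mean of the two evaluator scores. The sketch below recomputes them. The function name and the averaging rule are assumptions for illustration, not the benchmark's documented method, and the mapping from a 20.99% deviation to a 0.00% score is not derivable from this page, so it is not modeled.

```python
def dialogue_ratio_and_deviation(words_dialogue: int, words_total: int,
                                 target: float = 0.5) -> tuple[float, float]:
    """Return the dialogue share of all words and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    deviation = abs(ratio - target)
    return ratio, deviation


# Counts taken from the "Dialogue to Total Word Ratio" row.
ratio, deviation = dialogue_ratio_and_deviation(words_dialogue=350, words_total=493)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# ASSUMPTION: the overall score is the unweighted mean of the evaluator scores.
# The inputs are the rounded percentages shown on the page, so the result can
# differ from the reported 40.6364% in the later decimal places.
evaluator_scores = [81.27, 0.00]
overall = sum(evaluator_scores) / len(evaluator_scores)
print(f"Overall (mean of displayed scores): {overall:.3f}%")
```

The two evaluators report different totals (488 and 493 words), so they presumably tokenize the response slightly differently. The sketch uses each evaluator's own counts rather than reconciling them.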