NC Bench
Run: 2026-05-29-01KSSJK9CA1AF063G976RQHH08
Run Details
Model: Claude Opus 4.8 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 21.0s
Cost: $0.028940
Provider: Anthropic
Evaluation summary
Score | Evaluator
98.71% | Matches word count
0.37% | Dialogue to Total Word Ratio
49.5401% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crayon-drawn map across the table, smoothing out the wrinkles with reverence. "Okay, Bill, walk me through it one more time."
"We've been over this fourteen times."
"And we'll go over it fifteen. Where's the entry point?"
Bill jabbed a finger at the paper. "Bouncy castle. We come in through the deflation valve, roll across the lawn, and pop up behind the juice table."
"And the presents?"
"Stacked on the gift table. North wall. Right next to the piñata."
Molly narrowed her eyes. "The piñata's a liability. Too many kids with sticks."
"We avoid the piñata."
"You can't *avoid* the piñata, Bill. It's a war zone over there. Last job, Donny took a candy-cane bat to the kneecap and limped for a month."
"That was a Christmas party."
"The principle stands."
Bill sighed and leaned back. "Fine. We time our approach for the cake. Once they cut that cake, every kid in a ten-foot radius is hypnotized. We grab the loot and we're gone before they've licked their forks."
"What about the clown?"
"What clown?"
Molly tapped a sinister stick-figure scribble in the corner. "Mr. Wiggles. Hired entertainment. He works the room, balloon animals, the whole bit. He's mobile. Unpredictable."
"So we take out the clown."
"We are *not* taking out the clown."
"I didn't mean—" Bill pinched the bridge of his nose. "I meant we distract him. You request a balloon giraffe, he gets to twisting, and that buys me thirty seconds at the gift table."
"And if he makes a balloon sword?"
"Why would he make a sword?"
"Clowns are erratic, Bill. You don't know his mind."
"Nobody knows his mind! That's the nature of clowns!"
Molly held up her hands. "All I'm saying is we need a contingency. What if a parent spots us?"
"We're dressed as parents."
"You're wearing a ski mask."
Bill paused. "I thought that added mystery."
"It adds a *felony*, Bill. Take the mask. Wear the cardigan. Blend in. Sip a juice box. Compliment someone's potato salad."
"I hate potato salad."
"Then *pretend*, like a professional."
Bill grumbled and pulled the cardigan from his bag. "What's the extraction look like?"
"We load the presents into the gift bags they hand out at the door. Nobody questions a guest leaving with a goodie bag."
"That's..." Bill blinked. "That's actually brilliant."
"I have my moments." Molly rolled up the map. "One more thing. The birthday kid."
"What about him?"
"Her. Penelope. Turning seven. She's the target's daughter, and she does *not* miss things. Eyes like a hawk. If she clocks us, the whole operation collapses."
"So we keep her busy."
"How?"
Bill grinned slowly. "We let her win at musical chairs."
Molly stared at him. For a long moment, neither spoke. "Bill," she said quietly, "that might be the smartest thing you've ever said."
"I have my moments too."
She slung the bag over her shoulder. "Grab the cardigan. We've got a party to ruin."
"For the record," Bill said, following her out, "I still don't trust that clown."
created_at: "2026-05-29T09:56:09.748Z" (5/29/2026, 9:56:09 AM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1176
moderation_latency: (null)
generation_time: 20965
tokens_prompt: 77
tokens_completion: 814
native_tokens_prompt: 93
native_tokens_completion: 1139
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.02894
router: (null)
provider_responses
0
endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id: "msg_01QSx1yMigcJTaJiF33e1FdD"
is_byok: false
latency: 1176
model_permaslug: "anthropic/claude-4.8-opus-20260528"
provider_name: "Anthropic"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780048569-Q3snNBCdCixRpuSjRbid"
session_id: (null)
api_type: "completions"
id: "gen-1780048569-xYIgy8Z1bMQpuOkn2GRc"
upstream_id: "msg_01QSx1yMigcJTaJiF33e1FdD"
total_cost: 0.02894
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
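The timing and token fields in the metadata above can be combined into an approximate generation throughput. The sketch below assumes that generation_time and latency are reported in milliseconds (generation_time of 20965 is consistent with the 21.0s duration shown in Run Details) and that native_tokens_completion is the relevant output count; the helper name tokens_per_second is hypothetical and not part of NC Bench or any provider API.

```python
# Values copied from the run metadata on this page.
# Assumption: both timing fields are in milliseconds.
generation_time_ms = 20965
latency_ms = 1176
native_tokens_completion = 1139


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Average number of completion tokens produced per second."""
    return tokens / (elapsed_ms / 1000)


throughput = tokens_per_second(native_tokens_completion, generation_time_ms)
latency_share = latency_ms / generation_time_ms

print(f"Throughput: {throughput:.1f} tokens/s")
print(f"Latency as a share of generation time: {latency_share:.1%}")
```

This is only a rough average over the whole request; it says nothing about how evenly tokens were streamed.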
Evaluation details
Result | Evaluator | Details | Meta Data
98.71% | Matches word count | n/a | neededClean: false, words: 506
0.37% | Dialogue to Total Word Ratio | Ratio: 80.59%, Deviation: 10.59% | neededClean: false, wordsTotal: 510, wordsDialogue: 411
49.5401% | Overall
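The ratio and deviation reported for the dialogue evaluator can be reproduced from the word counts in its metadata. The sketch below assumes the 70% target comes from the scenario text and that the deviation is the absolute difference from that target. It also checks that an unweighted mean of the two displayed evaluator scores lands close to the overall figure; that averaging rule is an assumption, since the page does not state how the overall score is computed, and the mapping from a 10.59% deviation to a 0.37% score is not shown here at all.

```python
# Word counts copied from the evaluator metadata on this page.
words_total = 510
words_dialogue = 411
target_ratio = 0.70  # assumption: taken from the "70% dialogue" scenario

# Share of words that are dialogue, and its distance from the target.
ratio = words_dialogue / words_total
deviation = abs(ratio - target_ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Assumption: the overall score is the unweighted mean of evaluator scores.
# The inputs are the rounded percentages displayed on the page, so the
# result can differ from the reported 49.5401% in the later decimals.
evaluator_scores = [98.71, 0.37]
overall = sum(evaluator_scores) / len(evaluator_scores)
print(f"Overall (mean of displayed scores): {overall:.2f}%")
```

Note that the dialogue evaluator counted 510 total words while the word-count evaluator counted 506, so the two evaluators appear to tokenize the response slightly differently.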