NC Bench
Run: 2026-06-01-01KT1HX30H97X8HJ7MK210VYEY
Run Details
Model: Claude Opus 4.8 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 19.4s
Cost: $0.027065
Provider: Anthropic
Evaluation summary
Score      Evaluator
75.16%     Matches word count
0.14%      Dialogue to Total Word Ratio
37.6476%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint across the table, smoothing it with both palms. "Okay, so the perimeter is guarded by approximately fifteen seven-year-olds and one clown." "A clown?" Bill leaned forward. "You didn't mention a clown." "It's a children's party, Bill. There's always a clown." "I don't do clowns. You know I don't do clowns." "It's a hired entertainer named Sprinkles. He's not going to murder you." Bill crossed his arms. "That's exactly what a murder-clown wants you to think." "Can we focus?" Molly tapped the blueprint. "The presents are stacked here, by the gift table. We go in during the magic show, grab everything, and we're gone before they cut the cake." "And how do we get past the parents?" "Disguises." She reached into a duffel bag and produced two crumpled shirts. "We're the entertainment. I'm a face painter, you're my assistant." "Why am I the assistant?" "Because last time you were in charge, we robbed the wrong house." "That was one time, and the family was very understanding." "They called the police, Bill." "After they offered us lemonade." He picked up a shirt and grimaced. "This says 'Twinkle the Magnificent.' I don't feel magnificent." "You'll feel magnificent when we're sitting on forty action figures and a bouncy castle's worth of birthday loot." Bill pulled the shirt over his head. "What about the kid? The birthday boy? Doesn't he get to keep one present?" Molly paused. "Are you developing a conscience?" "No. I'm developing a strategy. If we leave one present, the kid stays happy, the kid stays quiet, and nobody screams during the getaway." "Huh." She considered this. "That's actually not stupid." "I have moments." "Rare ones." Molly rolled up the blueprint. "Fine. We leave the kid one present. The smallest one. Socks, probably." "Nobody gives a kid socks at a birthday party." "My aunt gave me socks every year." "And look how you turned out." Molly glared at him. "Grab the bags. We move at fourteen hundred hours." 
"Why do you always say it like that? Just say two o'clock." "It sounds more professional." "We're stealing from a child, Molly. There's no professional version of that." She slung a bag over her shoulder. "Remember, if anyone asks, we're contractors hired by the party planner. Stay calm, stay confident, and whatever you do, do not engage with the clown." "I told you, I'm not going near the clown." "Good." "Unless he engages with me first, in which case I'm running." "Bill." "I'm just being honest about my limitations." Molly opened the door. "One more thing. When we get inside, you handle the gift table. I'll create a distraction." "What kind of distraction?" She smiled. "I'm going to challenge Sprinkles to a balloon-animal contest." Bill stared at her. "You hate clowns more than I do." "Which is why," she said, stepping into the hallway, "I'll be very, very motivated to win." The door clicked shut behind them.
created_at: "2026-06-01T12:17:57.785Z" (6/1/2026, 12:17:57 PM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1345
moderation_latency: (null)
generation_time: 19413
tokens_prompt: 77
tokens_completion: 793
native_tokens_prompt: 93
native_tokens_completion: 1064
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.027065
router: (null)
provider_responses:
  0:
    endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
    id: "msg_01GkPCyUuKpzmGPnX2i3wWJb"
    is_byok: false
    latency: 1339
    model_permaslug: "anthropic/claude-4.8-opus-20260528"
    provider_name: "Anthropic"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780316277-PQl1wNhIiO2VXTGwae2Q"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1780316277-IPKcBW0WgBe6rKIzpUTq"
upstream_id: "msg_01GkPCyUuKpzmGPnX2i3wWJb"
total_cost: 0.027065
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
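The cost and timing fields above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the per-million-token rates are an assumption (they are not stated anywhere on this page) chosen because they happen to reproduce the reported total_cost of 0.027065 from the native token counts, and the helper names are hypothetical. It also treats generation_time as milliseconds, which is consistent with the 19.4s duration reported in the run details.

```python
from decimal import Decimal

# Values copied from the metadata above.
NATIVE_TOKENS_PROMPT = 93
NATIVE_TOKENS_COMPLETION = 1064
GENERATION_TIME_MS = 19413  # assumed to be milliseconds (matches "19.4s")

# ASSUMPTION: these rates are not given on the page. They are inferred
# because they reproduce the reported total_cost for this run.
ASSUMED_INPUT_USD_PER_MTOK = Decimal("5")
ASSUMED_OUTPUT_USD_PER_MTOK = Decimal("25")


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the request cost in USD from native token counts."""
    total_micro_usd = (
        prompt_tokens * ASSUMED_INPUT_USD_PER_MTOK
        + completion_tokens * ASSUMED_OUTPUT_USD_PER_MTOK
    )
    return total_micro_usd / Decimal(1_000_000)


def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Average output throughput over the whole generation."""
    return completion_tokens / (generation_time_ms / 1000)


if __name__ == "__main__":
    print(estimate_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION))
    print(round(tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS), 1))
```

Decimal is used rather than float so the six-decimal cost compares exactly against the reported figure. If the actual billing rates differ, only the two assumed constants need to change.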
Evaluation details
Result     Evaluator                      Details                            Meta Data
75.16%     Matches word count             n/a                                neededClean: false, words: 487
0.14%      Dialogue to Total Word Ratio   Ratio: 81.47%, Deviation: 11.47%   neededClean: false, wordsTotal: 491, wordsDialogue: 400
37.6476%   Overall
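The ratio and deviation reported for the dialogue evaluator follow from the word counts in its metadata. The sketch below reproduces them under two assumptions: the 70% target comes from the scenario text, and the deviation is taken as the absolute difference in percentage points. It also checks that the overall figure is consistent with a plain mean of the two evaluator scores, which is an inference from the displayed numbers and not something the page states. The function names are hypothetical, and how the deviation is converted into the 0.14% score is not shown on the page, so it is not modeled here.

```python
# Values copied from the evaluation details above.
WORDS_DIALOGUE = 400
WORDS_TOTAL = 491
TARGET_RATIO = 0.70  # from the scenario: "Write 500 words with 70% dialogue"


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that fall inside dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation_points(ratio: float, target: float = TARGET_RATIO) -> float:
    """Distance from the target in percentage points.

    ASSUMPTION: the evaluator uses the absolute difference.
    """
    return abs(ratio - target) * 100


def mean_score(scores: list[float]) -> float:
    """ASSUMPTION: the overall score is an unweighted mean of the evaluators."""
    return sum(scores) / len(scores)


if __name__ == "__main__":
    ratio = dialogue_ratio(WORDS_DIALOGUE, WORDS_TOTAL)
    print(f"Ratio: {ratio * 100:.2f}%")
    print(f"Deviation: {deviation_points(ratio):.2f}%")
    print(f"Mean of displayed scores: {mean_score([75.16, 0.14]):.2f}%")
```

The mean of the displayed scores comes out at 37.65%, slightly above the reported 37.6476%, which is what one would expect if the page averages unrounded evaluator scores.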