NC Bench
Run: 2026-05-31-01KSYX810AXE3HBT3K15E5Y159
Run Details
Model: Claude Opus 4.8 (Reasoning, High)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 20.2s
Cost: $0.027740
Provider: Anthropic
Evaluation summary
Score | Evaluator
99.74% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.8722% | Overall
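The overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so the sketch below is an assumption, and `overall_score` is a hypothetical helper name rather than anything from NC Bench itself.

```python
def overall_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores, in percent.

    Assumption: NC Bench averages evaluator scores with equal weight.
    The page shows only the resulting numbers, not the rule.
    """
    if not scores:
        raise ValueError("at least one evaluator score is required")
    return sum(scores) / len(scores)


# Scores as displayed in the summary (already rounded to two decimals).
displayed = [99.74, 0.00]
print(f"{overall_score(displayed):.2f}%")
```

With the rounded inputs this gives 49.87%. The page shows 49.8722%, which would follow if the first score is stored with more precision than the two decimals displayed.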
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprints across the kitchen table, smoothing the wrinkled paper with the reverence of a general planning an invasion. The crayon markings suggested the map had been drawn by someone under the age of ten, which was technically true, since they'd bought it off her nephew for two dollars. "This is the target," she said, tapping a lopsided rectangle. "The Hendersons' backyard. Saturday afternoon. Little Timmy's seventh birthday." Bill leaned over, squinting at the diagram. He smelled faintly of nacho cheese and regret. "And we're stealing... presents? From a child?" "Not just any presents." Molly's eyes gleamed. "Premium presents. The Hendersons are loaded. We're talking the deluxe LEGO sets, the limited-edition action figures, maybe even one of those tablets disguised as a toy." Bill scratched his chin. "I dunno, Molly. Robbing a kid feels like a moral low point, even for us." "Lower than the time you stole a wheelchair?" "That was a misunderstanding! The guy got up and walked away. I assumed it was abandoned." Molly waved him off and produced two clown costumes from a duffel bag, holding them up like trophies. The fabric was a violent shade of polka-dotted orange that hurt the eyes. "Our disguises," she announced. "Every kids' party has a clown. Nobody questions a clown. We blend in, we work the crowd, and when the time is right, we sweep the gift table clean." Bill held his costume at arm's length as though it might bite him. "I have a profound fear of clowns. You know this. I told you in confidence." "Which is exactly why no one will suspect you. Your terror will read as enthusiasm." He sighed, defeated, and stepped into the jumpsuit. The legs were six inches too short, leaving his hairy ankles exposed like a confession. Molly snorted but said nothing, focusing instead on the rubber nose she was attempting to mash onto her face. "Here's the plan," she continued, voice muffled. "I distract the parents with balloon animals. 
You position yourself near the presents. When I shout the code word, you load everything into the bag and we vanish out the side gate." "What's the code word?" "'Honk.'" Bill blinked. "Clowns honk constantly. That's their entire personality." There was a long pause as Molly considered this. The rubber nose slowly peeled off her face and dropped onto the blueprints with a sad little plop. "Fine," she conceded. "The code word is 'rutabaga.'" "Better." She gathered up the maps, suddenly businesslike. "We arrive at two. The cake comes out at two-thirty. That's our window—everyone's singing, nobody's watching the loot. We're in, we're out, we're rich in plastic dinosaurs." Bill twisted a balloon experimentally and it exploded, sending them both flinching toward the ceiling. He stared at the deflated rubber corpse in his hands. "This is going to be a disaster," he said quietly. Molly clapped a polka-dotted hand on his shoulder, her painted smile cracking into something almost genuine. "Bill," she said warmly, "it always is. That's what makes it art."
created_at
"2026-05-31T11:38:27.459Z"
5/31/2026, 11:38:27 AM
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1071
moderation_latency
(null)
generation_time
19862
tokens_prompt
77
tokens_completion
800
native_tokens_prompt
93
native_tokens_completion
1091
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.02774
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_018Tcbkanj3Q1KH7zaAFM4Lp"
is_byok
false
latency
1071
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780227507-Mw3byvY3174HXLGBehmU"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1780227507-0jSWWIgXJcvUIQkXHk6i"
upstream_id
"msg_018Tcbkanj3Q1KH7zaAFM4Lp"
total_cost
0.02774
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
Evaluation details
Result | Evaluator | Details | Meta Data
99.74% | Matches word count | n/a | neededClean: false, words: 504
0.00% | Dialogue to Total Word Ratio | Ratio: 49.21%, Deviation: 19.21% | neededClean: false, wordsTotal: 508, wordsDialogue: 250
49.8722% | Overall
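The ratio and deviation reported for the dialogue evaluator can be reproduced from the two word counts in its metadata. The sketch below is illustrative only: the function names are hypothetical, the 30% target comes from the scenario, and `count_dialogue_words` is one assumed way to count quoted words, not the evaluator's actual tokenizer.

```python
import re


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the total, in percent."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float = 30.0) -> float:
    """Absolute distance, in percentage points, from the target ratio."""
    return abs(ratio_pct - target_pct)


def count_dialogue_words(text: str) -> int:
    """Count whitespace-separated words inside straight double quotes.

    Assumption: dialogue is delimited by straight double quotes. The real
    evaluator may tokenize differently and report a different count.
    """
    spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in spans)


# Counts taken from the evaluation metadata above.
ratio = dialogue_ratio(250, 508)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation_from_target(ratio):.2f}%")
# prints "Ratio: 49.21%, Deviation: 19.21%"
```

How the 19.21-point deviation maps to a 0.00% score is not shown on the page, so the scoring step is left out of the sketch.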