NC Bench
Run: 2026-05-31-01KSYX9RVDEHM0NEWBBXR388V5
Run Details
Model: Claude Opus 4.8 (Reasoning, High)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 21.6s
Cost: $0.028515
Provider: Anthropic
Evaluation summary
Score    Evaluator                     Details Link
3.62%    Matches word count            (details)
15.22%   Dialogue to Total Word Ratio  (details)
9.4203%  Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The blueprints spread across the kitchen table looked suspiciously like a placemat from a pizza restaurant, mostly because that's exactly what they were. Bill had drawn a series of arrows in red crayon, the only writing implement he could find that morning. "So this here," he said, tapping a greasy circle near the breadsticks advertisement, "is the bouncy castle. That's our entry point." Molly squinted at him over her coffee. "We're breaking in through a bouncy castle?" "Nobody suspects a man who arrives by bounce." She set down her mug with the exhausted patience of someone who had heard worse plans, and recently. The Henderson birthday party started at two o'clock, which gave them exactly four hours to prepare for what Bill insisted on calling "the greatest gift robbery in suburban history." The target was a seven-year-old named Tyler, whose grandmother was reportedly very, very wealthy and very, very generous. "The presents will be stacked here," Molly said, pointing to a more realistic spot near the gift table. "We grab them during the magic show. Everyone's distracted, lights are low, the magician pulls focus." "What if the magician's actually good?" "He's a man named Mr. Wiggles who does birthday parties for forty dollars. He is not good." Bill nodded, satisfied, and began stuffing balloons into a duffel bag. This was part of his disguise. He had decided, against all advice, to attend as a clown. Molly would pose as a face-painting artist, which at least required no special equipment beyond a steady hand and the willingness to draw butterflies on screaming toddlers. "Remember the rules," she said. "No eating the cake. No matter how good it looks." "I make no promises." "Bill." "Last time was different. That was a wedding. This is professional." They loaded the van, a vehicle that read PARTY MAGIC in letters that were clearly painted over a previous logo for a plumbing company, and drove across town in tense silence. Molly reviewed the plan in her head. 
Bill reviewed the snacks he'd packed in his clown pockets. The Henderson house was enormous, festooned with streamers and a banner that read HAPPY BIRTHDAY TYLER in glitter that would haunt the lawn for years. Children swarmed like tiny, sugar-fueled locusts. Parents stood clustered near a punch bowl, exchanging the haunted looks of people who had not slept since 2019. "There," Molly whispered. "The gift table. Twelve o'clock." Bill, sweating beneath a rubber nose, gave a thumbs up that knocked over a pyramid of juice boxes. The magic show began promptly at three. Mr. Wiggles, true to expectations, fumbled a dove that immediately flew into a ceiling fan. In the chaos, Molly and Bill made their move, scooping presents into an enormous garbage bag with the efficiency of seasoned professionals. They were nearly to the door when a small voice piped up. "Are you stealing my presents?" Tyler stood there, arms crossed, frosting on his face. Bill froze. Molly froze. The boy stared. "No," Bill finally said. "We're, uh, taking them to be cleaned." Tyler considered this. Then he shrugged and walked away, fully convinced. They never ran so fast in their lives.
created_at: "2026-05-31T11:39:24.41Z" (5/31/2026, 11:39:24 AM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 869
moderation_latency: (null)
generation_time: 21538
tokens_prompt: 77
tokens_completion: 832
native_tokens_prompt: 93
native_tokens_completion: 1122
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.028515
router: (null)
provider_responses:
  0:
    endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
    id: "msg_01HmMzWfywSMfjZhVZgboAG4"
    is_byok: false
    latency: 869
    model_permaslug: "anthropic/claude-4.8-opus-20260528"
    provider_name: "Anthropic"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780227564-e4qjQ67l4E1ttZonuicA"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1780227564-wvkVKXZfVs31sJPi8ctt"
upstream_id: "msg_01HmMzWfywSMfjZhVZgboAG4"
total_cost: 0.028515
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
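The cost and timing fields above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the per-million-token rates are an assumption (this page does not state them), chosen because they happen to reproduce the reported total_cost from the native token counts. It also assumes generation_time is in milliseconds. The helper name estimate_cost is hypothetical.

```python
from decimal import Decimal

# Values copied from the run metadata on this page.
NATIVE_TOKENS_PROMPT = 93
NATIVE_TOKENS_COMPLETION = 1122
GENERATION_TIME_MS = 21538  # assumed to be milliseconds
REPORTED_TOTAL_COST = Decimal("0.028515")

# ASSUMPTION: USD per one million tokens. These rates are not stated
# anywhere on the page; they are guesses that fit the reported cost.
ASSUMED_INPUT_RATE = Decimal("5")
ASSUMED_OUTPUT_RATE = Decimal("25")


def estimate_cost(prompt_tokens, completion_tokens, input_rate, output_rate):
    """Return the estimated cost in USD for the given token counts and rates."""
    million = Decimal(1_000_000)
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / million


estimated_cost = estimate_cost(
    NATIVE_TOKENS_PROMPT,
    NATIVE_TOKENS_COMPLETION,
    ASSUMED_INPUT_RATE,
    ASSUMED_OUTPUT_RATE,
)

# Native completion tokens generated per second of generation time.
tokens_per_second = NATIVE_TOKENS_COMPLETION / (GENERATION_TIME_MS / 1000)

print(f"estimated cost: ${estimated_cost}")
print(f"matches reported total_cost: {estimated_cost == REPORTED_TOTAL_COST}")
print(f"throughput: {tokens_per_second:.1f} native tokens/s")
```

Decimal is used rather than float so that the comparison against the reported six-decimal cost is exact.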
Evaluation details
Result   Evaluator                     Details                          Meta Data
3.62%    Matches word count            n/a                              neededClean: false; words: 524
15.22%   Dialogue to Total Word Ratio  Ratio: 23.86%, Deviation: 6.14%  neededClean: false; wordsTotal: 528; wordsDialogue: 126
9.4203%  Overall
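The ratio and deviation reported above follow directly from the word counts: 126 / 528 is about 23.86%, and the gap to the 30% target is about 6.14 percentage points. The sketch below reproduces that arithmetic and shows one naive way to count dialogue words, by treating any text between straight double quotes as dialogue. The benchmark's actual counter and its deviation-to-score mapping are not shown on this page, so count_dialogue_words and the mean-based overall check are assumptions for illustration.

```python
import re

TARGET_RATIO = 0.30  # from the scenario: 30% dialogue


def count_dialogue_words(text):
    """Return (dialogue_words, total_words) for text using straight double quotes.

    Naive on purpose: every quoted span counts as dialogue, including
    quoted phrases inside narration.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    total_words = len(text.split())
    return dialogue_words, total_words


def ratio_and_deviation(dialogue_words, total_words, target=TARGET_RATIO):
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = dialogue_words / total_words
    return ratio, abs(target - ratio)


# Counts copied from the evaluation details above.
ratio, deviation = ratio_and_deviation(126, 528)
print(f"Ratio: {ratio * 100:.2f}%, Deviation: {deviation * 100:.2f}%")

# ASSUMPTION: the overall score looks like the plain mean of the two
# evaluator scores. The rounded inputs give 9.42, close to the reported 9.4203.
overall_from_rounded = (3.62 + 15.22) / 2
print(f"Overall from rounded scores: {overall_from_rounded:.2f}%")

# Small usage example for the naive counter.
sample = '"We grab them," Molly said. "Now."'
print(count_dialogue_words(sample))
```

Note that the two evaluators report different totals for the same response (524 and 528 words), so a counter like this one will not necessarily match either figure exactly.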