NC Bench
Run: 2026-06-01-01KT1GWMNPZMXAPM0EAS414TVY
Run Details
Model: Claude Opus 4.8 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 19.3s
Cost: $0.028390
Provider: Anthropic
Evaluation summary
Score | Evaluator
100.00% | Matches word count
6.35% | Dialogue to Total Word Ratio
53.1738% | Overall
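The Overall figure is consistent with an unweighted mean of the two evaluator scores. That is an inference from the numbers shown here, not documented behaviour of the benchmark. A minimal Python check, using the rounded scores as displayed (the variable names are illustrative):

```python
from statistics import mean

# Evaluator scores as displayed in the summary (already rounded to 2 decimals).
scores = {
    "Matches word count": 100.00,
    "Dialogue to Total Word Ratio": 6.35,
}

# ASSUMPTION: Overall is the plain average of the evaluator scores.
overall = mean(list(scores.values()))

reported_overall = 53.1738
# The displayed 6.35% is rounded, so allow for the resulting rounding slack.
print(f"mean of displayed scores: {overall:.4f}")
print(f"difference from reported: {abs(overall - reported_overall):.4f}")
```

The small difference is what one would expect if the page averages the unrounded ratio score rather than the displayed 6.35%.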
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint across the kitchen table, smoothing out the wrinkles with the reverence of a surgeon prepping for a delicate operation. Bill leaned over her shoulder, squinting at the crayon-drawn map.
"Why's the bounce house circled in red?" he asked.
"That's our extraction point," Molly said, tapping the soggy paper. "Nobody suspects the bounce house."
The blueprint was, technically, a child's drawing they'd liberated from the recycling bin behind the Hendersons' house. Stick figures dotted the lawn. A lopsided sun smiled down on what appeared to be a clown of nightmarish proportions.
Bill straightened up, cracking his knuckles. "I don't like clowns, Molly."
"Nobody likes clowns. That's their whole thing."
He paced the length of the kitchen, his oversized cargo pants—chosen specifically for present-smuggling capacity—swishing with each step. Molly had laid out their equipment on the counter: two ski masks, a sack labeled SWAG in permanent marker, and an alarming quantity of party hats.
"The hats are camouflage," she explained before he could ask. "We blend in with the parents."
"We're forty-three years old and I have a mustache you could lose a badger in."
"Exactly. Disgruntled uncle. Nobody questions a disgruntled uncle at a children's party."
Bill picked up a party hat and stretched the elastic strap experimentally. It snapped against his chin with a wet thwack. He didn't flinch, which Molly admired about him.
"Walk me through the timeline again," he said.
She pointed to the drawing. "Twelve hundred hours, the cake comes out. Maximum distraction. Every kid in a fifteen-foot radius goes feral. That's when we hit the present pile."
"And the parents?"
"Filming the cake on their phones. Documenting. Posting. They wouldn't notice a freight train if it had a birthday candle on it."
Bill nodded slowly, the gears in his head visibly turning, which was less a metaphor and more an unfortunate consequence of how loudly he thought. He set down the hat.
"What about the goodie bags?"
Molly looked up sharply. "We're not animals, Bill. The goodie bags stay. Those kids worked hard for those goodie bags."
"They didn't work for anything, they just showed up—"
"The goodie bags stay."
A silence settled between them, thick with moral consideration. Somewhere outside, an ice cream truck played a warped, melancholy tune that neither of them acknowledged.
Bill finally sighed and grabbed the SWAG sack. "Fine. Just the presents. But if there's a piñata, it's mine."
"You can have the piñata."
"And the candy inside it."
"Don't push it."
Molly rolled up the blueprint and tucked it into her jacket, then handed Bill a party hat with the solemnity of a knighting. He placed it on his head at an angle he believed conveyed authority but actually conveyed only that he was a man wearing a tiny cone. She studied him, this absurd partner of hers, this badger-mustachioed accomplice in petty crime, and felt something dangerously close to affection.
"Ready?" she asked.
Bill adjusted his hat. "Let's go ruin a child's birthday."
created_at: "2026-06-01T12:00:14.522Z" (6/1/2026, 12:00:14 PM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1086
moderation_latency: (null)
generation_time: 19232
tokens_prompt: 77
tokens_completion: 818
native_tokens_prompt: 93
native_tokens_completion: 1117
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.02839
router: (null)
provider_responses
0
endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id: "msg_01Mc3js4yVK5f8qHGSjuEDwK"
is_byok: false
latency: 1086
model_permaslug: "anthropic/claude-4.8-opus-20260528"
provider_name: "Anthropic"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780315214-nn571mnJ7fOdCrcQom7d"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1780315214-OfrYl8covngStnfOo6jB"
upstream_id: "msg_01Mc3js4yVK5f8qHGSjuEDwK"
total_cost: 0.02839
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
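The metadata above is enough to derive output throughput and to sanity-check the reported cost. The sketch below does both in Python. The per-token rates are an assumption back-solved from the token counts and total_cost on this page; they are not stated anywhere in the run and should not be read as published pricing.

```python
# Values copied from the run metadata.
generation_time_ms = 19232
native_tokens_prompt = 93
native_tokens_completion = 1117
total_cost = 0.02839

# Output throughput over the generation window (native tokens per second).
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# ASSUMPTION: hypothetical rates in USD per million native tokens, chosen
# because they reproduce total_cost; not taken from the source page.
ASSUMED_INPUT_USD_PER_MTOK = 5
ASSUMED_OUTPUT_USD_PER_MTOK = 25

estimated_cost = (
    native_tokens_prompt * ASSUMED_INPUT_USD_PER_MTOK
    + native_tokens_completion * ASSUMED_OUTPUT_USD_PER_MTOK
) / 1_000_000

print(f"throughput: {tokens_per_second:.1f} tokens/s")
print(f"estimated cost: ${estimated_cost:.5f} (reported: ${total_cost:.5f})")
```

Under those assumed rates the estimate matches the reported total_cost, and almost all of the cost comes from completion tokens.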
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false, words: 500
6.35% | Dialogue to Total Word Ratio | Ratio: 37.43%, Deviation: 7.43% | neededClean: false, wordsTotal: 505, wordsDialogue: 189
53.1738% | Overall
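The reported ratio and deviation follow directly from the word counts in the metadata (wordsDialogue 189, wordsTotal 505) and the 30% target in the scenario. A minimal Python sketch of that arithmetic follows. The helper `count_words_in_quotes` is hypothetical: it assumes dialogue means text inside straight double quotes, which may differ from how the actual evaluator tokenizes, so it is not claimed to reproduce the 189 count. How the 7.43% deviation maps to the 6.35% score is not shown on the page and is not modelled here.

```python
import re

TARGET_RATIO = 0.30  # from the scenario "Write 500 words with 30% dialogue"


def dialogue_stats(words_dialogue: int, words_total: int,
                   target: float = TARGET_RATIO) -> tuple[float, float]:
    """Return (ratio, absolute deviation from target) as fractions."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


def count_words_in_quotes(text: str) -> tuple[int, int]:
    """Hypothetical counter: (dialogue words, total words).

    ASSUMPTION: dialogue is whatever sits between straight double quotes,
    and words are whitespace-separated tokens.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    total = len(text.split())
    return dialogue, total


# Reproduce the reported details line from the reported counts.
ratio, deviation = dialogue_stats(189, 505)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Small usage example of the hypothetical counter.
sample = '"Ready?" she asked. Bill adjusted his hat. "Let\'s go."'
sample_counts = count_words_in_quotes(sample)
print(sample_counts)
```

The deviation here is in percentage points (37.43% minus the 30% target), matching the "Deviation: 7.43%" shown in the details.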