NC Bench
Run: 2026-05-31-01KSYXKC3YFK6Z5K4TMHVABTJ0
Run Details
Model: Claude Opus 4.8 (Reasoning, High)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 20.7s
Cost: $0.027665
Provider: Anthropic
Evaluation summary
Score | Evaluator | Details Link
99.38% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
49.6894% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crayon-drawn map across the table, smoothing out a wrinkle that had been folded directly through the bouncy castle. "Right," she said, tapping the paper. "The objective is the present table. Northeast corner of the backyard. We're talking premium loot, Bill. Unwrapped potential." Bill squinted at the map, which had been intercepted from a recycling bin outside the birthday boy's house. A child's hand had labeled everything in glitter glue, including a smiling sun in the corner that contributed nothing to their reconnaissance. "And the security?" he asked. "Parents. Roughly fifteen of them. Plus one clown." He stiffened. "You didn't mention a clown." Molly waved this off and pulled a packet of party hats from her bag. She'd purchased them at a discount store, along with two whistles and a deeply suspicious quantity of streamers. Disguises, she explained, were everything. Nobody questioned an adult at a children's party as long as that adult looked appropriately exhausted and emotionally defeated. "We blend in," she said. "We mingle. We compliment someone's potato salad. Then, during the piñata—" "That's our window," Bill finished, nodding slowly. "That's our window. Every adult in that yard will be watching a blindfolded six-year-old swing a stick. Nobody watches the presents. Nobody ever watches the presents." Bill considered this. He had been in the business twelve years, had cracked safes that hummed and bypassed alarm systems that whispered his name in their dreams. And now here he was, planning a tactical extraction around a cardboard donkey full of candy. "What about the loot itself," he said. "We can't exactly walk out with twenty wrapped boxes." "Garbage bags. Large, contractor-grade. We tell anyone who asks that we're collecting trash for the host. People love a volunteer. People trust a man with a garbage bag and good posture." He had to admit it was elegant. "And extraction?" 
"Minivan," Molly said, with the reverence of a woman naming a beloved pet. "Sliding doors. Optimal cargo space. We pull up to the curb, load the bags, and we're gone before anyone's finished singing the birthday song." Bill rubbed his chin. There was, he felt, one outstanding issue. "The presents," he said carefully, "are for children." Molly stared at him. "Bill." "I'm just saying. Morally. As a concept." "Bill, they're seven. They don't even understand object permanence." "I'm fairly certain they do by seven." She rolled up the map and pointed it at him like a baton. "These children will receive love, cake, and the lifelong memory of a magical day. We will receive forty unopened toys with excellent resale value. Everyone wins. It's redistribution." Bill was quiet for a moment, weighing his entire ethical framework against a tube of construction paper. "Fine," he said. "But if that clown so much as looks at me, I'm leaving you behind." Molly grinned and tossed him a party hat. He caught it, examined the cheerful elastic strap, and sighed the long sigh of a man who had peaked, professionally, somewhere around last Tuesday. "To the minivan," she said.
created_at: "2026-05-31T11:44:39.05Z" (5/31/2026, 11:44:39 AM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1139
moderation_latency: (null)
generation_time: 20703
tokens_prompt: 77
tokens_completion: 823
native_tokens_prompt: 93
native_tokens_completion: 1088
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.027665
router: (null)
provider_responses:
  0:
    endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
    id: "msg_01FDBXq9SndvbgyNkeK3iB6g"
    is_byok: false
    latency: 1139
    model_permaslug: "anthropic/claude-4.8-opus-20260528"
    provider_name: "Anthropic"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780227879-bUzB13MeNuHcVV35EQeL"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1780227879-GAircuIzMrzgbfjLvEr0"
upstream_id: "msg_01FDBXq9SndvbgyNkeK3iB6g"
total_cost: 0.027665
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
99.38% | Matches word count | n/a | neededClean: false, words: 505
0.00% | Dialogue to Total Word Ratio | Ratio: 44.79%, Deviation: 14.79% | neededClean: false, wordsTotal: 509, wordsDialogue: 228
49.6894% | Overall | |
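The ratio and deviation reported for the dialogue evaluator can be reproduced from the word counts in its metadata. The sketch below is not the benchmark's own code: the function names are hypothetical, the 30% target is taken from the scenario description, and treating the overall score as an unweighted mean of the two evaluator scores is an assumption that happens to agree with the displayed figures. How each evaluator turns its measurement into a score (99.38% and 0.00%) is not stated on this page and is not modelled here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of the response's words that fall inside dialogue."""
    return words_dialogue / words_total


def deviation_from_target(ratio: float, target: float) -> float:
    """Absolute distance between the measured ratio and the requested one."""
    return abs(ratio - target)


# Values copied from the evaluation details.
WORDS_DIALOGUE = 228
WORDS_TOTAL = 509
TARGET_RATIO = 0.30  # "30% dialogue" in the scenario

ratio = dialogue_ratio(WORDS_DIALOGUE, WORDS_TOTAL)
deviation = deviation_from_target(ratio, TARGET_RATIO)

# Assumption: the overall score is the plain average of the evaluator scores.
evaluator_scores = [99.38, 0.00]
overall_estimate = sum(evaluator_scores) / len(evaluator_scores)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
print(f"Overall (assumed mean): {overall_estimate:.2f}%")
```

The two evaluators report different totals (505 and 509 words), so the counts presumably come from separate tokenisation passes; the sketch uses the dialogue evaluator's own total.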