NC Bench
Run: 2026-02-08-01KGYYHE8ASH2XRKZBQSSEP7C0
Run Details
Model: Claude Opus 4.6
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 10.6s
Cost: $0.008150
Provider: Anthropic
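The reported cost and duration can be cross-checked against the native token counts recorded in the response metadata on this page (75 prompt tokens, 311 completion tokens, 10585 ms generation time). The sketch below is a rough sanity check, not the benchmark's own accounting. The per-million-token rates are assumptions that do not appear anywhere on this page; they are included only because they happen to reproduce the recorded total.

```python
# Values copied from the response metadata on this page.
NATIVE_TOKENS_PROMPT = 75
NATIVE_TOKENS_COMPLETION = 311
GENERATION_TIME_MS = 10585

# ASSUMPTION: USD rates per million tokens. These are not stated on the page.
ASSUMED_INPUT_USD_PER_MTOK = 5.0
ASSUMED_OUTPUT_USD_PER_MTOK = 25.0


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the request cost in USD from token counts and the assumed rates."""
    input_cost = prompt_tokens * ASSUMED_INPUT_USD_PER_MTOK / 1_000_000
    output_cost = completion_tokens * ASSUMED_OUTPUT_USD_PER_MTOK / 1_000_000
    return input_cost + output_cost


estimated_cost = estimate_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION)

# Completion tokens generated per second of generation time.
tokens_per_second = NATIVE_TOKENS_COMPLETION / (GENERATION_TIME_MS / 1000)

print(f"estimated cost: ${estimated_cost:.6f}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumed rates the estimate agrees with the recorded cost of $0.008150. If the actual rates differ, only the two constants need to change.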
Evaluation summary
Matches word count: 90.48%
Dialogue to Total Word Ratio: 72.56%
Overall: 81.5211%
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
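The prompt asks for 90% of the words to be dialogue, so an evaluator needs some way to separate quoted words from narration. The following is a minimal sketch of one possible approach, assuming dialogue is delimited by straight double quotes and words are whitespace-separated tokens. The function name `dialogue_ratio` is hypothetical, and the benchmark's actual tokenization is not documented on this page, so this sketch may not reproduce the recorded counts exactly.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a scene.

    ASSUMPTION: dialogue is any text between straight double quotes,
    and a word is any whitespace-separated token.
    """
    total_words = len(text.split())
    # Capture the contents of each quoted span, excluding the quote marks.
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


sample = '"Hi there," she said. "Go now."'
dialogue_words, total_words, ratio = dialogue_ratio(sample)
print(dialogue_words, total_words, f"{ratio:.2%}")
```

Curly quotes, nested quotations, or unbalanced quote marks would need extra handling that this sketch leaves out.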
Model response
AI
"So the party starts at two," Molly said, spreading a crayon-drawn map across the dashboard. "We go in at two-fifteen, right when the sugar hits." "Why two-fifteen?" Bill asked, adjusting his fake mustache in the rearview mirror. "Because twenty kids hopped up on cake and juice boxes is basically a smoke screen, Bill. Nobody's watching anything except their own little nightmare." "Fair point. What about the mom?" "Karen? She'll be in the kitchen dealing with the gluten-free kid's special cupcake situation. I made sure to RSVP little Timmy with seven allergies." "There is no little Timmy." "Karen doesn't know that. Now, you go in through the bouncy castle." "I'm sorry, through it?" "There's a seam on the left side. You slip in, crawl under, pop out by the gift table." "And I just grab everything?" "You load the presents into the giant teddy bear." "What giant teddy bear?" "The one I hollowed out last night. It's in the trunk." Bill turned around and stared at the enormous stuffed animal wedged into the backseat. "That's deeply unsettling, Molly." "It's genius is what it is. Now put on your balloon artist wig."
created_at
"2026-02-08T15:39:58.803409+00:00"
2/8/2026, 3:39:58 PM
model
"anthropic/claude-4.6-opus-20260205"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1976
moderation_latency
(null)
generation_time
10585
tokens_prompt
61
tokens_completion
260
native_tokens_prompt
75
native_tokens_completion
311
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00815
router
(null)
provider_responses
0
id
"msg_01T3g4rF9xX4T6qDfDNye4GY"
status
200
is_byok
false
latency
1976
endpoint_id
"c926c048-563e-44b0-8434-95965bee924c"
provider_name
"Anthropic"
model_permaslug
"anthropic/claude-4.6-opus-20260205"
api_type
"completions"
id
"gen-1770565187-TiicmJipdkNYwXdX3IRr"
upstream_id
"msg_01T3g4rF9xX4T6qDfDNye4GY"
total_cost
0.00815
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
Evaluation details
Matches word count: 90.48%
Details: n/a
Meta data: neededClean = false, words = 190

Dialogue to Total Word Ratio: 72.56%
Details: Ratio: 82.47%, Deviation: 7.53%
Meta data: neededClean = false, wordsTotal = 194, wordsDialogue = 160

Overall: 81.5211%
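The ratio and deviation figures above follow directly from the recorded word counts, and the overall score is consistent with a simple mean of the two evaluator scores. The sketch below reproduces that arithmetic. The averaging rule is an inference from the numbers, not something the page states, and the formulas that map 190 words to 90.48% and a 7.53% deviation to 72.56% are not documented here, so they are not reconstructed.

```python
# Values copied from the evaluation details on this page.
WORDS_DIALOGUE = 160
WORDS_TOTAL = 194
TARGET_DIALOGUE_PCT = 90.0  # from the scenario: 90% dialogue

WORD_COUNT_SCORE_PCT = 90.48
DIALOGUE_RATIO_SCORE_PCT = 72.56

# Share of words that are dialogue, as a percentage.
ratio_pct = 100 * WORDS_DIALOGUE / WORDS_TOTAL

# Absolute distance from the requested dialogue share.
deviation_pct = abs(TARGET_DIALOGUE_PCT - ratio_pct)

# ASSUMPTION: the overall score is the unweighted mean of the evaluator scores.
overall_pct = (WORD_COUNT_SCORE_PCT + DIALOGUE_RATIO_SCORE_PCT) / 2

print(f"ratio: {ratio_pct:.2f}%")          # ratio: 82.47%
print(f"deviation: {deviation_pct:.2f}%")  # deviation: 7.53%
print(f"overall: {overall_pct:.2f}%")      # overall: 81.52%
```

The mean of the rounded scores gives 81.52%, while the page reports 81.5211%, which suggests the benchmark averages unrounded scores.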