NC Bench
Run: 2026-06-01-01KT1HP384CDMTCGPYMSG337YR
Run Details
Model:
Claude Opus 4.8 (Reasoning, Low)
Test:
Dialogue tags
Scenario:
Write 200 words with 50% dialogue
Duration:
10.0s
Cost:
$0.013315
Provider:
Anthropic
Evaluation summary
Score
Evaluator
Details Link
99.74%
Matches word count
(details)
0.93%
Dialogue to Total Word Ratio
(details)
50.3375%
Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crayon-drawn map across the table, smoothing its crumpled edges. "The bouncy castle is our main obstacle," she said, tapping the wobbly blue rectangle. "Security's tight." Bill squinted. "Security? They're six-year-olds." "Six-year-olds with sugar-fueled reflexes, Bill. Don't underestimate them." He scratched his chin, considering this. "And the presents?" "Stacked by the gift table. We grab them during the piñata frenzy." She mimed swinging a bat. "Total chaos. Nobody watches the loot." "Brilliant," Bill whispered. "But what about the clown?" Molly's face darkened. "Pickles the Clown. He's the wildcard. Trained in balloon animals and, allegedly, hand-to-hand combat." "You're joking." "I never joke about clowns." Bill swallowed nervously. "So what's the plan if Pickles spots us?" "You distract him." Molly handed Bill a single red nose. "Blend in. Become one of them." "Become a clown?" He held the nose like it was radioactive. "Absolutely not." "It's that or face the parents, Bill. Have you ever seen a mother whose child's gift got stolen? They're feral." He sighed, defeated, and snapped the nose onto his face. "How do I look?" Molly studied him seriously. "Terrifying. The children will weep." "Perfect," Bill muttered. "Let's go ruin a birthday."
created_at
"2026-06-01T12:14:08.648Z"
6/1/2026, 12:14:08 PM
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1152
moderation_latency
(null)
generation_time
9940
tokens_prompt
77
tokens_completion
351
native_tokens_prompt
93
native_tokens_completion
514
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.013315
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_018dwLrcBrFEbJHTMMjLmbSe"
is_byok
false
latency
1152
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780316048-zDmTKdlPhF3pfRpmutz7"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1780316048-l6DMtxR9tu82ffs7pNT1"
upstream_id
"msg_018dwLrcBrFEbJHTMMjLmbSe"
total_cost
0.013315
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
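The metadata above reports both native_tokens_completion (514) and generation_time (9940), so an approximate output throughput can be derived from it. The sketch below assumes generation_time is in milliseconds, which is not stated on the page but is consistent with the 10.0s duration shown under Run Details. The function name tokens_per_second is hypothetical and is not part of any NC Bench or provider API.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Approximate output throughput.

    Assumption: generation_time is reported in milliseconds.
    """
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from the metadata above.
rate = tokens_per_second(completion_tokens=514, generation_time_ms=9940)
print(f"{rate:.1f} tokens/s")  # prints "51.7 tokens/s"
```

This uses the native token count rather than tokens_completion (351), on the assumption that the native figure is the one the provider actually generated and billed.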
Evaluation details
Result
Evaluator
Details
Meta Data
99.74%
Matches word count
n/a
neededClean
false
words
196
0.93%
Dialogue to Total Word Ratio
Ratio: 64.71%, Deviation: 14.71%
neededClean
false
wordsTotal
204
wordsDialogue
132
50.3375%
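The "Ratio" and "Deviation" figures in the Dialogue to Total Word Ratio details can be reproduced from wordsDialogue and wordsTotal. The following is a minimal sketch, assuming the ratio is dialogue words divided by total words and the deviation is the absolute distance from the 50% target in the scenario. The function name dialogue_ratio is hypothetical, and the sketch does not attempt to reproduce how the evaluator counts words or how it maps the deviation to the 0.93% score, since neither is shown on this page.

```python
def dialogue_ratio(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return (ratio, deviation) as fractions.

    Assumptions: ratio = dialogue / total, and
    deviation = |ratio - target| with a 50% target.
    """
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    ratio = words_dialogue / words_total
    deviation = abs(ratio - target)
    return ratio, deviation


# Values copied from the evaluation details above.
ratio, deviation = dialogue_ratio(words_dialogue=132, words_total=204)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 64.71%, Deviation: 14.71%"
```

Note that the two evaluators report different totals for the same response (196 words for the word-count check, 204 for the ratio check), so the ratio depends on which tokenization the evaluator applies.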