NC Bench
Run: 2026-05-29-01KSSK7W28C8NSRRSWSRG7AQ41
Run Details
Model: Claude Opus 4.8 (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 10.4s
Cost: $0.012865
Provider: Anthropic
Evaluation summary
Score | Evaluator
100.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.9995% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprints across the table, knocking over a juice box. "The bounce castle is our entry point. We go in through the inflatable drawbridge." "That's just the front door," Bill said, squinting. "It's a bounce castle, Molly. Kids bounce in it." "Exactly. Nobody suspects a grown man bouncing." Bill rubbed his temples. "And the presents? Where are they?" "Gift table. Northeast corner, guarded by a clown named Sprinkles." She tapped the diagram. "Sprinkles is the wildcard." "You've been casing a clown?" "For three days. He takes a smoke break at two o'clock, right after the piñata." Bill leaned back, arms crossed. "What about the parents?" "Distracted. There's cake." Molly's eyes gleamed. "Never underestimate the power of cake, Bill." "This is insane. We're stealing from children." "We're stealing from rich children," she corrected. "There's a difference. Their presents are basically luxury goods." Bill sighed, then pointed at a red circle on the map. "What's this?" "Goodie bags. We grab those on the way out. Bonus loot." "You're unbelievable." "I'm thorough." She rolled up the blueprints triumphantly. "Now, suit up. And Bill?" "What?" "Wear something festive. We need to blend in." Bill groaned. "I am not wearing the dinosaur costume again."
created_at: "2026-05-29T10:07:24.241Z" (5/29/2026, 10:07:24 AM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1472
moderation_latency: (null)
generation_time: 10015
tokens_prompt: 77
tokens_completion: 349
native_tokens_prompt: 93
native_tokens_completion: 496
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.012865
router: (null)
provider_responses:
0:
endpoint_id: "dfc0e5bd-d703-4fe2-a7bb-655eb95d5441"
id: "msg_01BzMp2737A9jvMb98R33bqg"
is_byok: false
latency: 1464
model_permaslug: "anthropic/claude-4.8-opus-20260528"
provider_name: "Anthropic"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780049244-hxxyreEY3zXAfWcQP0S1"
session_id: (null)
api_type: "completions"
id: "gen-1780049244-dgQb1Dheg86NLXNBqhg7"
upstream_id: "msg_01BzMp2737A9jvMb98R33bqg"
total_cost: 0.012865
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false, words: 201
0.00% | Dialogue to Total Word Ratio | Ratio: 74.13%, Deviation: 24.13% | neededClean: false, wordsTotal: 201, wordsDialogue: 149
49.9995% | Overall
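
The ratio and deviation reported for the dialogue evaluator can be reproduced from the recorded word counts (wordsDialogue 149, wordsTotal 201) against the 50% target in the prompt. The sketch below is only an illustration: the function names are hypothetical, and the benchmark's actual tokenization, quote handling, and scoring thresholds are not shown on this page. In particular, counting words between straight double quotes and treating the deviation as an absolute difference are assumptions.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed word definition)."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue definition)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts taken from the evaluator metadata above.
ratio, deviation = ratio_and_deviation(149, 201)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 74.13%, Deviation: 24.13%"
```

With these counts the response is roughly three-quarters dialogue, which matches the reported details and is consistent with the 0.00% score for that evaluator. How the evaluator converts a deviation into a score is not stated here.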