NC Bench
Run: 2026-05-31-01KSYXQTSEMXXRVR1QERT5289W
Run Details
Model: Claude Opus 4.8 (Reasoning, High)
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 11.0s
Cost: $0.015190
Provider: Anthropic
Evaluation summary
Score | Evaluator | Details Link
99.98% | Matches word count | (details)
98.95% | Dialogue to Total Word Ratio | (details)
99.4679% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the hedge, binoculars trained on the suburban fortress disguised as a bouncy castle. Streamers fluttered like enemy banners. Somewhere inside, a clown was juggling, which she found deeply unsettling. Bill unrolled the blueprints across the grass, pinning the corners with juice boxes he'd liberated earlier as reconnaissance. The layout was brutal: a gauntlet of folding tables, a piñata strung like a tactical landmine, and at least fourteen sugar-crazed seven-year-olds patrolling the perimeter. "The presents," Molly whispered, pointing. "Stacked by the gift table. Unguarded." Bill shook his head gravely. He produced a diagram of a small girl named Tabitha, the birthday queen herself, who reportedly possessed the bite radius of a wolverine and the screaming capacity of a smoke alarm. They had timed it perfectly. Cake distribution created a window of chaos roughly ninety seconds long, during which adult supervision collapsed entirely and every child entered what Bill called "the frosting trance." He checked his watch. He'd disguised himself as a magician; Molly wore a face-painter's apron, balloon animals dangling from her belt like grenades. "On my signal," he murmured. The candles ignited. Twelve voices began the dreaded birthday hymn. Molly tensed, gripping her tote bag, ready to plunder paradise.
created_at: "2026-05-31T11:47:05.148Z" (5/31/2026, 11:47:05 AM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 3407
moderation_latency: (null)
generation_time: 10923
tokens_prompt: 77
tokens_completion: 406
native_tokens_prompt: 93
native_tokens_completion: 589
native_tokens_completion_images: (null)
native_tokens_reasoning: 58
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.01519
router: (null)
provider_responses:
  0:
    endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
    id: "msg_01A4MxKaNQ5BmWzSkQuzBv5h"
    is_byok: false
    latency: 1061
    model_permaslug: "anthropic/claude-4.8-opus-20260528"
    provider_name: "Anthropic"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780228025-sB5yYM8BTT25StREOftm"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1780228025-Hwb1u4dLuARCEqxjefZx"
upstream_id: "msg_01A4MxKaNQ5BmWzSkQuzBv5h"
total_cost: 0.01519
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
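The reported cost can be sanity-checked against the native token counts in the metadata above. The sketch below is only an illustrative check, not the benchmark's or the provider's billing logic. The per-million-token rates (5 USD for input, 25 USD for output) are assumptions that are not stated anywhere on this page; they were chosen only because they reproduce the reported total_cost of 0.01519 from native_tokens_prompt (93) and native_tokens_completion (589). The function name is hypothetical.

```python
def estimate_cost_usd(
    prompt_tokens: int,
    completion_tokens: int,
    usd_per_million_input: float,
    usd_per_million_output: float,
) -> float:
    """Estimate request cost from token counts and per-million-token rates."""
    total_micro_usd = (
        prompt_tokens * usd_per_million_input
        + completion_tokens * usd_per_million_output
    )
    return total_micro_usd / 1_000_000


# ASSUMPTION: the 5.0 / 25.0 USD-per-million rates are not given in the report.
# Token counts are the native_tokens_* values from the metadata.
estimated = estimate_cost_usd(93, 589, 5.0, 25.0)
print(f"estimated cost: ${estimated:.6f}")
```

If the assumed rates are right, the estimate agrees with both the usage and total_cost fields. Any other rate pair would need to be checked the same way.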
Evaluation details
Result | Evaluator | Details | Meta Data
99.98% | Matches word count | n/a | neededClean: false, words: 202
98.95% | Dialogue to Total Word Ratio | Ratio: 6.80%, Deviation: 3.20% | neededClean: false, wordsTotal: 206, wordsDialogue: 14
99.4679% | Overall | |
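The ratio and deviation reported for the dialogue evaluator follow from its metadata: 14 dialogue words out of 206 total is about 6.80%, which is 3.20 percentage points away from the 10% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. It is not the evaluator's actual implementation: the function names are hypothetical, and treating text inside straight double quotes as dialogue and splitting on whitespace are both assumptions. How the evaluator maps the deviation to the 98.95% score is not shown on this page and is not modelled here.

```python
import re


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words) for a passage.

    ASSUMPTION: dialogue is any text between straight double quotes,
    and words are whitespace-separated tokens.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    total_words = len(text.split())
    return dialogue_words, total_words


def ratio_and_deviation(
    dialogue_words: int, total_words: int, target_ratio: float = 0.10
) -> tuple[float, float]:
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = dialogue_words / total_words
    return ratio, abs(target_ratio - ratio)


# Values taken from the evaluator metadata: wordsDialogue and wordsTotal.
ratio, deviation = ratio_and_deviation(14, 206)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

Under these assumptions, counting the quoted spans in the model response by hand also gives 14 dialogue words, because the quoted phrase "the frosting trance." is counted alongside the spoken lines. The two evaluators report different totals (202 and 206), so they presumably tokenize words differently; this sketch does not attempt to reproduce either total.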