NC Bench
Run: 2026-02-18-01KHQ93RKCJG1PST97VP9N01GX
Run Details
Model:
Claude Sonnet 4.6
Test:
Dialogue tags
Scenario:
Write 200 words with 90% dialogue
Duration:
8.6s
Cost:
$0.004815
Provider:
Anthropic
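The reported cost can be cross-checked against the native token counts recorded in the response metadata (75 prompt tokens, 306 completion tokens). The sketch below is a sanity check, not the site's billing logic: the per-token prices ($3 and $15 per million input and output tokens) are assumptions that this page does not state, and `generation_time` is assumed to be in milliseconds.

```python
from decimal import Decimal

# Values copied from the run metadata on this page.
native_tokens_prompt = 75
native_tokens_completion = 306
generation_time_ms = 8514          # assumption: the field is in milliseconds
reported_cost = Decimal("0.004815")

# Assumed prices in USD per million tokens; not stated anywhere on this page.
INPUT_PRICE_PER_MTOK = Decimal("3")
OUTPUT_PRICE_PER_MTOK = Decimal("15")


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the request cost in USD from token counts and assumed prices."""
    total = (prompt_tokens * INPUT_PRICE_PER_MTOK
             + completion_tokens * OUTPUT_PRICE_PER_MTOK)
    return total / Decimal(1_000_000)


estimated = estimate_cost(native_tokens_prompt, native_tokens_completion)

# Output throughput: completion tokens divided by generation time in seconds.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"estimated cost: ${estimated}")
print(f"matches reported cost: {estimated == reported_cost}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumed prices the estimate equals the reported $0.004815, which suggests the cost is billed on the native token counts rather than on the normalized `tokens_prompt` and `tokens_completion` values.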
Evaluation summary
Matches word count: 68.10%
Dialogue to Total Word Ratio: 99.99%
Overall: 84.0484%
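The overall score looks like the unweighted mean of the two evaluator scores. That is an inference from the numbers shown, not something the page states. A minimal Python check, assuming equal weights:

```python
# Evaluator scores as displayed in the summary (rounded to two decimals).
scores = {
    "Matches word count": 68.10,
    "Dialogue to Total Word Ratio": 99.99,
}
reported_overall = 84.0484

# Assumption: the overall score is the plain average of the evaluator scores.
mean_score = sum(scores.values()) / len(scores)

# Each displayed score may be off by up to 0.005 because of rounding,
# so the mean of the displayed values may differ by the same amount.
consistent = abs(mean_score - reported_overall) <= 0.005

print(f"mean of displayed scores: {mean_score:.4f}%")
print(f"reported overall:         {reported_overall:.4f}%")
print(f"consistent within rounding: {consistent}")
```

The small gap between the two figures would be explained by the page averaging unrounded scores and rounding only for display.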
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Right, so the birthday boy turns six at exactly two o'clock," Molly whispered, unfolding a crayon drawing she'd stolen from the party planner's office. "Is that our blueprint?" Bill squinted at it. "Did a child draw this?" "The party planner's kid drew it, yes. But look, here's the bouncy castle, here's the gift table." "Why is everything a smiley face?" "Focus, Bill." "I am focused. I'm focused on the fact that our heist map has rainbows on it." "The presents get stacked near the garden fence at half two. That's our window." "And we're dressed as clowns because?" "Because nobody questions a clown at a children's party, Bill." "I can think of several reasons children would question a clown." "We go in, we grab the presents, we leave through the side gate." "What if someone recognizes us?" "We're wearing enormous red noses." "My own mother would recognize me, Molly." "Your mother isn't six years old." "Thank God for that." "Are you in or not?" "I'm in. I just want it noted that this is the strangest job we've ever done." "Noted. Now put your wig on."
created_at
"2026-02-18T02:26:29.632061+00:00"
2/18/2026, 2:26:29 AM
model
"anthropic/claude-4.6-sonnet-20260217"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
555
moderation_latency
(null)
generation_time
8514
tokens_prompt
77
tokens_completion
298
native_tokens_prompt
75
native_tokens_completion
306
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.004815
router
(null)
provider_responses
0
id
"msg_0121NVf8xt1ncaTZwZHhpR6G"
status
200
is_byok
false
latency
555
endpoint_id
"6d408764-7dd7-4626-bb87-a6cc1589bc86"
provider_name
"Anthropic"
model_permaslug
"anthropic/claude-4.6-sonnet-20260217"
api_type
"completions"
id
"gen-1771381580-6qMhY2DFdFbrTDaxmGZr"
upstream_id
"msg_0121NVf8xt1ncaTZwZHhpR6G"
total_cost
0.004815
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
Evaluation details
Evaluator: Matches word count
Result: 68.10%
Details: n/a
Meta Data: neededClean: false, words: 186

Evaluator: Dialogue to Total Word Ratio
Result: 99.99%
Details: Ratio: 90.86%, Deviation: 0.86%
Meta Data: neededClean: false, wordsTotal: 186, wordsDialogue: 169

Overall: 84.0484%
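The ratio and deviation reported for the dialogue evaluator follow directly from the word counts in its metadata. The sketch below reproduces them and also shows one plausible way to count dialogue words. The `count_words` helper is hypothetical: it treats text between straight double quotes as dialogue and splits on whitespace, which may differ from the evaluator's actual tokenization.

```python
import re

# Word counts and target taken from this page.
words_total = 186
words_dialogue = 169
target_ratio_pct = 90.0  # the scenario asks for 90% dialogue

ratio_pct = words_dialogue / words_total * 100
deviation_pct = abs(ratio_pct - target_ratio_pct)

print(f"Ratio: {ratio_pct:.2f}%, Deviation: {deviation_pct:.2f}%")


def count_words(text: str) -> tuple[int, int]:
    """Return (total_words, dialogue_words) for a passage.

    Hypothetical helper: dialogue is any text between straight double
    quotes, and a word is any whitespace-separated token.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return total, dialogue


# Small usage example on a made-up sentence.
sample = '"Focus, Bill," Molly said.'
sample_total, sample_dialogue = count_words(sample)
print(sample_total, sample_dialogue)
```

The first print reproduces the "Ratio: 90.86%, Deviation: 0.86%" line from the evaluator details. How the 0.86% deviation maps to the 99.99% score, and how 186 words against a 200-word target maps to 68.10%, is not stated on the page, so those formulas are left out.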