NC Bench
Run: 2026-02-24-01KJ8X34VMXTFBM5K3J06ZVGR0
Run Details
Model: Claude Sonnet 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 19.8s
Cost: $0.012606
Provider: Anthropic
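The reported cost can be sanity-checked against the native token counts listed under Meta (native_tokens_prompt 92, native_tokens_completion 822). The sketch below assumes per-million-token prices of $3 for input and $15 for output; those prices do not appear on this page and are an assumption, as is the helper name estimate_cost. Under that assumption the arithmetic reproduces the listed cost.

```python
from decimal import Decimal

# Assumed prices in USD per one million tokens (not stated on this page).
ASSUMED_INPUT_PRICE_PER_MTOK = Decimal("3")
ASSUMED_OUTPUT_PRICE_PER_MTOK = Decimal("15")
ONE_MILLION = Decimal(1_000_000)


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Hypothetical helper: price a request from its native token counts."""
    input_cost = Decimal(prompt_tokens) * ASSUMED_INPUT_PRICE_PER_MTOK / ONE_MILLION
    output_cost = Decimal(completion_tokens) * ASSUMED_OUTPUT_PRICE_PER_MTOK / ONE_MILLION
    return input_cost + output_cost


# Token counts copied from the Meta section of this run.
cost = estimate_cost(prompt_tokens=92, completion_tokens=822)
print(cost)  # 0.012606
```

Decimal is used instead of float so the comparison with the six-decimal figure on the page is exact.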
Evaluation summary
Score | Evaluator | Details Link
51.93% | Matches word count | (details)
7.97% | Dialogue to Total Word Ratio | (details)
29.9456% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crayon-drawn map across the hood of the rusty van and smoothed out the corners with her palms. It had been drawn by her nephew two years ago and was originally a picture of a dragon, but with some creative interpretation it served its purpose. "Okay, here's the situation," she said, tapping a squiggly orange line. "The party starts at two. That gives us forty minutes before the cake comes out and everyone gets distracted." Bill squinted at the map. "Is that a tail or a garden hose?" "It's the side gate. Focus." "Right, sorry." He pulled his black beanie down over his ears even though it was seventy degrees outside. "What's our entry point?" "We go over the back fence. There's a bouncy castle in the yard which gives us cover. Nobody looks past a bouncy castle." "That's actually brilliant." Bill nodded with the solemn respect of a man who had never once been brilliant himself. "What about the presents? Where are they usually staged?" Molly tapped the dragon's belly. "Inside table near the front door. Classic placement. Parents pile them there when they arrive so nobody trips over them during pass the parcel." "Pass the parcel," Bill muttered darkly. "Terrible game. You're just unwrapping someone else's disappointment." "Stay on task, Bill." He straightened up and crossed his arms. "What's the haul looking like?" "Twelve kids confirmed on the invite list." She held up a slightly damp piece of paper she had retrieved from a recycling bin outside the family home. "The birthday kid is turning seven. That's prime gift age. We're talking Lego, remote control cars, art sets, the works." "Not dolls though. Last time we got dolls I had nightmares for a week." "We take everything and sort it later." Bill chewed the inside of his cheek. "What about the clown?" Molly paused. "What clown?" "There's always a clown." "There is not always a clown." "At every single children's party I have ever attended in my entire life, Molly, there has been a clown. 
It is a universal constant, like gravity or bad weather on bank holidays." She stared at him for a long moment. "If there is a clown, we improvise." "I don't improvise well around clowns. You know this." "Then you stay focused, keep your eyes on the presents, and do not make eye contact with the entertainment." She rolled up the dragon map and tucked it under her arm like a general preparing for battle. "We are in and out in under four minutes." Bill looked at the suburban street ahead of them, at the cluster of balloons visible above a garden fence two houses down, at the faint cheerful sound of children shrieking with joy that drifted toward them on the breeze. "You know," he said quietly, "sometimes I wonder if we're the villains in this story." Molly was already walking. "Move your feet, Bill."
created_at
"2026-02-24T22:42:37.052Z"
2/24/2026, 10:42:37 PM
model
"anthropic/claude-4.6-sonnet-20260217"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
888
moderation_latency
(null)
generation_time
19695
tokens_prompt
77
tokens_completion
847
native_tokens_prompt
92
native_tokens_completion
822
native_tokens_completion_images
(null)
native_tokens_reasoning
106
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.012606
router
(null)
provider_responses
0
endpoint_id
"6d408764-7dd7-4626-bb87-a6cc1589bc86"
id
"msg_01ShJqyM79rmZ9DbKqGmxtLd"
is_byok
false
latency
885
model_permaslug
"anthropic/claude-4.6-sonnet-20260217"
provider_name
"Anthropic"
status
200
api_type
"completions"
id
"gen-1771972957-ReEOnlxqaPeb45Nqeg7G"
upstream_id
"msg_01ShJqyM79rmZ9DbKqGmxtLd"
total_cost
0.012606
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
Evaluation details
Result | Evaluator | Details | Meta Data
51.93% | Matches word count | n/a | neededClean: false, words: 484
7.97% | Dialogue to Total Word Ratio | Ratio: 57.11%, Deviation: 7.11% | neededClean: false, wordsTotal: 485, wordsDialogue: 277
29.9456% | Overall
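The ratio and deviation in the details above follow directly from the reported word counts. The sketch below recomputes them from wordsDialogue and wordsTotal against the 50% target in the scenario. It does not reproduce how the evaluator turns the deviation into the 7.97% score, since that mapping is not shown on this page. The final check, that the overall figure is the unweighted mean of the two evaluator scores, is an assumption that happens to be consistent with the rounded numbers. The function name is hypothetical.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Hypothetical helper: share of dialogue words, as a percentage."""
    return 100.0 * words_dialogue / words_total


TARGET_PERCENT = 50.0  # from the scenario: "500 words with 50% dialogue"

# Counts copied from the Meta Data of the ratio evaluator.
ratio = dialogue_ratio(words_dialogue=277, words_total=485)
deviation = abs(ratio - TARGET_PERCENT)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# Ratio: 57.11%, Deviation: 7.11%

# Assumption: the overall score is the plain mean of the two evaluator scores.
# The displayed scores are rounded to two decimals, so allow 0.005 of slack.
mean_of_scores = (51.93 + 7.97) / 2
assert abs(mean_of_scores - 29.9456) <= 0.005
```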