NC Bench
Model: Claude Opus 4.6
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Run ID: 2026-02-08-01KGYZ6GDXET1WDPZ3Q91VA4X0
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly spread the hand-drawn map across the sticky table at Denny's, knocking over a small tower of creamer cups that Bill had been constructing for the past fifteen minutes. He looked genuinely heartbroken about it but said nothing, choosing instead to focus on the crayon-scrawled blueprint before them.

The map depicted a suburban house in alarming detail. There were tiny stick figures in the backyard, a rectangle labeled BOUNCE HOUSE in red crayon, and a massive X over the garage with the words PRESENT TABLE written in Molly's aggressive capital letters.

"The party starts at two," Molly said, tapping the garage with a french fry. "That gives us a window between two-fifteen and two-thirty, when the magician arrives. Every kid, every parent, every sticky little face will be glued to that man pulling rabbits out of hats."

Bill leaned forward and squinted. "Where did you get this intelligence?"

"Facebook. The mom posted everything. The theme, the guest list, the timeline. She even posted which gifts people were bringing. We're looking at a KidKraft kitchen set, two LEGO Star Destroyers, and at least four Squishmallows."

Bill whistled low and long, the sound of a man who understood the resale value of unopened LEGO sets.

The plan was elegant in its simplicity, which worried Bill because Molly's plans were never simple and rarely elegant. She outlined the approach while he ate his Grand Slam breakfast. They would arrive disguised as balloon artists, a cover Molly had been rehearsing for three days in her apartment. She could now make a passable sword and something she called a dog but which looked more like a tumor with legs.

"We walk in through the side gate," she continued, "set up near the bounce house, and wait. When the magician starts his act, we load the presents into the balloon bags and walk right out the front door."

"What if someone asks why the balloon people are leaving with giant lumpy bags?"

"We tell them we're bringing in the special balloons from the van."

"We don't have a van."

"We'll have a van by Saturday."

Bill did not ask how. He had learned that asking Molly follow-up questions about vehicle acquisition only led to anxiety and occasional insomnia. She folded the map with surgical precision and tucked it into her jacket pocket, right next to a package of Rolaids and what appeared to be a small dog made of balloons. The tumor dog. She carried it like a talisman.

"The biggest risk is the bounce house," she said, her voice dropping. "Kids go feral in those things. If one of them escapes and spots us near the presents, we're done. Children have no loyalty but they have incredible lungs."

Bill nodded solemnly. He had once been identified in a lineup by a six-year-old witness and had never fully recovered from the humiliation.

"So we're really doing this," he said.

Molly smiled the way she always did before something went terribly wrong. "Happy birthday to us."
created_at: "2026-02-08T15:51:38.483968+00:00" (Sunday, February 8th, 2026 at 3:51:38 PM GMT+00:00)
model: "anthropic/claude-4.6-opus-20260205"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1842
moderation_latency: 180
generation_time: 19392
tokens_prompt: 61
tokens_completion: 617
native_tokens_prompt: 75
native_tokens_completion: 704
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.017975
router: (null)
provider_responses:
  0:
    id: "31a879b3-a4a2-4010-81b1-a86a0f0e3297"
    status: 200
    is_byok: false
    latency: 1841
    endpoint_id: "a8708fb9-08ea-43bb-a3b1-165c87ef09df"
    provider_name: "Amazon Bedrock"
    model_permaslug: "anthropic/claude-4.6-opus-20260205"
api_type: "completions"
id: "gen-1770565878-CbOGs92BkDrnD52QORoM"
upstream_id: "31a879b3-a4a2-4010-81b1-a86a0f0e3297"
total_cost: 0.017975
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
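The cost and timing fields above can be cross-checked against the native token counts. The sketch below is only a sanity check under two assumptions that the source does not state: that generation_time is in milliseconds, and that the per-token rates are 5 USD and 25 USD per million prompt and completion tokens. Those rates are hypothetical, chosen because they reproduce the reported usage figure.

```python
# Values copied from the run metadata above.
native_tokens_prompt = 75
native_tokens_completion = 704
generation_time_ms = 19392   # assumption: the field is in milliseconds
reported_cost = 0.017975     # the "usage" / "total_cost" field

# Hypothetical rates in USD per token; not stated anywhere in the source.
PROMPT_RATE = 5 / 1_000_000
COMPLETION_RATE = 25 / 1_000_000

# Cost estimate from native token counts.
estimated_cost = (
    native_tokens_prompt * PROMPT_RATE
    + native_tokens_completion * COMPLETION_RATE
)

# Completion throughput over the generation window.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"estimated cost: {estimated_cost:.6f} USD (reported: {reported_cost})")
print(f"throughput: {tokens_per_second:.1f} native completion tokens/s")
```

Under these assumptions the estimate agrees with the reported cost, which suggests billing is based on the native token counts (75 and 704) rather than the normalized ones (61 and 617).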
Evaluation details

Result: 98.7124%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 506

Result: 1.5479%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 39.13%, Deviation: 9.13%
Meta Data: neededClean: false, wordsTotal: 506, wordsDialogue: 198
50.1302%
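The ratio and deviation figures can be reproduced from the reported word counts. The sketch below is not the benchmark's evaluator. The function name dialogue_ratio and its rule (whitespace-separated words, with dialogue taken to be anything between straight double quotes) are assumptions for illustration, and the source does not say how each evaluator maps its measurement to a percentage score. The final 50.1302% is consistent with the arithmetic mean of the two evaluator results, though the page does not label it.

```python
import re

def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumption: dialogue is any span enclosed in straight double quotes,
    and words are whitespace-separated tokens.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio

# Small illustrative input: 5 dialogue words out of 10 total.
sample = 'Bill leaned forward. "Where did you get this?" he asked.'
sample_dialogue, sample_total, sample_ratio = dialogue_ratio(sample)

# Figures reported by the evaluator for this run.
words_dialogue = 198
words_total = 506
target_ratio = 0.30  # from the scenario: 30% dialogue

reported_ratio = words_dialogue / words_total
deviation = abs(reported_ratio - target_ratio)

# Mean of the two evaluator results (an inference, not stated in the source).
mean_score = (98.7124 + 1.5479) / 2

print(f"ratio: {reported_ratio:.2%}, deviation: {deviation:.2%}")
print(f"mean of evaluator results: {mean_score:.4f}%")
```

The recomputed ratio and deviation match the evaluator's details line, so the response overshot the 30% target by roughly nine percentage points while landing within six words of the 500-word target.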