NC Bench
Model: Claude Opus 4.6
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Run ID: 2026-02-08-01KGYX7MPG0FATC1KRG20XYP07
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of the minivan, weighing down the corners with a juice box and three stale goldfish crackers she'd found wedged in the cup holder. The map, rendered in colored pencil on the back of a Denny's placemat, depicted the Hargrove residence in painstaking detail, right down to the inflatable bounce house that would be erected in the backyard at approximately noon. Bill leaned over her shoulder, breathing heavily through his mouth in a way that made Molly want to commit a second crime. "So the presents will be on a table near the sliding glass door," Molly said, tapping the placemat. "Mrs. Hargrove always sets up a gift station. I've been monitoring her Pinterest board for three weeks." "Three weeks of surveillance on a seven-year-old's birthday party," Bill said. "This is either the most professional job we've ever pulled or the saddest." "It's both. That's our brand." Bill scratched his chin, which had been poorly shaved in what appeared to be total darkness. He studied the map with the intensity of a man who could not actually read maps, turning it sideways, then upside down, then back again. The plan was elegant in its simplicity. They would arrive at 12:15, during the chaos window between the bounce house opening and the pizza delivery, when parental attention would be at its absolute lowest. Molly had calculated this window using data from four previous Hargrove gatherings she'd attended under the alias of a concerned neighbor named Brenda. "I go in through the side gate dressed as a clown," Bill said. "Party entertainer. You're the scheduled entertainment." "Right, and the real clown?" "I sent him to an address in Reseda. He'll figure it out eventually, but not before we're in and out." Bill pulled a red foam nose from his pocket and squeezed it experimentally. It let out a sound like a dying accordion. Molly closed her eyes and breathed through a moment of profound regret. 
The gifts, according to Molly's intelligence, would include at least two gaming consoles, several high-end LEGO sets, and a drone that the birthday boy had demanded with the ferocity of a tiny warlord. Conservative estimates put the haul at eighteen hundred dollars, retail. Not exactly retirement money, but enough to cover the rent Bill owed his landlord and the vet bills for Molly's cat, who had expensive taste in intestinal parasites. "What if a kid talks to me?" Bill asked. "Kids make me nervous." "You'll be in clown makeup. They'll be more scared of you than you are of them." "That's genuinely not possible." Molly folded the placemat and tucked it into her fanny pack, which she wore without irony, which somehow made it worse. The sun was climbing over the strip mall across the street, casting golden light over the parking lot where two adults in their late thirties were preparing to rob children. "We go at noon," Molly said. "Synchronize watches." Bill held up his bare wrist. "Buy a watch, then synchronize it."
created_at: "2026-02-08T15:17:21.113208+00:00" (Sunday, February 8th, 2026 at 3:17:21 PM GMT+00:00)
model: "anthropic/claude-4.6-opus-20260205"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1603
moderation_latency: (null)
generation_time: 22306
tokens_prompt: 61
tokens_completion: 629
native_tokens_prompt: 75
native_tokens_completion: 732
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.018675
router: (null)
provider_responses:
  0:
    id: "msg_01Ge3ZsRymTsGZuAUfTUDCKv"
    status: 200
    is_byok: false
    latency: 1603
    endpoint_id: "c926c048-563e-44b0-8434-95965bee924c"
    provider_name: "Anthropic"
    model_permaslug: "anthropic/claude-4.6-opus-20260205"
api_type: "completions"
id: "gen-1770563818-cNwOr3c7WAAhhx4aNFbX"
upstream_id: "msg_01Ge3ZsRymTsGZuAUfTUDCKv"
total_cost: 0.018675
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
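The cost and timing fields can be sanity-checked with a little arithmetic. The sketch below is only an illustration: the per-token rates (5 USD per million prompt tokens, 25 USD per million completion tokens) are an assumption that does not appear on this page, chosen because they happen to reproduce the reported total_cost from the native token counts. The reading of generation_time as milliseconds is likewise an assumption.

```python
# Assumed rates in USD per million tokens; NOT stated on this page.
INPUT_USD_PER_MILLION = 5.0
OUTPUT_USD_PER_MILLION = 25.0

# Values copied from the metadata above.
native_tokens_prompt = 75
native_tokens_completion = 732
generation_time_ms = 22306  # assumed to be milliseconds


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the request cost in USD under the assumed rates."""
    total = (prompt_tokens * INPUT_USD_PER_MILLION
             + completion_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


cost = estimate_cost(native_tokens_prompt, native_tokens_completion)
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"estimated cost: {cost:.6f} USD")                  # 0.018675
print(f"throughput: {tokens_per_second:.1f} tokens/s")    # 32.8
```

Under these assumptions the estimate agrees with the reported usage and total_cost of 0.018675, which suggests billing is based on the native token counts (75 and 732) rather than the normalized ones (61 and 629).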
Evaluation details
Result | Evaluator | Details | Meta Data
93.6496% | Matches word count | n/a | neededClean: false; words: 509
86.4094% | Dialogue to Total Word Ratio | Ratio: 28.29%, Deviation: 1.71% | neededClean: false; wordsTotal: 509; wordsDialogue: 144
90.0295% (equal to the mean of the two results above)
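The reported ratio, deviation, and final percentage can be reproduced from the counts above. The sketch below is a minimal illustration, not the benchmark's evaluator: the helper names count_words and dialogue_words are hypothetical, splitting on whitespace and treating text inside straight double quotes as dialogue are assumptions, and the formula behind the two per-evaluator percentages is not shown on this page, so it is not reconstructed here.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated words (assumed tokenization)."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue rule)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)


# Counts and scores copied from the evaluation details above.
words_total = 509
words_dialogue = 144
target_pct = 30.0  # from the scenario: 30% dialogue

ratio_pct = 100 * words_dialogue / words_total
deviation_pct = abs(target_pct - ratio_pct)
overall_pct = (93.6496 + 86.4094) / 2

print(f"ratio: {ratio_pct:.2f}%")          # 28.29%
print(f"deviation: {deviation_pct:.2f}%")  # 1.71%
print(f"overall: {overall_pct:.4f}%")      # 90.0295%
```

The deviation is expressed in percentage points from the 30% target, and the final figure matches a plain average of the two evaluator results.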