NC Bench
Model
GPT-4.1
Test
Dialogue tags
Scenario
Write 500 words with 50% dialogue
Run ID
2025-04-14-01JRTSTJFC1HNT6ZSE4WMYNG5E
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Bill peered through the window of the brightly decorated house. Streamers twined around the banister like neon snakes, and a massive piñata in the shape of a unicorn hung above a table crowded with wrapped boxes. "I can't believe we're doing this," Molly whispered, crouching beside him. "Stealing birthday presents? From kids, Bill?" "Not from kids,” Bill hissed. “From the world’s most pretentious six-year-old. You heard how she bragged at the bakery. ‘All my presents are gold-wrapped because I deserve only the best!’” Molly rolled her eyes. “You’re jealous of a kindergartner’s party favors?” “Look at them! Those bows have more rhinestones than my ex-wife’s wedding dress.” “Your ex-wife’s wedding dress had feathers sewn on the hem. Your taste is questionable.” “Hey, I picked quality feathers! Anyway—” “Focus!” Molly hissed. “The parents are everywhere. Are we using the ‘clown distraction’ plan or the ‘cake mix-up’?” Bill scrunched his face. “Clown plan. You can’t mess up a clown.” Molly snorted. “You, Bill, have never tried clown makeup on a deadline.” “Look, I can stuff the presents in my inflatable-clown pants—” “You’ll waddle. You already walk like a penguin with tennis elbow.” “Then it’s on you. Can you handle the juggling routine?” “Juggling? Juggling what, the presents?” “No! I brought fake bowling pins. Distraction!” Molly shook her head. “You, in a wig, flinging plastic batons. What could go wrong?” “They’ll laugh! I’m a natural.” “More like a natural disaster.” Bill straightened, puffing out his chest. “Listen, you distract the kids with balloon animals, I swipe the loot. Teamwork.” “I don’t know how to make balloon animals. The last time I tried, I gave a toddler a balloon ‘dog’ and she screamed it looked like her grandma’s varicose veins.” “Fine. New plan—silent infiltration.” “Oh, sure, Bill. We just tiptoe and steal a dozen garishly wrapped boxes in plain sight of thirty sugared-up kids. Genius.” “We wait for the piñata! 
Frenzied chaos, kids rush for candy, we slip the presents out the back.” “You think nobody’s watching the presents during the piñata smash?” “Parents will be gossiping. Trust me, I scoped the situation. Last year, Mrs. Marsh’s hot dish exploded and everyone missed the magician’s ferret escaping.” “You think a casserole can buy us six minutes of distraction?” “If we sabotage the snack table, it might.” Molly sighed, rubbing her forehead. “I left a safecracker kit in the van for this?” Bill grinned. “We’ll improvise. You cause a ruckus with the Sprite and Mentos, I handle loading.” “I’m not blowing Mentos Volcano in front of a birthday cake. The last thing I need is a frosting explosion on my record.” “Then you’d better perfect that balloon giraffe, partner.” “Why am I always on balloon duty?” “Because you’re the only one with nimble fingers.” “I broke three nails picking your handcuffs last week.” “We’ll get you new nails—with all the presents we snatch.” Molly groaned. “If we get caught by a pack of six-year-olds, I’m blaming the clown pants.” “Deal. Ready?” “Ready as I’ll ever be. Operation Unicorn Heist is a go!”
id
"gen-1744656223-HflzOzGOnIC7xHHgzv8X"
upstream_id
"chatcmpl-BMJ4RLd6mGn8UL4iPDg2w9jQr2Awt"
total_cost
0.006126
cache_discount
(null)
provider_name
"OpenAI"
created_at
"2025-04-14T18:43:52.600202+00:00"
Monday, April 14th, 2025 at 6:43:52 PM GMT+00:00
model
"openai/gpt-4.1-2025-04-14"
app_id
182717
streamed
true
cancelled
false
latency
390
moderation_latency
147
generation_time
8260
tokens_prompt
68
tokens_completion
750
native_tokens_prompt
67
native_tokens_completion
749
native_tokens_reasoning
0
num_media_prompt
(null)
num_media_completion
(null)
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.006126
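The cost and timing fields above can be cross-checked with a little arithmetic. The sketch below is illustrative only: the per-token prices (2.00 USD per million prompt tokens, 8.00 USD per million completion tokens) are an assumption about the provider's list pricing and are not stated in this report, and treating generation_time as milliseconds is also an assumption. The function names are hypothetical.

```python
# Assumed list prices in USD per one million tokens (not taken from this report).
PROMPT_USD_PER_MILLION = 2.00
COMPLETION_USD_PER_MILLION = 8.00


def estimate_cost(native_tokens_prompt: int, native_tokens_completion: int) -> float:
    """Estimate the run cost in USD from native token counts."""
    micro_usd = (
        native_tokens_prompt * PROMPT_USD_PER_MILLION
        + native_tokens_completion * COMPLETION_USD_PER_MILLION
    )
    return micro_usd / 1_000_000


def tokens_per_second(native_tokens_completion: int, generation_time_ms: int) -> float:
    """Completion throughput, assuming generation_time is reported in milliseconds."""
    return native_tokens_completion / (generation_time_ms / 1000)


if __name__ == "__main__":
    # Values copied from the metadata fields of this run.
    print(f"estimated cost: {estimate_cost(67, 749):.6f} USD")
    print(f"throughput: {tokens_per_second(749, 8260):.1f} tokens/s")
```

Under these assumptions the estimate reproduces the reported total_cost of 0.006126, which suggests billing is based on the native token counts rather than the normalized ones.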
Evaluation details
Result | Evaluator | Details | Meta Data
43.3783% | Matches word count | n/a | neededClean: false; words: 517
0.0000% | Dialogue to Total Word Ratio | Ratio: 85.69%, Deviation: 35.69% | neededClean: false; wordsTotal: 517; wordsDialogue: 443
21.6892%
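The ratio and deviation reported by the dialogue evaluator follow directly from the wordsDialogue and wordsTotal counts and the 50% target in the scenario. The sketch below shows that arithmetic, plus one possible way to count dialogue words. The counting rule (whitespace-separated words inside straight or curly double quotes) and all function names are assumptions for illustration; the benchmark's actual evaluator and its scoring formula are not shown in this report.

```python
import re

# Assumed dialogue delimiter: text between straight or curly double quotes.
# Mixed pairs such as "...” are accepted, since the response above contains them.
DIALOGUE_RE = re.compile(r'["“]([^"“”]*)["”]')


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total if words_total else 0.0
    return ratio, abs(ratio - target)


def dialogue_stats(text: str, target: float = 0.5) -> dict:
    """Count total and quoted words in a scene and derive ratio and deviation."""
    total = count_words(text)
    dialogue = sum(count_words(span) for span in DIALOGUE_RE.findall(text))
    ratio, deviation = ratio_and_deviation(dialogue, total, target)
    return {
        "wordsTotal": total,
        "wordsDialogue": dialogue,
        "ratio": ratio,
        "deviation": deviation,
    }


if __name__ == "__main__":
    # Counts copied from the evaluation details of this run.
    ratio, deviation = ratio_and_deviation(443, 517)
    print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

With 443 dialogue words out of 517, the ratio is about 85.69%, which is 35.69 percentage points above the requested 50%. A different quoting or tokenization rule would give slightly different counts for the same text.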