NC Bench
Model: GPT-4.1 Mini
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Run ID: 2025-04-14-01JRTSVQ3EBZM0KH63SQW2WPQ0
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the bushes, peeking at the brightly colored house across the street. Bill shuffled nervously beside her, holding a makeshift blueprint of the yard they'd drawn on the back of a pizza box.
"So, remind me why we’re robbing a children’s party," Bill whispered, eyes darting at the balloon archway like it was a security camera.
"Because, Bill," Molly said, grinning, "innocent children have the best loot. Presents. Wrapped. Untouched. Pristine." She jiggled the pizza box sketch. "Look at this plan. Simple."
Bill frowned. "Simple? It looks like a finger painting by a kid who hates shapes."
"Details, details. We sneak past the door, duck under the table, grab the gifts, and skedaddle." Molly’s voice dropped to a sinister whisper. "Easy."
"I don’t know," Bill muttered, glancing over his shoulder as a pack of sugar-crazed kids zoomed past them. "Kids are small but fast. And that clown—I swear he has security clearance or something."
Molly pulled a small flashlight from her pocket and flipped it on. "That clown’s a joke. Literally. He keeps tripping over his own oversized shoes."
"Yeah, until he bops you with a balloon animal," Bill warned, squinting at the house. "Also, why are we robbing a party? Don’t we usually go for banks? Or art galleries?"
"Too many cameras. Too many guards. Plus, I’ve got a thing for glittery wrapping paper." Molly tapped the pizza box. "And who wants boring jewel theft when you can have sugar rushes and tiny toy guns?"
Bill sighed. "Alright, so what’s the exact route? I don’t want to be caught holding a gift bag full of SpongeBob pajamas."
Molly traced a crooked path with her finger. "We enter through the side gate—that creaky one." She paused for dramatic effect. "Bill distracts the party by pretending to be hired entertainment."
Bill choked on his breath. "Me? Entertainment? I can’t even juggle oranges."
"That’s why you’re the ‘Great Billini’—master of illusion and occasional magician. You’ll dazzle them. Then I slip in through the kitchen, grab the loot, and back out."
Bill groaned. "Last time you said ‘dazzle’ I ended up getting sprayed with whipped cream."
Molly shrugged. "Hence the whipped cream proof vest." She peeled back her jacket to reveal what looked like a vest made from plastic bubble wrap.
"Stylish," Bill said dryly.
Suddenly, a tiny voice shouted, "Hey, look! It’s the ninja spies!"
Molly and Bill turned to see a group of kids pointing at their hiding spot.
"Abort mission!" Bill hissed.
Molly smiled. "Too late. They already know. Time to improvise. Ready?"
Bill nodded reluctantly. "Ready for what—running for our lives?"
"Exactly." Molly stood up and waved cheerily. "Hi, kids! Want to see some magic?"
Bill sighed and followed, muttering, "I swear, if I end up in a clown costume, I’m blaming you."
Molly laughed. "Deal. But first, help me grab those presents."
Bill glanced back at the house, the gleam of wrapped boxes visible through the window. "Alright, Great Billini, let’s make this the best heist ever... or at least the messiest."
"Messy is fun," Molly said, skipping toward the gate. "And remember, glitter is our friend."
Bill groaned again. "This is why I hate glitter."
"Love you too, Bill."
id: "gen-1744656260-M9n5bjZqT2ZOgcnEujNo"
upstream_id: "chatcmpl-BMJ52WWo36P3jIq2FULrs8ue95VQS"
total_cost: 0.0012668
cache_discount: (null)
provider_name: "OpenAI"
created_at: "2025-04-14T18:44:30.371207+00:00" (Monday, April 14th, 2025 at 6:44:30 PM GMT+00:00)
model: "openai/gpt-4.1-mini-2025-04-14"
app_id: 182717
streamed: true
cancelled: false
latency: 496
moderation_latency: 79
generation_time: 8221
tokens_prompt: 68
tokens_completion: 775
native_tokens_prompt: 67
native_tokens_completion: 775
native_tokens_reasoning: 0
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0012668
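The metadata fields above can be combined into a couple of derived figures. The sketch below is illustrative only: it assumes generation_time is in milliseconds and total_cost is in US dollars, neither of which is stated on this page. The per-token rates (0.40 and 1.60 USD per million input and output tokens) are also an assumption rather than something the page reports, although with the native token counts they reproduce the reported total_cost of 0.0012668.

```python
# Values copied from the run metadata above.
generation_time_ms = 8221        # assumed to be milliseconds
native_tokens_prompt = 67
native_tokens_completion = 775
reported_total_cost = 0.0012668  # assumed to be USD

# Hypothetical rates in USD per one million tokens (an assumption,
# not taken from this page).
INPUT_RATE = 0.40
OUTPUT_RATE = 1.60


def tokens_per_second(tokens, elapsed_ms):
    """Completion throughput, given an elapsed time in milliseconds."""
    return tokens / (elapsed_ms / 1000.0)


def estimated_cost(prompt_tokens, completion_tokens):
    """Cost estimate from token counts and the assumed rates."""
    return (prompt_tokens * INPUT_RATE
            + completion_tokens * OUTPUT_RATE) / 1_000_000


throughput = tokens_per_second(native_tokens_completion, generation_time_ms)
cost = estimated_cost(native_tokens_prompt, native_tokens_completion)

print(f"throughput: {throughput:.1f} tokens/s")
print(f"estimated cost: {cost:.7f} (reported: {reported_total_cost})")
```

If the units or rates differ from these assumptions, only the constants at the top need to change.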
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false, words: 538
0.8401% | Dialogue to Total Word Ratio | Ratio: 60.22%, Deviation: 9.78% | neededClean: false, wordsTotal: 538, wordsDialogue: 324
0.4200%
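The reported ratio and deviation follow from the word counts: 324 dialogue words out of 538 total is 60.22%, which is 9.78 percentage points away from the 70% target. The sketch below shows one way such counts could be obtained and how the two figures are derived from them. The counting rule (whitespace-separated words, with dialogue defined as anything between double quotes) and the function names are assumptions for illustration. The evaluator's actual rules, and how the deviation maps to the Result percentage, are not shown on this page.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: 70% dialogue


def count_words(text):
    """Return (total_words, dialogue_words) under a simple quoting rule.

    Assumption: words are whitespace-separated tokens, and dialogue is
    any text enclosed in straight or curly double quotes.
    """
    total = len(text.split())
    quoted = re.findall(r'["\u201c]([^"\u201d]*)["\u201d]', text)
    dialogue = sum(len(span.split()) for span in quoted)
    return total, dialogue


def ratio_and_deviation(words_dialogue, words_total, target=TARGET_RATIO):
    """Dialogue share and its absolute distance from the target."""
    ratio = words_dialogue / words_total if words_total else 0.0
    return ratio, abs(target - ratio)


# Figures reported in the evaluation table above.
ratio, deviation = ratio_and_deviation(324, 538)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# A small hand-checkable sample.
sample = 'Molly grinned. "Easy job," she said. "Grab the gifts and run."'
print(count_words(sample))
```

Running count_words on the full response may not reproduce 538 and 324 exactly, since the evaluator's tokenization could differ from this simple rule.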