NC Bench
Model
o4 Mini
Test
Dialogue tags
Scenario
Write 200 words with 90% dialogue
Run ID
2025-04-16-01JRZYYNV1NQ3243YQDNW2GMA8
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Under flickering streetlights, Molly and Bill crouch behind a trash bin outside a suburban house. Their target: children's birthday party. Molly: "Alright Bill, you ready to snatch every glittery wrapped box before the bouncy clown notices?" Bill: "Snatch? I was thinking nick, swipe, purloin, but snatch works." Molly: "Good, our heist vocabulary is as versatile as our sock drawer." Bill: "We need disguises. I'm in full construction hat mode." Molly: "Or we could pose as clumsy magicians. Confuse kids with glitter bombs." Bill: "Glitter bombs attract every parent within a two-mile radius. Too risky." Molly: "Fine. Babysitter cover. I'll rent octopus arms to handle multiple gifts." Bill: "Do octopus arms come with nappies and lullaby playlists?" Molly: "Probably optional accessories. We need an exit strategy, too." Bill: "Roof slide? Party tent tarp to skid down lawn?" Molly: "Nice. Then speed scooter getaway through sprinklers." Bill: "What about ninjas? Smoke bombs?" Molly: "We'd be outclassed by chocolate cake smears. Stick to plan." Bill: "Plan: Babysitter octopus, roof slide, speed scooter. Got it." Molly: "Perfect. Let's grab those presents like sugar-crazed pirates." Bill: "Sugar-crazed pirates with impeccable timing. Showtime." Molly: "Ready your seven-pocket utility vest. I'll pack candy wrappers for evidence suppression."
id
"gen-1744829372-heG8g2H83WOet20BvNgT"
upstream_id
"chatcmpl-BN27AVZ7FJvw7Vtc3cK2MZzudyWks"
total_cost
0.0143286
cache_discount
(null)
provider_name
"OpenAI"
created_at
"2025-04-16T18:50:04.135678+00:00"
Wednesday, April 16th, 2025 at 6:50:04 PM GMT+00:00
model
"openai/o4-mini-2025-04-16"
app_id
182717
streamed
true
cancelled
false
latency
28328
moderation_latency
97
generation_time
31123
tokens_prompt
68
tokens_completion
3226
native_tokens_prompt
66
native_tokens_completion
3240
native_tokens_reasoning
2880
num_media_prompt
(null)
num_media_completion
(null)
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0143286
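The token and cost fields above can be cross-checked with a little arithmetic. The sketch below is illustrative only. It assumes that `native_tokens_reasoning` is counted inside `native_tokens_completion`, and it uses per-million-token rates of 1.10 USD (input) and 4.40 USD (output) that are not stated anywhere on this page. The function names are hypothetical.

```python
# Values copied from the run metadata above.
NATIVE_TOKENS_PROMPT = 66
NATIVE_TOKENS_COMPLETION = 3240
NATIVE_TOKENS_REASONING = 2880
REPORTED_TOTAL_COST = 0.0143286

# ASSUMPTION: USD rates per million tokens; these are not given on this page.
ASSUMED_INPUT_RATE = 1.10
ASSUMED_OUTPUT_RATE = 4.40


def visible_completion_tokens(completion: int, reasoning: int) -> int:
    """Tokens left for the visible answer, assuming reasoning tokens
    are included in the completion count."""
    return completion - reasoning


def estimate_cost(prompt: int, completion: int,
                  input_rate: float, output_rate: float) -> float:
    """Estimated cost in USD from token counts and per-million rates."""
    return (prompt * input_rate + completion * output_rate) / 1_000_000


visible = visible_completion_tokens(NATIVE_TOKENS_COMPLETION,
                                    NATIVE_TOKENS_REASONING)
estimated = estimate_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION,
                          ASSUMED_INPUT_RATE, ASSUMED_OUTPUT_RATE)
print(f"visible completion tokens: {visible}")
print(f"estimated cost: {estimated:.7f} (reported: {REPORTED_TOTAL_COST})")
```

Under these assumptions, most of the completion budget went to reasoning rather than to the 200-word scene itself, and the estimate can be compared directly with the reported `total_cost`.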
Evaluation details
Result | Evaluator | Details | Meta Data
99.7443% | Matches word count | n/a | neededClean: false; words: 204
64.5041% | Dialogue to Total Word Ratio | Ratio: 81.86%, Deviation: 8.14% | neededClean: false; wordsTotal: 204; wordsDialogue: 167
82.1242%
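The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator follow from the two word counts and the 90% target in the scenario. The sketch below reproduces that arithmetic and shows one possible way to count dialogue words. It is not the benchmark's actual evaluator: the quote-matching rule, the whitespace tokenization, and the function names are all assumptions, and it does not attempt to derive the percentage scores in the Result column.

```python
import re


def ratio_from_counts(words_dialogue: int, words_total: int,
                      target: float = 0.90) -> tuple[float, float]:
    """Return (ratio, deviation from target) as fractions."""
    ratio = words_dialogue / words_total if words_total else 0.0
    return ratio, abs(target - ratio)


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue words, total words).

    ASSUMPTION: dialogue is whatever sits between straight double quotes,
    and words are whitespace-separated tokens. The real evaluator may
    count differently.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return dialogue, total


# Counts taken from the evaluation details above.
ratio, deviation = ratio_from_counts(167, 204)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# A tiny illustrative sample, not the model response.
sample = 'Molly: "Ready, Bill?" Bill: "Always ready."'
print(count_dialogue_words(sample))
```

Speaker labels such as `Molly:` fall outside the quotes, so under this counting rule they lower the ratio even when the scene is almost entirely dialogue.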