NC Bench
Model
o4 Mini High
Test
Dialogue tags
Scenario
Write 200 words with 90% dialogue
Run ID
2025-04-16-01JRZZB2QYSA6FGW4YE3Y3XR5R
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Night. Backyard. A birthday banner flutters. Molly and Bill crouch behind a hedge. Presents glint under strings of fairy lights. Molly: Are you ready to snatch every gift before the sugar crash? Bill: I was born ready. Got the toy sack and the stealth sweets floss. Molly: In twenty seconds the piñata distraction kicks in. Bill: I’ll jingle the party hats and set the runaway balloon free. Molly: While the kids chase it, we hit the table. Bill: Plushies first, then Lego bricks, stuffed unicorns and remote cars. Molly: Don’t forget the glitter bombs disguised as party poppers. Bill: Glitter is the new security system—once it sticks, no turning back. Molly: Slide across that inflatable castle, aim for the present pile. Bill: Understood. Avoid the toddlers—they’ll cry if you bump them. Molly: Compassion optional. Speed mandatory. Bill: All right. Operation Gift Grab starts now—three, two, one! Molly: Remember the escape route through the snack table? Fewer witnesses by the punch bowl. Bill: That route’s perfect. After we loot the gifts we vanish before the cake cutting. Molly: And if the parents appear, we trigger a glitter bomb avalanche and dash into the pond maze. Bill: Showtime, partner. Begin, now! Go!
id
"gen-1744829778-pjNEiBJYLPPqBgd40n0E"
upstream_id
"chatcmpl-BN2DjlT9nmNLqcwE8yg9MOGNBDbZ6"
total_cost
0.0209066
cache_discount
(null)
provider_name
"OpenAI"
created_at
"2025-04-16T18:57:04.209459+00:00"
Wednesday, April 16th, 2025 at 6:57:04 PM GMT+00:00
model
"openai/o4-mini-high-2025-04-16"
app_id
182717
streamed
true
cancelled
false
latency
42810
moderation_latency
88
generation_time
44792
tokens_prompt
68
tokens_completion
4716
native_tokens_prompt
66
native_tokens_completion
4735
native_tokens_reasoning
4416
num_media_prompt
(null)
num_media_completion
(null)
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0209066
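The token counts above imply that most of the completion was hidden reasoning rather than visible text. A minimal sketch of that arithmetic follows, using the values listed above. It assumes that native_tokens_completion includes the reasoning tokens, which this page does not state explicitly.

```python
# Values copied from the metadata fields listed above.
native_tokens_completion = 4735
native_tokens_reasoning = 4416

# Assumption (not stated on the page): the completion count
# includes the reasoning tokens, so the difference is the
# number of tokens in the visible response.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

print(visible_tokens)            # 319
print(f"{reasoning_share:.1%}")  # 93.3%
```

Under that assumption, roughly 319 tokens correspond to the 203-word scene and the rest were spent on reasoning.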
Evaluation details
Result | Evaluator | Details | Meta Data
99.9190% | Matches word count | n/a | neededClean: false; words: 203
0.0000% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 90.00% | neededClean: false; wordsTotal: 203; wordsDialogue: 0
49.9595% (average of the two results above)
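The dialogue evaluator reported wordsDialogue as 0 even though the response is almost entirely speech. One plausible explanation is that the evaluator counts only words inside quotation marks, while the response uses script-style "Molly:" and "Bill:" labels with no quotation marks. The sketch below illustrates that idea; it is not NC Bench's actual evaluator, and the names `dialogue_ratio`, `deviation`, and `QUOTED` are hypothetical.

```python
import re

# Hypothetical quote-based detector: matches text inside straight
# or curly double quotes. The real evaluator's rules are not shown
# on this page.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_ratio(text: str) -> float:
    """Fraction of words that sit inside double quotation marks."""
    total = count_words(text)
    if total == 0:
        return 0.0
    dialogue = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    return dialogue / total


def deviation(ratio: float, target: float = 0.90) -> float:
    """Absolute distance from the target ratio (90% in this test)."""
    return abs(ratio - target)


# Script-style lines contain no quotation marks, so nothing counts.
script_style = "Molly: Are you ready? Bill: I was born ready."
# The same exchange written with quoted speech and dialogue tags.
quoted_style = '"Are you ready?" Molly asked. "I was born ready," said Bill.'

print(dialogue_ratio(script_style))             # 0.0
print(round(dialogue_ratio(quoted_style), 3))   # 0.636
print(round(deviation(dialogue_ratio(script_style)), 2))  # 0.9
```

With a detector like this, a script-formatted scene scores a ratio of 0.00% and a deviation of 90.00%, which matches the figures reported above.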