NC Bench
Run: 2025-04-16-01JRZZF7ENBB71XC75F47V6BDK
Run Details
Model: o4 Mini High
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 58.7s
Cost: $0.031401
Provider: OpenAI
Evaluation summary
Matches word count: 100.00%
Dialogue to Total Word Ratio: 63.25%
Overall: 81.6268%
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Night. Backyard under a twinkling bouncy castle. Molly and Bill crouch behind a hedge. They whisper furiously in darkness. Molly: "Okay Bill, step one: locate gift table without tripping on streamers." Bill: "I already tripped over a balloon. Proceeding as planned." Molly: "Excellent tracking. Now don your party hat for camouflage." Bill: "I look ridiculous but I’ll do anything for the loot." Molly: "Remember the kids are armed with foam swords and fossilized cake." Bill: "Armed? We’re outnumbered by sugar-fueled six-year-olds? Perfect." Molly: "Use the sugar high to our advantage during the pinata phase." Bill: "Hypnotize them with cupcakes instead of actual hypnosis." Molly: "More like distract them while we demolish the piñata supply." Bill: "Hope they don’t notice their presents vanish into thin air." Molly: "We’ll replace each gift with empty boxes and a note." Bill: "They’re five. They’ll cry nonstop for days." Molly: "Tough crowd. Bring sock balloons as consolation prizes." Bill: "I hate sock balloons but I love winning heists." Molly: "Now remember: sugar crash signals extraction time." Bill: "Got it. Eat cupcakes, swipe gifts, vanish before cake spews." Molly: "Perfect. Activate vacuum cleaner gadget. Go go go!" Bill: "Roger that. Operation Gift Grab is a go."
Response metadata
id: "gen-1744829914-z0zWmDpjCv5Igji2BPgF"
upstream_id: "chatcmpl-BN2FvdbnGN166GzbNZyedoaALK6WI"
total_cost: 0.0314006
cache_discount: (null)
provider_name: "OpenAI"
created_at: "2025-04-16T18:59:33.783719+00:00"
model: "openai/o4-mini-high-2025-04-16"
app_id: 182717
streamed: true
cancelled: false
latency: 55446
moderation_latency: 106
generation_time: 58243
tokens_prompt: 68
tokens_completion: 7101
native_tokens_prompt: 66
native_tokens_completion: 7120
native_tokens_reasoning: 6784
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0314006
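The token counts and total_cost above can be cross-checked with a little arithmetic. The following is a minimal sketch, not NC Bench's or the provider's billing code: the per-million-token rates are assumptions that do not appear on this page, the function names are hypothetical, and it assumes reasoning tokens are counted inside native_tokens_completion.

```python
# Assumed rates in USD per million tokens. These are NOT stated on this page.
ASSUMED_INPUT_RATE = 1.10
ASSUMED_OUTPUT_RATE = 4.40


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the request cost in USD from native token counts."""
    return (
        prompt_tokens * ASSUMED_INPUT_RATE
        + completion_tokens * ASSUMED_OUTPUT_RATE
    ) / 1_000_000


def reasoning_share(reasoning_tokens: int, completion_tokens: int) -> float:
    """Fraction of completion tokens that were spent on reasoning."""
    return reasoning_tokens / completion_tokens


# Values copied from the metadata above.
cost = estimate_cost(prompt_tokens=66, completion_tokens=7120)
share = reasoning_share(reasoning_tokens=6784, completion_tokens=7120)

print(f"estimated cost: ${cost:.7f}")
print(f"reasoning share of completion tokens: {share:.2%}")
print(f"visible completion tokens: {7120 - 6784}")
```

Under these assumed rates the estimate agrees with the reported total_cost of 0.0314006, and roughly 95% of the completion tokens are reasoning tokens, leaving 336 tokens for the visible scene.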
Evaluation details
Matches word count: 100.00%
Details: n/a
Meta Data: neededClean: false, words: 200
Dialogue to Total Word Ratio: 63.25%
Details: Ratio: 81.77%, Deviation: 8.23%
Meta Data: neededClean: false, wordsTotal: 203, wordsDialogue: 166
Overall: 81.6268%
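The ratio and deviation reported above follow directly from the word counts. The sketch below shows one way such figures could be computed; it is an illustration under stated assumptions, not the evaluator's actual code. It assumes dialogue means text inside straight double quotes, that words are whitespace-separated tokens, and that the target ratio is the 90% from the scenario. All function names are hypothetical. How the 63.25% score is derived from the deviation is not shown on this page, so it is not reproduced here.

```python
import re


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words) for a scene.

    Assumption: dialogue is whatever sits between straight double quotes,
    and a word is any whitespace-separated token.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return dialogue, total


def ratio_and_deviation(
    dialogue_words: int, total_words: int, target: float = 0.90
) -> tuple[float, float]:
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = dialogue_words / total_words
    return ratio, abs(target - ratio)


# A tiny made-up sample, just to exercise the counter.
sample = 'Molly: "Go go go!" Bill nods.'
print(count_dialogue_words(sample))

# Word counts copied from the Meta Data above.
ratio, deviation = ratio_and_deviation(dialogue_words=166, total_words=203)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Assumption: the overall figure may be a plain mean of the evaluator scores.
mean_of_displayed_scores = (100.00 + 63.25) / 2
print(f"Mean of displayed scores: {mean_of_displayed_scores:.4f}%")
```

With 166 dialogue words out of 203, this reproduces the reported ratio of 81.77% and deviation of 8.23%. Averaging the two displayed scores gives 81.625%, close to the reported 81.6268%, which would be consistent with the overall being a mean of unrounded scores, though the page does not state this.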