NC Bench
Run: 2025-04-16-01JRZYYJSMJYFX7VG63CFX1ZM1
Run Details
Model: o4 Mini High
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 1m 26s
Cost: $0.034586
Provider: OpenAI
Evaluation summary
Score | Evaluator
100.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
50.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly and Bill crouch behind a rainbow unicorn bounce house under a cloudless afternoon sky. Streamers flutter and children shriek with sugary glee as a line of pastel balloons stretches toward a table groaning under presents wrapped in glitter and cartoon characters. The duo checks their wrist watches and exchanges nods over gloved hands. Bill grips a candy cane hook while Molly balances a toolbox full of miniature gadgets. They share a mischievous grin and rehearse a silent countdown. Soon they will transform this cheerful gathering into their own holiday heist stage, liberating every gift under the twinkling party lights. Molly: The cake table is a treasure trove of goodies. Bill: We swoop in disguised as entertainers vanish under chaos. Molly: I’ve rigged bags with magnets to snag shiny gifts. Bill: What about wrap paper alarms? We short circuit them. Molly: I’ll distract clown using confetti cannon, you extract loot. Bill: Parents will chase darts while we excavate gift mountain. Molly: Don’t swipe piñata loot until kids distracted by cake. Bill: We’ll load the loot into oversized party balloon prototype. Molly: Timing vital; adults swarm balloons, we slip away fast. Bill: Once airborne we vault presents into sister’s van nearby.
Meta
id: "gen-1744829369-p7zwaqTRGEryrQP1HoO8"
upstream_id: "chatcmpl-BN277IMI7mEph47ozHmQ6KgSyLS3A"
total_cost: 0.0345862
cache_discount: (null)
provider_name: "OpenAI"
created_at: "2025-04-16T18:50:55.655684+00:00"
model: "openai/o4-mini-high-2025-04-16"
app_id: 182717
streamed: true
cancelled: false
latency: 81717
moderation_latency: 141
generation_time: 85812
tokens_prompt: 68
tokens_completion: 7825
native_tokens_prompt: 66
native_tokens_completion: 7844
native_tokens_reasoning: 7552
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0345862
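The token counts above suggest that most of the completion was hidden reasoning rather than visible prose. The following is a minimal sketch of that breakdown. The token counts and reported cost are copied from the metadata; the per-million-token prices are an assumption (they are not stated anywhere on this page), as is the idea that reasoning tokens are included in native_tokens_completion and billed at the output rate.

```python
# Values copied from the run metadata.
NATIVE_TOKENS_PROMPT = 66
NATIVE_TOKENS_COMPLETION = 7844
NATIVE_TOKENS_REASONING = 7552
REPORTED_COST_USD = 0.0345862

# ASSUMPTION: prices in USD per million tokens. These are not part of the
# run page and are used here only to see whether the reported cost is
# consistent with a simple linear pricing model.
ASSUMED_INPUT_USD_PER_MTOK = 1.10
ASSUMED_OUTPUT_USD_PER_MTOK = 4.40


def estimate_cost(prompt_tokens, completion_tokens, in_price, out_price):
    """Linear cost estimate: tokens times price per million tokens."""
    return (prompt_tokens * in_price + completion_tokens * out_price) / 1_000_000


# Tokens left over for the visible answer once reasoning is subtracted.
visible_tokens = NATIVE_TOKENS_COMPLETION - NATIVE_TOKENS_REASONING

# Fraction of the completion spent on reasoning.
reasoning_share = NATIVE_TOKENS_REASONING / NATIVE_TOKENS_COMPLETION

estimated = estimate_cost(
    NATIVE_TOKENS_PROMPT,
    NATIVE_TOKENS_COMPLETION,
    ASSUMED_INPUT_USD_PER_MTOK,
    ASSUMED_OUTPUT_USD_PER_MTOK,
)

print(f"visible tokens:  {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"estimated cost:  ${estimated:.7f} (reported ${REPORTED_COST_USD:.7f})")
```

Under these assumed prices the estimate agrees with the reported total_cost, which is consistent with the reasoning tokens accounting for nearly all of the cost of this run.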
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false, words: 200
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 50.00% | neededClean: false, wordsTotal: 200, wordsDialogue: 0
50.0000% | Overall
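One possible reading of the wordsDialogue value of 0 is that the evaluator only counts words inside quotation marks, while the response marks speech with "Molly:" and "Bill:" speaker labels and no quotes. The sketch below illustrates that reading. It is not the benchmark's actual evaluator: the quote-matching rule, the function names, and the use of a plain mean for the overall score are all assumptions chosen to be consistent with the numbers reported above.

```python
import re

# ASSUMPTION: dialogue is any text enclosed in straight or curly double quotes.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text):
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_ratio(text):
    """Return the fraction of words that fall inside quotation marks."""
    total = count_words(text)
    if total == 0:
        return 0.0
    dialogue = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    return dialogue / total


# A line taken from the model response (script style, no quotation marks).
script_style = "Molly: The cake table is a treasure trove of goodies."
# The same line rewritten with quoted dialogue and a dialogue tag.
quoted_style = '"The cake table is a treasure trove of goodies," Molly said.'

TARGET = 0.5  # the scenario asks for 50% dialogue

for label, sample in [("script", script_style), ("quoted", quoted_style)]:
    ratio = dialogue_ratio(sample)
    print(f"{label}: ratio={ratio:.2%}, deviation={abs(ratio - TARGET):.2%}")

# ASSUMPTION: the overall score is the plain mean of the evaluator scores.
evaluator_scores = [100.0, 0.0]
overall = sum(evaluator_scores) / len(evaluator_scores)
print(f"overall: {overall:.4f}%")
```

With a quote-based rule, the script-style line contributes no dialogue words at all, which would explain a ratio of 0.00% and a deviation of 50.00% even though half of the response reads as speech.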