NC Bench
Run: 2025-04-16-01JRZYXCHS5HGNT8BD6AFACYAF
Run Details
Model: o4 Mini High
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 1m 42s
Cost: $0.054615
Provider: OpenAI
Evaluation summary
Score      Evaluator
100.00%    Matches word count
100.00%    Dialogue to Total Word Ratio
100.0000%  Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind an inflatable castle wall, surveying a party full of sugar-fueled mayhem. Balloons danced above glittered hats. She clutched her crowbar like a conductor’s baton. Across the yard Bill crouched by a stack of gleaming gift boxes. He adjusted his night vision goggles with exaggerated care. The DJ thundered a polka remix. Nearby, toddlers performed a conga line around a table of cupcakes. She and Bill wore matching lederhosen, wigs a size too small. Nobody noticed two thieves plotting under colorful streamers. A birthday song blared just as they exchanged silent nods. Tonight’s loot was finally within reach. "Molly: The cake distraction is set. Kids flock around frosting." "Bill: Good. You hit the clown with the cream pie at minute two. I’ll grab the big bags." "Molly: What about the toddlers? They might squeal." "Bill: Feed them lollipops. They’ll follow sugar to the backyard." "Molly: Then we scoop the presents and vanish in the fog machine smoke." "Bill: Grab all ribbons and stuff them in duffels. No strip of wrapping stays behind." "Molly: Got it. DJ has our hack queued to play our exit tune." "Bill: Two minutes to landing zone. Countdown starts… now." "Molly: Then we go."
id: "gen-1744829330-zdzecoxXZT8S7ByqcScV"
upstream_id: "chatcmpl-BN26UC2Ph3SM6lU1i4S1wBAsUGJLG"
total_cost: 0.054615
cache_discount: (null)
provider_name: "OpenAI"
created_at: "2025-04-16T18:50:32.336164+00:00" (4/16/2025, 6:50:32 PM)
model: "openai/o4-mini-high-2025-04-16"
app_id: 182717
streamed: true
cancelled: false
latency: 99020
moderation_latency: 98
generation_time: 101695
tokens_prompt: 68
tokens_completion: 12377
native_tokens_prompt: 66
native_tokens_completion: 12396
native_tokens_reasoning: 12096
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.054615
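The cost and duration figures can be cross-checked against the token counts and timings in the metadata. The sketch below is only a sanity check under stated assumptions: the per-million-token prices are not given anywhere on this page and are assumed here, the timing fields are assumed to be in milliseconds, and the idea that the displayed duration is generation_time rounded to whole seconds is a guess. The helper names estimate_cost and format_duration are hypothetical.

```python
# Values copied from the run metadata.
native_tokens_prompt = 66
native_tokens_completion = 12396
native_tokens_reasoning = 12096
generation_time_ms = 101695  # ASSUMPTION: the field is in milliseconds
reported_cost = 0.054615

# ASSUMPTION: USD prices per million tokens; not stated in the report.
PRICE_IN_PER_M = 1.10
PRICE_OUT_PER_M = 4.40


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the request cost in USD from native token counts."""
    return (prompt_tokens * PRICE_IN_PER_M
            + completion_tokens * PRICE_OUT_PER_M) / 1_000_000


def format_duration(ms: int) -> str:
    """Render milliseconds as 'Xm Ys', rounded to the nearest second."""
    minutes, seconds = divmod(round(ms / 1000), 60)
    return f"{minutes}m {seconds}s"


cost = estimate_cost(native_tokens_prompt, native_tokens_completion)
# Completion tokens that were not spent on reasoning.
visible_tokens = native_tokens_completion - native_tokens_reasoning
# Completion throughput over the whole generation window.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"estimated cost: ${cost:.6f} (reported: ${reported_cost:.6f})")
print(f"duration: {format_duration(generation_time_ms)}")
print(f"visible completion tokens: {visible_tokens}")
```

Under these assumed prices the estimate agrees with the reported total_cost, and almost all of it comes from reasoning tokens, since only 300 of the 12396 native completion tokens fall outside native_tokens_reasoning.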
Evaluation details
Result     Evaluator                     Details                          Meta Data
100.00%    Matches word count            n/a                              neededClean: false; words: 200
100.00%    Dialogue to Total Word Ratio  Ratio: 49.75%, Deviation: 0.25%  neededClean: false; wordsTotal: 201; wordsDialogue: 100
100.0000%
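The ratio and deviation reported for the dialogue evaluator follow directly from the wordsDialogue and wordsTotal counts. The sketch below reproduces that arithmetic and adds a rough dialogue word counter. The counter is an assumption, not the evaluator's actual method: it treats anything between straight double quotes as dialogue and splits words on whitespace, so it may not reproduce the counts above exactly. The function names are hypothetical.

```python
import re


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = 0.5) -> tuple[float, float]:
    """Return the dialogue share and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(target - ratio)


def count_dialogue_words(text: str) -> int:
    """Count whitespace-separated words inside straight double quotes.

    ASSUMPTION: this quoting and tokenization rule is illustrative only;
    the benchmark's real rule is not described in the report.
    """
    spoken = re.findall(r'"([^"]*)"', text)
    return sum(len(segment.split()) for segment in spoken)


# Counts copied from the evaluation meta data.
ratio, deviation = ratio_and_deviation(words_dialogue=100, words_total=201)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 49.75%, Deviation: 0.25%"

sample = 'He waved. "Grab the bags." She nodded. "On it."'
print(count_dialogue_words(sample))  # prints 5
```

The two evaluators report different totals (200 words versus 201 wordsTotal), which suggests they tokenize the text differently. The report does not say how a 0.25% deviation maps to the 100.00% score.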