NC Bench
Run: 2025-04-16-01JRZYSEBF9GCN76QE53B77NNT
Run Details
Model:
o4 Mini High
Test:
Dialogue tags
Scenario:
Write 200 words with 50% dialogue
Duration:
2m 25s
Cost:
$0.048090
Provider:
OpenAI
Evaluation summary
Score
Evaluator
100.00%
Matches word count
97.47%
Dialogue to Total Word Ratio
98.7362%
Overall
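The overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state how scores are combined, so equal weighting is an assumption here, and `overall_score` is a hypothetical helper. Because the displayed 97.47% is itself rounded, the result agrees with 98.7362% only to about two decimal places.

```python
# Hypothetical helper: assumes the overall score is the plain mean of the
# evaluator scores. The aggregation rule is not stated on the page.
def overall_score(scores):
    return sum(scores) / len(scores)

# Scores as displayed in the evaluation summary, in percent.
scores = [100.00, 97.47]
overall = overall_score(scores)

# Compare against the reported overall value, allowing for display rounding.
reported_overall = 98.7362
matches = abs(overall - reported_overall) < 0.01
print(matches)
```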
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Night wraps the quiet suburban backyard in shadows as Molly and Bill crouch behind a row of inflatable animals. Frog and unicorn shapes loom under fawn colored grass. Candles flicker inside the brightly lit house where the children shriek at clumsy clown tricks. Molly glances at Bill’s utility belt bristling with tiny saw blades and wires. Her grin reveals excitement and mild terror. Bill checks his watch, eyes darting. He inhales deeply and taps a soda can, reminding them of their plan. The fence cresting over his shoulder still feels spiky. They must be swift and cunning to snatch treasure. Molly: "Okay Bill we slip through the broken gate disguised as birthday clowns." Bill: "I brought juggling balls loaded with sleeping gas not for kids but for parents." Molly: "Nice. But the plan needs a distraction for the hyperactive Chihuahua named Snickers." Bill: "We rigged the speaker with a barking track and a leaking sprinkler to lure that yipping terror." Molly: "When do we grab gift bags without tumbling hot cocoa or tumbling into balloons?" Bill: "Cake cutting chaos is our perfect cover." Molly: "I noted pricey toys by bright bows." Bill: "We vanish with presents in neon bunny bandanas."
id
"gen-1744829201-iraAWxsPmqjQoF6f9kCI"
upstream_id
"chatcmpl-BN24Q9K8KEMePE1XLQ1kfIBFpjswB"
total_cost
0.0480898
cache_discount
(null)
provider_name
"OpenAI"
created_at
"2025-04-16T18:49:06.047136+00:00"
model
"openai/o4-mini-high-2025-04-16"
app_id
182717
streamed
true
cancelled
false
latency
140229
moderation_latency
80
generation_time
143739
tokens_prompt
68
tokens_completion
10894
native_tokens_prompt
66
native_tokens_completion
10913
native_tokens_reasoning
10624
num_media_prompt
(null)
num_media_completion
(null)
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0480898
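The metadata can be cross-checked with a little arithmetic. The sketch below assumes list prices of $1.10 per million input tokens and $4.40 per million output tokens, that reasoning tokens are billed as output tokens, and that `generation_time` is in milliseconds. None of these are stated on this page. `estimate_cost` is a hypothetical helper.

```python
# Values copied from the metadata above.
native_tokens_prompt = 66
native_tokens_completion = 10913
native_tokens_reasoning = 10624
total_cost = 0.0480898
generation_time = 143739  # assumed to be milliseconds

# Assumed prices in USD per million tokens; not stated on this page.
PRICE_IN_PER_M = 1.10
PRICE_OUT_PER_M = 4.40

def estimate_cost(prompt_tokens, completion_tokens):
    # Reasoning tokens are assumed to be included in completion_tokens.
    return (prompt_tokens * PRICE_IN_PER_M
            + completion_tokens * PRICE_OUT_PER_M) / 1_000_000

estimated = estimate_cost(native_tokens_prompt, native_tokens_completion)

# Share of completion tokens spent on reasoning rather than visible text.
reasoning_share = native_tokens_reasoning / native_tokens_completion
visible_tokens = native_tokens_completion - native_tokens_reasoning

seconds = generation_time / 1000

print(f"estimated cost: ${estimated:.7f} (reported ${total_cost:.7f})")
print(f"reasoning share: {reasoning_share:.1%}, visible tokens: {visible_tokens}")
print(f"generation time: {seconds:.1f} s")
```

Under these assumptions the estimate reproduces the reported `total_cost`, and most of the completion tokens are reasoning tokens rather than the 200-word scene itself.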
Evaluation details
Result
Evaluator
Details
Meta Data
100.00%
Matches word count
n/a
neededClean
false
words
200
97.47%
Dialogue to Total Word Ratio
Ratio: 46.00%, Deviation: 4.00%
neededClean
false
wordsTotal
200
wordsDialogue
92
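The ratio and deviation follow directly from `wordsDialogue` and `wordsTotal`. The sketch below also shows one plausible way to count dialogue words, assuming dialogue is the text between straight double quotes and that words are whitespace-separated. The evaluator's actual counting rules are not shown on the page, and the helper names are hypothetical. How the 4.00% deviation maps to the 97.47% score is not stated either, so the sketch stops at the deviation.

```python
import re

TARGET_RATIO = 0.50  # from the scenario: 50% dialogue

def count_words(text):
    # Whitespace-separated tokens; the evaluator's real tokenizer is unknown.
    return len(text.split())

def count_dialogue_words(text):
    # Assumes dialogue is whatever sits between straight double quotes.
    return sum(count_words(span) for span in re.findall(r'"([^"]*)"', text))

# Tiny made-up sample, not the model response from this run.
sample = 'Molly nods. Molly: "Okay Bill we go." Bill: "Fine."'
sample_dialogue = count_dialogue_words(sample)  # 5 quoted words
sample_total = count_words(sample)              # 9 words, speaker tags included

# Figures reported in the evaluator metadata.
words_dialogue = 92
words_total = 200
ratio = words_dialogue / words_total
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 46.00%, Deviation: 4.00%"
```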
98.7362%
Overall