NC Bench
Run: 2025-04-16-01JRZZ51M6AS53JRQFF3CSMN2M
Run Details
Model: o4 Mini
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 1m 8s
Cost: $0.023269
Provider: OpenAI
Evaluation summary
Score       Evaluator
100.00%     Matches word count
100.00%     Dialogue to Total Word Ratio
100.0000%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
A swollen punch bowl bubbled under a flickering ceiling light as Molly crouched behind a stack of party chairs. Her gloves brushed the edge of a tablecloth patterned with dancing giraffes. She tapped a small earpiece and signaled Bill. "Are the disguises ready? The cupcakes smell like impending disaster." "I’ve got glitter bombs and a loud squeaky toy jammer." He hovered near a tower of pastel cupcakes. Streamers fluttered in an unseen draft, casting weird shadows against the wall. A group of toddlers clustered around a magician in a floppy hat. Balloons wobbled. Every parent’s polite chatter became a background thrum beneath adrenaline pulses. Molly pictured catapulting presents into burlap sacks before anyone noticed. Bill’s expression was as serious as a clown preparing for a punchline. Then a distant squeak from a jack-in-the-box sent them both freezing. With slow deliberation they resumed creeping across the linoleum floor. Tiny plastic trinkets and rainbow ribbons lay scattered. A caterpillar balloon animal bobbed near the door handle. In every gift bag they glimpsed shiny paper begging to be pocketed. They shared a determined grin. Tonight’s haul would be legendary. With synchronized stealth they slid the door open and vanished into the laughing throng.
id: "gen-1744829581-Vr6AMus9tSgV3vhjmQRw"
upstream_id: "chatcmpl-BN2AXZLiYcqrRGWgD3o6SzghVuDYT"
total_cost: 0.0232694
cache_discount: (null)
provider_name: "OpenAI"
created_at: "2025-04-16T18:54:09.334404+00:00"
model: "openai/o4-mini-2025-04-16"
app_id: 182717
streamed: true
cancelled: false
latency: 63578
moderation_latency: 106
generation_time: 67335
tokens_prompt: 68
tokens_completion: 5253
native_tokens_prompt: 66
native_tokens_completion: 5272
native_tokens_reasoning: 4992
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0232694
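The token counts in the metadata above suggest that most of the completion was spent on reasoning rather than on the visible 200-word scene. The following is a rough sketch of that arithmetic. It assumes that `native_tokens_reasoning` is included in `native_tokens_completion` and that `generation_time` is in milliseconds; neither is stated on this page, although 67335 ms is close to the reported duration of 1m 8s.

```python
# Values copied from the run metadata above.
native_tokens_completion = 5272
native_tokens_reasoning = 4992
generation_time_ms = 67335  # assumption: the unit is milliseconds

# Assumption: reasoning tokens are counted inside the completion total.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under those assumptions, about 280 tokens correspond to the visible response and the rest to reasoning.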
Evaluation details
Result      Evaluator                      Details                          Meta Data
100.00%     Matches word count             n/a                              neededClean: false, words: 200
100.00%     Dialogue to Total Word Ratio   Ratio: 9.85%, Deviation: 0.15%   neededClean: false, wordsTotal: 203, wordsDialogue: 20
100.0000%   Overall
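The reported ratio and deviation can be reproduced from the metadata counts (`wordsDialogue` = 20, `wordsTotal` = 203) and the 10% target in the scenario. This is a minimal sketch: the function name `dialogue_ratio` is hypothetical, and how the evaluator turns the deviation into a score is not shown on this page.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of all words."""
    return 100.0 * words_dialogue / words_total

TARGET_PERCENT = 10.0  # from the scenario: "200 words with 10% dialogue"

ratio = dialogue_ratio(20, 203)
deviation = abs(TARGET_PERCENT - ratio)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 9.85%, Deviation: 0.15%"
```

The two evaluators report different totals (200 and 203 words), which suggests they count words differently; the sketch uses the 203 figure because that is the one attached to the ratio evaluator.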