NC Bench
Run: 2025-04-16-01JRZZ52GEGZDRFDB4RHHB94QY
Run Details
Model: o4 Mini
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 1m 56s
Cost: $0.048838
Provider: OpenAI
Evaluation summary
Matches word count: 100.00%
Dialogue to Total Word Ratio: 100.00%
Overall: 100.0000%
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly tapped the mirror. Lights and laughter spilled from the house. Bill checked his gloves and watched party photos. "Ready to nab every teddy and puzzle kit?" she whispered. "Born ready. I even packed the stealth glitter," Bill grinned. Molly raised an eyebrow. "Stealth glitter?" He winked. "It’s what criminals use to confuse kids. And parents." The duo crept to the door and twisted the knob. The lock clicked. Muffled giggles drifted out. "Why do kids laugh when they don’t see a face?" Molly wondered. "Pure innocence. Also echo chamber acoustics," Bill said. They entered. Shiny gifts lay on a table. Molly stuffed plush dinosaurs while Bill plucked remote-control cars. Molly checked her watch. "Cake’s gone in ten seconds; then parents swoop," she warned. Bill tapped the table. "I’ve got one more trick up my sleeve." "Got enough loot?" she asked softly. "About two hundred bucks’ worth if they’re lucky brands," he replied. Siren lights flashed. They froze. "Abort mission?" Molly hissed. Bill shrugged. "Let’s swap to plan B: distraction glitter bomb." Molly: "What about the piñata candy? That’s where the real treasure waits." Bill: "I saved it for plan C candy-cannon surprise." They vanished on a comet of glitter.
id: "gen-1744829582-VqgUnmBPeNu51Elqz46e"
upstream_id: "chatcmpl-BN2AYdkal1A9J9JiD2byCSaUYMhFO"
total_cost: 0.0488378
cache_discount: (null)
provider_name: "OpenAI"
created_at: "2025-04-16T18:54:58.316243+00:00"
model: "openai/o4-mini-2025-04-16"
app_id: 182717
streamed: true
cancelled: false
latency: 112434
moderation_latency: 93
generation_time: 115635
tokens_prompt: 68
tokens_completion: 11064
native_tokens_prompt: 66
native_tokens_completion: 11083
native_tokens_reasoning: 10752
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0488378
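The token and timing fields above can be cross-checked with a little arithmetic. The sketch below is only illustrative: it assumes `generation_time` is in milliseconds and that the page's "Duration" is a rounded form of it, neither of which the page states. The helper name `format_duration` is hypothetical.

```python
# Values copied from the run metadata above.
native_tokens_completion = 11083
native_tokens_reasoning = 10752
generation_time = 115635  # assumed to be milliseconds


def format_duration(ms: int) -> str:
    """Render a millisecond count as 'Xm Ys', rounded to the nearest second."""
    minutes, seconds = divmod(round(ms / 1000), 60)
    return f"{minutes}m {seconds}s"


# Completion tokens that were not spent on reasoning.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Fraction of the completion budget spent on reasoning.
reasoning_share = native_tokens_reasoning / native_tokens_completion

print(visible_tokens)
print(f"{reasoning_share:.1%}")
print(format_duration(generation_time))
```

Under these assumptions, nearly all completion tokens went to reasoning, and the rounded generation time agrees with the 1m 56s duration shown in the run details.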
Evaluation details
Matches word count: 100.00%
Details: n/a
neededClean: false
words: 200

Dialogue to Total Word Ratio: 100.00%
Details: Ratio: 50.00%, Deviation: 0.00%
neededClean: false
wordsTotal: 202
wordsDialogue: 101

Overall: 100.0000%
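The page does not show how the evaluators count words, so the following is only a minimal sketch of one way a dialogue-to-total ratio could be computed. It assumes dialogue is any text between straight double quotes and that words are whitespace-separated tokens. The function names and the 0.5 target default are hypothetical, and the sketch does not try to reproduce the exact counts reported above (note that the two evaluators report 200 and 202 words for the same response, so their counting rules evidently differ).

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a passage.

    Assumption: dialogue is delimited by straight double quotes and
    words are whitespace-separated tokens.
    """
    total_words = len(text.split())
    spoken = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(segment.split()) for segment in spoken)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio: float, target: float = 0.5) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


sample = 'Molly tapped the mirror. "Ready to go?" she whispered. "Born ready," Bill grinned.'
dialogue_words, total_words, ratio = dialogue_ratio(sample)
print(dialogue_words, total_words, f"{ratio:.2%}")

# The counts reported on this page: 101 dialogue words out of 202.
print(f"{deviation_from_target(101 / 202):.2%}")
```

With the reported counts, 101 of 202 words gives a ratio of exactly 50% and a deviation of 0%, consistent with the details line above.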