NC Bench
Run: 2025-04-16-01JRZZ3P3X19MXZGPECTNCWC03
Run Details
Model: o4 Mini High
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 1m 54s
Cost: $0.050479
Provider: OpenAI
Evaluation summary
Score | Evaluator
100.00% | Matches word count
36.79% | Dialogue to Total Word Ratio
68.3940% | Overall
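The overall figure looks like a plain average of the two evaluator scores. The sketch below checks that reading, assuming an unweighted mean; the function name `overall_score` is hypothetical and the benchmark's actual aggregation code is not shown on this page.

```python
from statistics import mean

# Evaluator scores as displayed in the summary (percent).
evaluator_scores = {
    "Matches word count": 100.00,
    "Dialogue to Total Word Ratio": 36.79,
}

def overall_score(scores: dict[str, float]) -> float:
    """Unweighted mean of evaluator scores (an assumption, not confirmed by the page)."""
    return mean(scores.values())

overall = overall_score(evaluator_scores)
print(f"{overall:.4f}%")
```

Averaging the displayed values gives roughly 68.395, which differs from the reported 68.3940% only in the last digits. That gap would be consistent with the site averaging unrounded scores, though this is an inference.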
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill crouch behind a dense hedge near the brightly lit house. Tinsel glints around the front window quietly. Molly: "Okay, Bill, what's the plan for tonight?" Bill: "We sneak through the back gate, disable the alarm, then slip into the living room." Molly: "Back gate creaks like a haunted mansion. We need a distraction." Bill: "I'll hire a mechanical kitten puppet and send it in first." Molly: "Everyone loves kittens. They coo and look away." Bill: "While they're distracted we scoop up the presents." Molly: "Don't forget the confetti cannon." Bill: "Right. Sparkly chaos equals perfect cover." Molly: "What about the clown in the corner?" Bill: "I'll dress as the birthday mime. No one suspects silent balloon." Molly: "Brilliant. While you're miming I'll slip past the cake table." Bill: "Got a grappling hook hidden in my pocket." Molly: "After the swap signal we race to the getaway van." Bill: "Parents will check morning. Too late by dawn." Molly: "I'm nervous. Kids or sugar highs equal chaos." Bill: "Relax. Stick to the script and we'll be golden." Molly: "Okay. Let's move before the piñata starts." Bill: "Remember stealth over speed. Ninja vibes." Molly: "Got it. Shh." Bill: "Go quick."
id: "gen-1744829536-ziLisALvEuNJMgYW431r"
upstream_id: "chatcmpl-BN29oxeIzVS5bg3Pn85xoBLaU9S5q"
total_cost: 0.050479
cache_discount: (null)
provider_name: "OpenAI"
created_at: "2025-04-16T18:54:10.872136+00:00" (4/16/2025, 6:54:10 PM)
model: "openai/o4-mini-high-2025-04-16"
app_id: 182717
streamed: true
cancelled: false
latency: 110617
moderation_latency: 133
generation_time: 113410
tokens_prompt: 68
tokens_completion: 11447
native_tokens_prompt: 66
native_tokens_completion: 11456
native_tokens_reasoning: 11136
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.050479
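The cost and timing fields can be cross-checked against the token counts. The sketch below is a rough sanity check under stated assumptions: the per-token prices ($1.10 per million input tokens, $4.40 per million output tokens) are not given on this page and are assumed here, `generation_time` is assumed to be in milliseconds (it agrees with the 1m 54s duration), and the helper names are hypothetical.

```python
# Assumed prices in USD per million tokens; not stated anywhere in this run report.
INPUT_USD_PER_MTOK = 1.10
OUTPUT_USD_PER_MTOK = 4.40

# Values copied from the run metadata.
native_tokens_prompt = 66
native_tokens_completion = 11456
native_tokens_reasoning = 11136
generation_time_ms = 113410  # assumed to be milliseconds

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate cost in USD from native token counts and the assumed prices."""
    return (prompt_tokens * INPUT_USD_PER_MTOK
            + completion_tokens * OUTPUT_USD_PER_MTOK) / 1_000_000

estimated_cost = estimate_cost(native_tokens_prompt, native_tokens_completion)

# Completion tokens that are not reasoning tokens, i.e. the visible answer.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Output throughput over the whole generation, reasoning included.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"estimated cost: ${estimated_cost:.6f}")
print(f"visible completion tokens: {visible_tokens}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumptions the estimate lands on the reported `total_cost`, and it suggests that nearly all of the spend went to reasoning tokens rather than to the 200-word answer.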
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false, words: 200
36.79% | Dialogue to Total Word Ratio | Ratio: 80.00%, Deviation: 10.00% | neededClean: false, wordsTotal: 200, wordsDialogue: 160
68.3940% | Overall | |
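The ratio and deviation in the details can be reproduced from the word counts. The sketch below assumes that words inside straight double quotes count as dialogue and everything else (narration and speaker tags such as `Molly:`) does not; the function names are hypothetical. The final scoring curve is a guess: the reported 36.79% matches exp(-1) to two decimals, so `exp(-deviation / 0.10)` is one formula consistent with this single data point, not the benchmark's documented method.

```python
import math
import re

TARGET_RATIO = 0.90  # from the scenario: "90% dialogue"

def dialogue_word_counts(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words), treating quoted spans as dialogue."""
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return dialogue, total

def ratio_and_deviation(dialogue: int, total: int) -> tuple[float, float]:
    """Dialogue share of all words and its absolute distance from the target."""
    ratio = dialogue / total
    return ratio, abs(ratio - TARGET_RATIO)

def guessed_score(deviation: float, scale: float = 0.10) -> float:
    """Hypothetical scoring curve; chosen only because it fits the reported score."""
    return math.exp(-deviation / scale)

# Counts reported in the evaluation details.
ratio, deviation = ratio_and_deviation(160, 200)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
print(f"Guessed score: {guessed_score(deviation):.2%}")

# Small usage example on the last two lines of the response.
sample = 'Molly: "Got it. Shh." Bill: "Go quick."'
sample_counts = dialogue_word_counts(sample)
```

With 160 dialogue words out of 200, the ratio is 80% and the deviation from the 90% target is 10 percentage points, as reported. The closing sample shows how speaker tags pull the ratio down: two of its seven whitespace-separated words are tags.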