NC Bench
Run: 2025-04-16-01JRZYWYVMMG3NNMHKJ09WEYQV
Run Details
Model: o4 Mini High
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 1m 57s
Cost: $0.052833
Provider: OpenAI
Evaluation summary
Matches word count: 100.00%
Dialogue to Total Word Ratio: 53.31%
Overall: 76.6566%
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill crouch behind a rosebush near the garden gate. Streamers flutter above them in the moonlight. Molly: "Check the address on the invitation again, Bill." Bill: "Second house on the left, bright pink balloons out front." Molly: "Perfect. Kids down the block distracted by pinata." Bill: "Time to don our disguises and snatch those presents." Molly: "You got balloons? We need decoys for our getaway." Bill: "I’ve got five red ones tied to my belt." Molly: "Brilliant. Parents always watch kids, never clowns." Bill: "I’m the quiet clown. You’re the one with jokes." Molly: "I’ll crack jokes at the cake table distraction." Bill: "I'll slip past twisting streamers and reach the gifts." Molly: "Keep your glove on. No fingerprints on the wrapping." Bill: "My gloves are neon to match my shoes." Molly: "Make me laugh while I disable the security keypad." Bill: "Ha! Code hack successful. Table unlocked in five." Molly: "Grab the big purple box with the gold bow." Bill: "I’m stuffing it into my magic bag." Molly: "Parents heading our way, hurry!" Bill: "On my mark we dash swiftly through the garden." Molly: "We vanish like birthday candles in a gust." Bill: "Operation Gift Grab's a success!"
id: "gen-1744829316-KbfzCzEaPQTMMkrmGL8e"
upstream_id: "chatcmpl-BN26GGeMaXlDrk76Oq5ga79YRI3wr"
total_cost: 0.052833
cache_discount: (null)
provider_name: "OpenAI"
created_at: "2025-04-16T18:50:33.062365+00:00"
model: "openai/o4-mini-high-2025-04-16"
app_id: 182717
streamed: true
cancelled: false
latency: 113848
moderation_latency: 90
generation_time: 115951
tokens_prompt: 68
tokens_completion: 11973
native_tokens_prompt: 66
native_tokens_completion: 11991
native_tokens_reasoning: 11648
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.052833
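The token and timing fields above support a few derived figures. The following is a minimal sketch, not part of the benchmark itself. It assumes that generation_time is reported in milliseconds and that the model is billed at 1.10 USD per million input tokens and 4.40 USD per million output tokens. Neither the unit nor the prices appear in this run record, so both are labeled as assumptions in the code.

```python
# Values copied from the Meta fields of this run.
native_tokens_prompt = 66
native_tokens_completion = 11991
native_tokens_reasoning = 11648
generation_time_ms = 115951  # assumption: the field is in milliseconds
total_cost = 0.052833

# Assumption: list prices in USD per million tokens (not stated in this record).
PRICE_IN_PER_M = 1.10
PRICE_OUT_PER_M = 4.40

# Completion tokens that were not spent on hidden reasoning.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Fraction of the completion budget consumed by reasoning.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Throughput over the whole generation, reasoning included.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Cost estimate under the assumed prices.
estimated_cost = (
    native_tokens_prompt * PRICE_IN_PER_M
    + native_tokens_completion * PRICE_OUT_PER_M
) / 1_000_000

print(f"visible tokens:    {visible_tokens}")
print(f"reasoning share:   {reasoning_share:.2%}")
print(f"tokens per second: {tokens_per_second:.1f}")
print(f"estimated cost:    ${estimated_cost:.6f} (reported: ${total_cost:.6f})")
```

Under these assumptions the estimate agrees with the reported total_cost, and almost all completion tokens went to reasoning rather than to the roughly 200-word scene.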
Evaluation details
Matches word count
Result: 100.00%
Details: n/a
Meta Data: neededClean: false, words: 201
Dialogue to Total Word Ratio
Result: 53.31%
Details: Ratio: 81.09%, Deviation: 8.91%
Meta Data: neededClean: false, wordsTotal: 201, wordsDialogue: 163
Overall: 76.6566%
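The reported ratio and deviation follow directly from the word counts: 163 dialogue words out of 201 total, measured against the 90% target in the scenario. The sketch below reproduces that arithmetic. The helper names are hypothetical, and the counting rules (whitespace-separated words, dialogue taken as text between straight double quotes) are assumptions about how such an evaluator might work, not the benchmark's confirmed method. The mapping from an 8.91% deviation to the 53.31% score is not given in this report, so it is not modeled here.

```python
import re

TARGET_RATIO = 0.90  # from the scenario: "Write 200 words with 90% dialogue"


def count_words(text: str) -> int:
    """Count whitespace-separated words (assumed counting rule)."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue rule)."""
    return sum(count_words(span) for span in re.findall(r'"([^"]*)"', text))


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_RATIO) -> tuple[float, float]:
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(target - ratio)


# Word counts copied from the Meta Data of this run.
ratio, deviation = ratio_and_deviation(163, 201)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# The overall score is consistent with a plain mean of the two evaluator
# scores; the small gap to 76.6566 suggests the mean uses unrounded values.
overall = (100.00 + 53.31) / 2
print(f"Mean of evaluator scores: {overall:.4f}%")
```

For example, `dialogue_words('Molly: "Go now." Bill nods.')` counts 2 dialogue words out of 5 total under these rules, since the speaker label and the narration fall outside the quotes.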