NC Bench
Run: 2025-04-16-01JRZYT4CTZS9PWE9HQW1B09HG
Run Details
Model:
o4 Mini
Test:
Dialogue tags
Scenario:
Write 200 words with 50% dialogue
Duration:
36.1s
Cost:
$0.014126
Provider:
OpenAI
Evaluation summary
Matches word count: 99.98%
Dialogue to Total Word Ratio: 88.29%
Overall: 94.1370%
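The two evaluators above depend on how words are counted, which this page does not document. The following is a minimal sketch of one plausible approach, assuming that a word is a run of letters or digits (optionally joined by apostrophes) and that dialogue is any text between straight double quotes. The names `count_words` and `count_dialogue_words` are hypothetical, and the real evaluator may tokenize differently.

```python
import re

# ASSUMPTION: a word is a run of letters/digits, optionally joined by
# straight or curly apostrophes (so "I’ll" is one word). Words joined by a
# dash or hyphen are counted separately under this rule.
WORD = re.compile(r"[A-Za-z0-9]+(?:['’][A-Za-z0-9]+)*")

# ASSUMPTION: dialogue is whatever sits between straight double quotes.
QUOTED = re.compile(r'"([^"]*)"')


def count_words(text: str) -> int:
    """Count all words in the text."""
    return len(WORD.findall(text))


def count_dialogue_words(text: str) -> int:
    """Count only the words that appear inside double-quoted spans."""
    return sum(count_words(span) for span in QUOTED.findall(text))


sample = 'Molly nodded. Molly: "Goat?" Bill: "Chaos is our cover."'
print(count_words(sample), count_dialogue_words(sample))
```

Note that under this rule the speaker labels (`Molly:`, `Bill:`) count toward the total but not toward the dialogue, which lowers the measured ratio for script-style responses.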
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Narrow halls of the community center echoed with muffled laughter and squeals. A disco ball overhead spun glitter across pastel walls decorated with streamers and cartoon characters. A table weighed down by neatly wrapped boxes towered in one corner beside a sleepy DJ booth. As two figures in black breezed past a stack of folding chairs. Their gloves slid against wall tiles. Plans whispered between them sounded like evil lullabies. The smell of frosting drifted through the air, along with spilled punch and confetti. With practiced stealth they approached a window overlooking the party. Their eyes glinted with gleeful mischief. Molly: "The birthday hats are a problem." Bill: "I’ll distract the clown with fake magic." Molly: "You? Last time you turned your sleeve into a dove trap." Bill: "Trust me, this dove is in on it." Molly: "Fine. When I give the signal—three knocks—we dash." Bill: "Got it. Two minutes and I’ll release the giggling goat." Molly: "Goat?" Bill: "Everyone expects a petting zoo. Chaos is our cover." Molly: "Then grab the big box under the table." Bill: "Only if you promise not to eat the cupcakes again." Molly: "Deal." Bill: "Let’s make this the sweetest heist ever." Bill: "Ready?"
id
"gen-1744829223-39z9ay4sN72EKRPLqZ3I"
upstream_id
"chatcmpl-BN24l99mBHSof8VMnRP3ubNp6RFwE"
total_cost
0.0141262
cache_discount
(null)
provider_name
"OpenAI"
created_at
"2025-04-16T18:47:39.809227+00:00"
model
"openai/o4-mini-2025-04-16"
app_id
182717
streamed
true
cancelled
false
latency
32484
moderation_latency
265
generation_time
35540
tokens_prompt
68
tokens_completion
3175
native_tokens_prompt
66
native_tokens_completion
3194
native_tokens_reasoning
2880
num_media_prompt
(null)
num_media_completion
(null)
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0141262
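The metadata above is enough to sanity-check the reported cost and to see where the tokens went. The sketch below assumes per-token list prices of $1.10 per million input tokens and $4.40 per million output tokens for this model; those prices are not stated on this page and are an assumption. It also assumes `generation_time` is in milliseconds, which is consistent with the 36.1s duration shown in the run details.

```python
# Token counts copied from the run metadata.
NATIVE_TOKENS_PROMPT = 66
NATIVE_TOKENS_COMPLETION = 3194
NATIVE_TOKENS_REASONING = 2880
GENERATION_TIME_MS = 35540  # ASSUMPTION: unit is milliseconds

# ASSUMPTION: USD per million tokens; not stated anywhere on this page.
PRICE_IN_PER_M = 1.10
PRICE_OUT_PER_M = 4.40


def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  price_in: float = PRICE_IN_PER_M,
                  price_out: float = PRICE_OUT_PER_M) -> float:
    """Estimate cost in USD from native token counts and per-million prices."""
    return (prompt_tokens * price_in + completion_tokens * price_out) / 1_000_000


cost = estimate_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION)
reasoning_share = NATIVE_TOKENS_REASONING / NATIVE_TOKENS_COMPLETION
tokens_per_second = NATIVE_TOKENS_COMPLETION / (GENERATION_TIME_MS / 1000)

print(f"estimated cost:  ${cost:.7f}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput:      {tokens_per_second:.1f} tokens/s")
```

Under these assumptions the estimate agrees with the reported `total_cost` of 0.0141262, and roughly nine tenths of the completion tokens are reasoning tokens rather than visible output.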
Evaluation details
Matches word count: 99.98%
Details: n/a
Meta Data: neededClean = false, words = 202
Dialogue to Total Word Ratio: 88.29%
Details: Ratio: 44.06%, Deviation: 5.94%
Meta Data: neededClean = false, wordsTotal = 202, wordsDialogue = 89
Overall: 94.1370%
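The ratio and deviation reported above follow directly from the two word counts, given the 50% target in the scenario. The sketch below reproduces that arithmetic. It also checks whether the overall score is consistent with an unweighted mean of the two evaluator scores; that aggregation rule is an assumption, and the page does not say how the 88.29% score is derived from the deviation, so that step is not attempted here.

```python
WORDS_TOTAL = 202
WORDS_DIALOGUE = 89
TARGET_PERCENT = 50.0  # from the scenario: "200 words with 50% dialogue"


def dialogue_ratio_percent(words_dialogue: int, words_total: int) -> float:
    """Share of all words that are dialogue, as a percentage."""
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_percent: float, target_percent: float) -> float:
    """Absolute distance from the target ratio, in percentage points."""
    return abs(target_percent - ratio_percent)


ratio = dialogue_ratio_percent(WORDS_DIALOGUE, WORDS_TOTAL)
deviation = deviation_from_target(ratio, TARGET_PERCENT)

# ASSUMPTION: the overall score is an unweighted mean of the evaluator scores.
# The displayed scores are rounded to two decimals, so only an approximate
# match with the reported 94.1370% can be expected.
mean_score = (99.98 + 88.29) / 2

print(f"ratio:      {ratio:.2f}%")
print(f"deviation:  {deviation:.2f}%")
print(f"mean score: {mean_score:.3f}%")
```

The ratio and deviation round to the reported 44.06% and 5.94%. The mean of the displayed scores lands within rounding distance of the reported overall value, which supports the unweighted-mean reading without confirming it.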