NC Bench
Run: 2025-04-14-01JRTSWFZMDENPNSVNG0QNBT4V
Run Details
Model: GPT-4.1 Nano
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 2.0s
Cost: $0.000123
Provider: OpenAI
Evaluation summary
Matches word count: 86.38%
Dialogue to Total Word Ratio: 98.24%
Overall: 92.3098%
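The Overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state how the scores are aggregated, so the sketch below is an assumption, and the function name `overall_score` is hypothetical. It uses the rounded scores as displayed, which is presumably why the result differs slightly from the 92.3098% shown.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores in percent (assumed aggregation)."""
    if not scores:
        raise ValueError("at least one score is required")
    return sum(scores) / len(scores)


# Rounded scores as displayed in the evaluation summary.
displayed_scores = [86.38, 98.24]
print(f"{overall_score(displayed_scores):.2f}%")
```

With the displayed scores this gives about 92.31%, within rounding distance of the reported Overall value.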
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly: "Okay, Bill, here’s the plan. We sneak in through the backyard, hide behind the bounce house, then grab the presents when no one’s looking." Bill: "Whoa, Molly, are you sure that’s foolproof? Little Timmy’s parties are crazier than a clown convention." Molly: "That’s why we wait until the balloon artist is in the taco stand. Distraction, Bill." Bill: "But what if the kids see us? Do we look like professional thieves?" Molly: "We look like kids at a party, that’s what. Just act natural. No sudden moves. Easy peasy." Bill: "Got it. Wait, what about the cake?" Molly: "Cake is a trap. Sticky, messy, and suspiciously tempting. We focus on the presents." Bill: "I dunno, Molly. I’ve never stolen from a kid’s party before. Think we’ll get away with it?" Molly: "Bill, relax. We've pulled off bigger heists. Remember the donut shop scare? This is easier." Bill: "Yeah, but that was just donuts. Here we’re talking about shiny toys, glitter, and tiny screaming kids." Molly: "Exactly. That’s what makes it fun. Come on, partner. Adventure calls." Bill: "You’ve got a plan, I’ve got my backpack. Let’s do this."
id: "gen-1744656286-YLD8NQCVejmpQwyZj4D0"
upstream_id: "chatcmpl-BMJ5SL5mLld9y2WInd3g7jrP5MFSe"
total_cost: 0.0001235
cache_discount: (null)
provider_name: "OpenAI"
created_at: "2025-04-14T18:44:48.104092+00:00" (4/14/2025, 6:44:48 PM)
model: "openai/gpt-4.1-nano-2025-04-14"
app_id: 182717
streamed: true
cancelled: false
latency: 246
moderation_latency: 171
generation_time: 768
tokens_prompt: 68
tokens_completion: 292
native_tokens_prompt: 67
native_tokens_completion: 292
native_tokens_reasoning: 0
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0001235
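The total_cost value can be cross-checked against the token counts. The sketch below assumes per-token list prices of $0.10 per million input tokens and $0.40 per million output tokens for this model, and assumes billing uses the native token counts. Neither assumption is stated on this page, and `estimate_cost` is a hypothetical helper.

```python
# Assumed prices in USD per 1M tokens (not taken from this page).
INPUT_USD_PER_MTOK = 0.10
OUTPUT_USD_PER_MTOK = 0.40


def estimate_cost(prompt_tokens, completion_tokens):
    """Estimate request cost in USD from token counts and the assumed prices."""
    total_micro_usd = (
        prompt_tokens * INPUT_USD_PER_MTOK
        + completion_tokens * OUTPUT_USD_PER_MTOK
    )
    return total_micro_usd / 1_000_000


# native_tokens_prompt = 67, native_tokens_completion = 292
cost = estimate_cost(67, 292)
print(f"${cost:.7f}")
```

Under these assumptions the estimate agrees with the reported total_cost of 0.0001235. Using tokens_prompt (68) instead of native_tokens_prompt (67) would give a slightly higher figure.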
Evaluation details
Matches word count: 86.38%
  Details: n/a
  neededClean: false
  words: 189
Dialogue to Total Word Ratio: 98.24%
  Details: Ratio: 93.65%, Deviation: 3.65%
  neededClean: false
  wordsTotal: 189
  wordsDialogue: 177
Overall: 92.3098%
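The ratio and deviation follow directly from the word counts: 177 / 189 is about 93.65%, which is 3.65 percentage points above the 90% target. The evaluator's implementation is not shown here, so the following is only a sketch of one way to obtain such counts. It assumes dialogue means text inside double quotes and that words are whitespace-separated; `dialogue_stats` is a hypothetical name.

```python
import re


def dialogue_stats(text, target_ratio=0.90):
    """Return (words_total, words_dialogue, ratio, deviation) for a scene.

    Assumption: dialogue is any text enclosed in double quotes.
    """
    # Treat curly double quotes the same as straight ones.
    normalised = text.replace("\u201c", '"').replace("\u201d", '"')
    words_total = len(normalised.split())
    quoted_spans = re.findall(r'"([^"]*)"', normalised)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = words_dialogue / words_total if words_total else 0.0
    return words_total, words_dialogue, ratio, abs(ratio - target_ratio)


sample = 'Molly: "Okay, Bill, here is the plan." Bill: "Got it."'
total, dialogue, ratio, deviation = dialogue_stats(sample)
print(total, dialogue, f"{ratio:.2%}", f"{deviation:.2%}")

# The counts reported in the evaluation details.
reported_ratio = 177 / 189
print(f"{reported_ratio:.2%}", f"{abs(reported_ratio - 0.90):.2%}")
```

In the model response, the 12 non-dialogue words correspond to the 12 speaker labels ("Molly:" and "Bill:"), which fits a quote-based count. How the 3.65% deviation maps to the 98.24% score is not documented on this page.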