NC Bench
Model: GPT-4.1 Nano
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Run ID: 2025-04-14-01JRTSSCE7ACCC3ZDGYPQVP45N
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly: Okay, Bill, hear me out. We’re crashing that kids’ party tomorrow. Bill: The one at the Johnsons’ house? Why? There’s probably like twenty little monsters running around. Molly: Exactly. No parents, no security. It’s the perfect cover. Plus, the presents will be sitting out in the open. Bill: Yeah, but how do we get in? I don’t want to be the kids’ new favorite villains. Molly: Simple. We dress as clowns. Kids love clowns, right? Confuse them so much they forget what’s happening. Bill: That’s brilliant—scare them into compliance. I’ve got a clown wig and red nose. Do you have face paint? Molly: Got it. I’ll do the hair. Glasses, oversized shoes... We’ll be unrecognizable. Just don’t laugh too hard when the party starts. Bill: Oh, I will. And what about the back door? We need an entry point. Molly: I saw the delivery window on the side. It’s unlocked after five. We slip in, grab the presents, and slip out. Bill: Easy peasy. I can already hear the kids screaming, ‘The clowns are here!’ Just hope they’re more scared of losing their loot than us. Molly: They’ll never see us coming, Bill. Quiet as mice, dressed as circus performers. Bill: Operation Kids’ Present Raid, beginning tomorrow. Let’s do this.
id: "gen-1744656184-0vFeo0X0OOXJj1kHFhvH"
upstream_id: "chatcmpl-BMJ3oaLA3VPeHNskwAM2qU5Xnxaqw"
total_cost: 0.0001311
cache_discount: (null)
provider_name: "OpenAI"
created_at: "2025-04-14T18:43:14.565944+00:00" (Monday, April 14th, 2025 at 6:43:14 PM GMT+00:00)
model: "openai/gpt-4.1-nano-2025-04-14"
app_id: 182717
streamed: true
cancelled: false
latency: 360
moderation_latency: 212
generation_time: 8878
tokens_prompt: 68
tokens_completion: 310
native_tokens_prompt: 67
native_tokens_completion: 311
native_tokens_reasoning: 0
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0001311
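The token counts, generation time, and cost in the metadata above can be cross-checked with a little arithmetic. The sketch below is only illustrative. It assumes that generation_time is in milliseconds and that the model is billed at 0.10 USD per million prompt tokens and 0.40 USD per million completion tokens; neither the unit nor the prices are stated in this record. Under those assumptions the native token counts reproduce the recorded total_cost.

```python
# Values copied from the run metadata.
native_tokens_prompt = 67
native_tokens_completion = 311
generation_time_ms = 8878  # assumption: the unit is milliseconds
recorded_total_cost = 0.0001311

# Assumed prices in USD per million tokens (not stated in this record).
PRICE_IN_PER_M = 0.10
PRICE_OUT_PER_M = 0.40


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimated cost in USD under the assumed per-million-token prices."""
    weighted = prompt_tokens * PRICE_IN_PER_M + completion_tokens * PRICE_OUT_PER_M
    return weighted / 1_000_000


def tokens_per_second(completion_tokens: int, time_ms: float) -> float:
    """Completion throughput, treating time_ms as milliseconds."""
    return completion_tokens / (time_ms / 1000)


estimated_cost = estimate_cost(native_tokens_prompt, native_tokens_completion)
throughput = tokens_per_second(native_tokens_completion, generation_time_ms)

print(f"estimated cost: {estimated_cost:.7f} USD (recorded: {recorded_total_cost})")
print(f"throughput: {throughput:.1f} tokens/s")
```

If the prices or the time unit differ from these assumptions, the estimate and the throughput figure change accordingly; only the recorded values are authoritative.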
Evaluation details
Result | Evaluator | Details | Meta Data
81.2727% | Matches word count | n/a | neededClean: false; words: 212
0.0000% | Dialogue to Total Word Ratio | Ratio: 13.68%, Deviation: 76.32% | neededClean: false; wordsTotal: 212; wordsDialogue: 29
40.6364%
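The ratio and deviation reported by the second evaluator follow directly from its word counts and the scenario's 90% target. The sketch below reproduces that arithmetic as an illustration, not as the benchmark's actual evaluator code. It also shows that the final 40.6364% agrees with the plain mean of the two evaluator results; treating it as a mean is an assumption, since the record does not say how the figure is derived. How the evaluator counts dialogue words, and how a deviation is mapped to a score, are not stated here and are not modelled.

```python
# Values copied from the evaluation details.
words_total = 212
words_dialogue = 29
target_ratio_pct = 90.0  # from the scenario: "90% dialogue"
evaluator_results_pct = [81.2727, 0.0]


def dialogue_ratio_pct(dialogue_words: int, total_words: int) -> float:
    """Share of dialogue words in the text, as a percentage."""
    return 100.0 * dialogue_words / total_words


ratio = dialogue_ratio_pct(words_dialogue, words_total)
deviation = abs(target_ratio_pct - ratio)

# Assumption: the final figure is the unweighted mean of the evaluator results.
mean_result = sum(evaluator_results_pct) / len(evaluator_results_pct)

print(f"ratio: {ratio:.2f}%")          # ratio: 13.68%
print(f"deviation: {deviation:.2f}%")  # deviation: 76.32%
print(f"mean of results: {mean_result}%")
```

The low ratio is consistent with the response itself: the scene is written in script form with "Molly:" and "Bill:" speaker labels, so only a small number of words were counted as dialogue.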