NC Bench
Model: GPT-4.1 Nano
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Run ID: 2025-04-14-01JRTSQG0691GBBQ7SKHN1VJ7X
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly: Okay, Bill, here's the plan. We sneak in through the side door when the kids are busy bouncing. Easy peasy.
Bill: Easy? Molly, last time you said that, I ended up with a tutu on my head and a flamingo hat on my back.
Molly: Relax. This time, I got us a distraction. Watch this. Hey kids! Look at my magic trick!
Bill: You? Magic? What's the trick? Turning a cake into a rabbit? Because that I gotta see.
Molly: No, watch. I’ll pretend to vanish, then you dash in and grab the presents.
Bill: And I’m supposed to believe that? You can’t even hide your own snacks without dropping them.
Molly: Trust me. When I say “abracadabra,” I’m gone, you’re in, grab everything, and then meet me behind the bouncy house.
Bill: What if the clown saw us? Or the party mom? She’s got eyes like hawks.
Molly: That’s why I’ve got a fake squirrel in my pocket. Distracts the lot. Plus, I’ll be distracting the clown with my “disappearing act.”
Bill: And if the distraction doesn’t work? We get caught, or I end up covered in glitter again?
Molly: Then we improvise. Remember, Bill, the art of a heist is speed and silence.
Bill: Sounds like a recipe for disaster, but hey, what’s life without a little chaos? Let’s do this.
id: "gen-1744656123-gPr5D4HcqXaGBn7Lz0Uw"
upstream_id: "chatcmpl-BMJ2q5P2lUJP6XQq63j4WkEsghEab"
total_cost: 0.0001343
cache_discount: (null)
provider_name: "OpenAI"
created_at: "2025-04-14T18:42:07.572613+00:00" (Monday, April 14th, 2025 at 6:42:07 PM GMT+00:00)
model: "openai/gpt-4.1-nano-2025-04-14"
app_id: 182717
streamed: true
cancelled: false
latency: 659
moderation_latency: 143
generation_time: 2083
tokens_prompt: 68
tokens_completion: 320
native_tokens_prompt: 67
native_tokens_completion: 319
native_tokens_reasoning: 0
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0001343
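The timing, token, and cost fields above can be cross-checked with a little arithmetic. The sketch below is illustrative only. It assumes `generation_time` is in milliseconds and `total_cost` is in US dollars (the record states neither unit), and the per-million-token prices are an assumption brought in from outside this record, not values it reports.

```python
# Values copied from the run metadata above.
generation_time_ms = 2083        # ASSUMPTION: unit is milliseconds
native_tokens_prompt = 67
native_tokens_completion = 319
total_cost = 0.0001343           # ASSUMPTION: unit is US dollars

# ASSUMPTION: list prices per one million tokens; not part of this record.
ASSUMED_PRICE_IN_PER_M = 0.10
ASSUMED_PRICE_OUT_PER_M = 0.40


def tokens_per_second(tokens: int, elapsed_ms: float) -> float:
    """Completion throughput, given elapsed time in milliseconds."""
    return tokens / (elapsed_ms / 1000.0)


def estimated_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Cost estimate under the assumed per-million-token prices."""
    return (prompt_tokens * ASSUMED_PRICE_IN_PER_M
            + completion_tokens * ASSUMED_PRICE_OUT_PER_M) / 1_000_000


throughput = tokens_per_second(native_tokens_completion, generation_time_ms)
estimate = estimated_cost(native_tokens_prompt, native_tokens_completion)

print(f"throughput: {throughput:.1f} completion tokens/s")
print(f"estimated cost: {estimate:.7f} (reported: {total_cost})")
```

Under these assumptions the estimate agrees with the reported `total_cost`, which suggests the cost was computed from the native token counts rather than the normalized `tokens_prompt` and `tokens_completion` values.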
Evaluation details
Result | Evaluator | Details | Meta Data
3.6234% | Matches word count | n/a | neededClean: false; words: 224
0.0000% | Dialogue to Total Word Ratio | Ratio: 1.34%, Deviation: 88.66% | neededClean: false; wordsTotal: 224; wordsDialogue: 3
1.8117%
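The reported ratio and deviation follow directly from the word counts, and the final 1.8117% is consistent with the mean of the two evaluator results. The sketch below reproduces that arithmetic. It also includes a hypothetical `dialogue_stats` helper that counts only words inside quotation marks. That counting rule is an assumption: it would explain why a script-style response with `Molly:` / `Bill:` labels scores only 3 dialogue words (the quoted “abracadabra,” and “disappearing act.”), but the evaluator's actual implementation is not shown in this record.

```python
import re

TARGET_RATIO = 0.90  # from the scenario: 90% dialogue

# Matches text inside straight or curly double quotes.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text: str) -> int:
    """Whitespace-delimited word count."""
    return len(text.split())


def dialogue_stats(text: str) -> tuple[int, int, float]:
    """HYPOTHETICAL rule: only words inside quotation marks count as dialogue."""
    total = count_words(text)
    dialogue = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


# Values reported in the evaluation details.
words_dialogue, words_total = 3, 224
ratio = words_dialogue / words_total
deviation = abs(TARGET_RATIO - ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# ASSUMPTION: the last figure is the unweighted mean of the two results.
overall = (3.6234 + 0.0) / 2
print(f"Overall: {overall:.4f}%")

# A short excerpt from the response: one quoted word out of nine.
excerpt = "Molly: Trust me. When I say “abracadabra,” I’m gone."
print(dialogue_stats(excerpt))
```

If the evaluator does work this way, a response that wraps each spoken line in quotation marks would be needed to approach the 90% target.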