NC Bench
Model: GPT-4.1 Nano
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Run ID: 2025-04-14-01JRTSTRXSCJ170Z4B3W5RGK7H
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly: Okay, Bill, hear me out. We infiltrate the birthday bash, grab the presents, and nobody's the wiser.
Bill: Molly, are you sure that’s a good idea? Kids parties are chaos. We’ll end up with whipped cream on our faces and balloon animals in our hair.
Molly: That’s the charm! Plus, the presents are stacked right by the snack table. Easy pickings.
Bill: But what if the parents catch us? Kids are loud, and I’m terrible at hiding my excitement. Remember last time? I tripped over a piñata.
Molly: That was your fault for trying to fit inside a giant cupcake. Here’s the plan: I distract the clown, you cover the gift stack.
Bill: The clown? Molly, he’s huge and painted like a rainbow tornado. Do you really want to tangle with Mr. Jingles?
Molly: Trust me. I’ll charm him with my fake microphone. Meanwhile, you stealthily grab the presents. Quick and quiet.
Bill: What if we get caught? Do we run, or try to negotiate with a five-year-old?
Molly: No negotiations. Just swift action. And remember, no candy deposits in the bag. We’re not trying to become party favors ourselves.
Bill: Fine. But if I get caught, you’re explaining this to my parole officer.
Molly: Deal. Now, let’s make it snappy before the cake is cut.
id: "gen-1744656229-QanPo8eHMtiiMQem9kCk"
upstream_id: "chatcmpl-BMJ4YvrB7dNjL42EZMLe6WIgdRJSI"
total_cost: 0.0001283
cache_discount: (null)
provider_name: "OpenAI"
created_at: "2025-04-14T18:43:53.3064+00:00" (Monday, April 14th, 2025 at 6:43:53 PM GMT+00:00)
model: "openai/gpt-4.1-nano-2025-04-14"
app_id: 182717
streamed: true
cancelled: false
latency: 349
moderation_latency: 173
generation_time: 1818
tokens_prompt: 68
tokens_completion: 303
native_tokens_prompt: 67
native_tokens_completion: 304
native_tokens_reasoning: 0
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0001283
Evaluation details

Result: 27.1658%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 219

Result: 0.0000%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 0.00%, Deviation: 90.00%
Meta Data: neededClean: false, wordsTotal: 219, wordsDialogue: 0
13.5829%
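
The dialogue figures above (wordsDialogue 0 out of wordsTotal 219, deviation 90.00%) are consistent with an evaluator that counts only words inside quotation marks, which the script-style "Molly: ..." response never uses. The sketch below illustrates that reading. It is not NC Bench's actual evaluator: the function names, the quote-matching regex, and whitespace-based word counting are all assumptions made for illustration. It also shows that the final 13.5829% figure equals the arithmetic mean of the two evaluator results.

```python
import re

# Hypothetical helper: words are whitespace-separated tokens (an assumption).
def count_words(text: str) -> int:
    return len(text.split())

# Hypothetical helper: treats only text inside straight or curly double
# quotes as dialogue. The real evaluator's rule is not documented here.
def dialogue_stats(text: str) -> tuple[int, int, float]:
    quoted = re.findall(r'"([^"]*)"|“([^”]*)”', text)
    words_dialogue = sum(count_words(a or b) for a, b in quoted)
    words_total = count_words(text)
    ratio = words_dialogue / words_total if words_total else 0.0
    return words_dialogue, words_total, ratio

# Absolute gap between the requested dialogue share and the measured one.
def deviation(ratio: float, target: float = 0.90) -> float:
    return abs(target - ratio)

# Script-style lines contain no quotation marks, so no dialogue is detected.
script_style = "Molly: Okay, Bill, hear me out. Bill: Are you sure?"
# The same kind of exchange written with quoted speech is detected.
quoted_style = '"Hear me out," said Molly.'

script_dialogue, script_total, script_ratio = dialogue_stats(script_style)
quoted_dialogue, quoted_total, quoted_ratio = dialogue_stats(quoted_style)

# Mean of the two evaluator results reported on this page.
overall = (27.1658 + 0.0) / 2
```

Under these assumptions, a response formatted as a script scores a ratio of 0 regardless of how much of it is spoken, which would explain a 0.0000% result on a scene that is almost entirely dialogue.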