NC Bench
Model: GPT-4.1 Mini
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Run ID: 2025-04-14-01JRTST3R8TGGEMP4X9HYF4X6C
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Alright, Bill, you sure about this? A kids' party? That’s... ambitious.” “Ambitious? Molly, think about it. Sugar-high chaos, distracted parents, tiny ninjas everywhere. Perfect cover.” “Tiny ninjas? You mean the kids?” “Exactly! And they’re busy screaming over balloon animals. Nobody’s watching the presents.” “Okay, but have you thought about the dog? That Great Dane is like a furry alarm system.” “Dog? Pfft. Bribe him with a meatball sub from Tony’s Deli. He’s a softie.” “And the clown? You don’t trust the clown?” “Clowns are unpredictable. But I’m banking on the fact he’s too busy tripping over his oversized pants to notice two slightly suspicious adults.” “Right. So, what’s the plan? Disguise as party entertainers?” “Better. We roll in as balloon animal artists. Molly, you’ve got the dexterity for it.” “Last time I tried, someone’s eye almost fell out.” “Practice makes perfect. Plus, if we mess up, the chaos will just blend right in.” “And the getaway?” “We stuff the loot in piñatas, toss them out the back door, then sprint like kids chasing candy.” “You’re insane, Bill.” “Insane? Maybe. But also rich.”
id: "gen-1744656208-pAGX0JVDIbrbz0uN979X"
upstream_id: "chatcmpl-BMJ4CWGoimmEoTzKktQq6cl83ktzu"
total_cost: 0.00047
cache_discount: (null)
provider_name: "OpenAI"
created_at: "2025-04-14T18:43:37.546212+00:00" (Monday, April 14th, 2025 at 6:43:37 PM GMT+00:00)
model: "openai/gpt-4.1-mini-2025-04-14"
app_id: 182717
streamed: true
cancelled: false
latency: 657
moderation_latency: 100
generation_time: 8174
tokens_prompt: 68
tokens_completion: 275
native_tokens_prompt: 67
native_tokens_completion: 277
native_tokens_reasoning: 0
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00047
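The timing and token fields above can be combined into a rough throughput figure. The sketch below assumes that generation_time and latency are reported in milliseconds, which the page does not state; the helper name tokens_per_second is hypothetical and only used for illustration.

```python
# Values copied from the run metadata above.
GENERATION_TIME_MS = 8174        # assumption: unit is milliseconds
LATENCY_MS = 657                 # assumption: unit is milliseconds
NATIVE_TOKENS_COMPLETION = 277


def tokens_per_second(tokens: int, elapsed_ms: float) -> float:
    """Average number of tokens produced per second over elapsed_ms."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return tokens / (elapsed_ms / 1000.0)


throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
print(f"{throughput:.1f} completion tokens per second")
```

If the millisecond assumption holds, this run produced roughly 34 completion tokens per second.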
Evaluation details
Result | Evaluator | Details | Meta Data
43.3783% | Matches word count | n/a | neededClean: false; words: 183
36.7879% | Dialogue to Total Word Ratio | Ratio: 100.00%, Deviation: 10.00% | neededClean: false; wordsTotal: 183; wordsDialogue: 183
40.0831%
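The dialogue ratio and deviation reported above can be approximated with a few lines of code. This is a minimal sketch, not the benchmark's actual evaluator: it assumes that words are whitespace-separated tokens, that dialogue is any text enclosed in curly double quotes, and that the deviation is the absolute difference from the 90% target. The function name dialogue_stats is hypothetical. The last lines check that the final figure on the page equals the arithmetic mean of the two evaluator results.

```python
import re


def dialogue_stats(text: str, target_ratio: float = 0.90) -> dict:
    """Count total and quoted words, then compare the ratio to the target."""
    words_total = len(text.split())
    # Assumption: dialogue is whatever sits between “ and ”.
    spoken = re.findall(r"“([^”]*)”", text)
    words_dialogue = sum(len(span.split()) for span in spoken)
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


sample = "“You’re insane, Bill.” “Insane? Maybe. But also rich.”"
stats = dialogue_stats(sample)
print(stats)

# The final percentage matches the mean of the two evaluator results.
mean_score = round((43.3783 + 36.7879) / 2, 4)
print(mean_score)
```

A response made up entirely of quoted speech, like this run's, yields a ratio of 100% and therefore overshoots the 90% target by 10 percentage points, which matches the Details column.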