NC Bench
Model: GPT-4.1
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Run ID: 2025-04-14-01JRTSS81CNFJ6123CR8DM977G
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly peered through the bush, flattening her polka dot balaclava. "Bill, do you see that clown? He’s got eyes like a hawk." "Relax," Bill whispered, ducking behind her. "Clowns can smell fear, but I came prepared. I rolled in cotton candy for camouflage." "You’re sticky enough already," Molly hissed. "Look, the presents are under the piñata. How do we grab them without alerting the sugar-high monsters?" Bill grinned. "Simple. I’ll start a balloon-animal distraction. Kids love wiener dogs, and I make a mean octopus." "Octopus? That’s not even a land mammal!" "Exactly. The confusion buys us time," Bill declared. "Plus, Brenda, age six, has a jelly allergy. I swapped the cake filling--the ensuing chaos’ll be our opening." "You sabotaged the cake?" "What am I, an amateur? Keep your eye on the purple gift bag. It rattled when I eavesdropped earlier—probably LEGOs." "Fine, but if we get caught, you’re explaining to my gran why I’m wanted for party-favor larceny." Bill winked. "She already thinks I’m a bad influence. Besides, in and out. Five minutes or less." "Remember: don’t trip over the magicians’ hat like last time." Bill frowned. "One rabbit, Molly. Never again." Molly smirked. "Let’s do this."
id: "gen-1744656179-cg8TS3AGZJCd5BqDOCqB"
upstream_id: "chatcmpl-BMJ3jH2Sbnm1VTLZOg46DpIueV5cy"
total_cost: 0.002654
cache_discount: (null)
provider_name: "OpenAI"
created_at: "2025-04-14T18:43:05.046152+00:00" (Monday, April 14th, 2025 at 6:43:05 PM GMT+00:00)
model: "openai/gpt-4.1-2025-04-14"
app_id: 182717
streamed: true
cancelled: false
latency: 390
moderation_latency: 90
generation_time: 3950
tokens_prompt: 68
tokens_completion: 314
native_tokens_prompt: 67
native_tokens_completion: 315
native_tokens_reasoning: 0
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.002654
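The reported total_cost can be sanity-checked against the native token counts. The sketch below is illustrative only: the per-million-token prices (2.00 USD input, 8.00 USD output) are assumptions not stated anywhere in this run record, the unit of generation_time is assumed to be milliseconds, and the function names are hypothetical.

```python
# Assumed prices in USD per one million tokens (NOT part of the run record).
PRICE_PER_M_INPUT = 2.00
PRICE_PER_M_OUTPUT = 8.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the request cost in USD from token counts and assumed prices."""
    micro_usd = (prompt_tokens * PRICE_PER_M_INPUT
                 + completion_tokens * PRICE_PER_M_OUTPUT)
    return micro_usd / 1_000_000


def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Throughput, assuming generation_time is reported in milliseconds."""
    return completion_tokens / (generation_time_ms / 1000)


# native_tokens_prompt = 67, native_tokens_completion = 315, generation_time = 3950
cost = estimate_cost(67, 315)
rate = tokens_per_second(315, 3950)
print(f"estimated cost: {cost:.6f} USD")
print(f"throughput: {rate:.1f} tokens/s")
```

Under these assumed prices the estimate comes to 0.002654 USD, which agrees with the total_cost and usage fields above.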
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 99.9990% | Matches word count | n/a | neededClean: false; words: 201 |
| 0.0000% | Dialogue to Total Word Ratio | Ratio: 86.57%, Deviation: 36.57% | neededClean: false; wordsTotal: 201; wordsDialogue: 174 |
49.9995%
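The Ratio and Deviation figures in the evaluation details follow directly from the wordsDialogue and wordsTotal counts and the 50% target in the scenario. The sketch below reproduces that arithmetic and adds one plausible way to count dialogue words (text inside double quotes). It is an assumption-laden illustration, not the evaluator's actual implementation: the function names, the quote-matching rule, and the whitespace-based word split are all hypothetical.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside straight or curly double quotes (assumed rule)."""
    spans = re.findall(r'["“]([^"”]*)["”]', text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return the dialogue ratio and its absolute deviation from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Values reported in the Meta Data column: wordsDialogue = 174, wordsTotal = 201
ratio, deviation = ratio_and_deviation(174, 201)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Small usage example for the hypothetical counter
sample = 'Molly smirked. "Let\'s do this." Bill nodded.'
print(count_dialogue_words(sample), "of", count_words(sample), "words are dialogue")
```

With the reported counts this prints "Ratio: 86.57%, Deviation: 36.57%", matching the Details column. The final 49.9995% figure is numerically equal to the unweighted mean of the two evaluator results, (99.9990% + 0.0000%) / 2, although the page does not state how it is derived.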