NC Bench
Model: GPT-4.1
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Run ID: 2025-04-14-01JRTSRAC8YC8786TKE3C04DFY
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“We’re really doing this, Bill. Children’s party. Presents. Do you feel shame at all?” “Shame is for amateurs, Molly. That’s why we’re professionals. Now, what’s the plan?” “I distract the clown. You handle the cupcake table. That’s where they stash the loot.” “Cupcakes? Not a safe. I’m insulted.” “Don’t get picky, Bill. These kids are ruthless. Remember last year’s lemonade stand scam?” “Ha! The decoy with the golden retriever. Nearly took my hand off.” “Focus. If the music starts, blend in. I hear the Hokey Pokey is your thing.” “Only if you put your left foot in too, partner. Deal?” “Deal. After the gifts are secured, back out through the bouncy castle.” “Won’t we look suspicious?” “Not if you keep your shoes off. Trust me.” “My pinky toe is still recovering from last time.” “Then wiggle, don’t walk. It’s about finesse.” “I’m writing that down. ‘Successful crimes, tip one: wiggle away.’” “Stop joking. We’re on a schedule. The piñata drops at four. We have twelve minutes.” “Don’t worry, Molly. I’ll be the invisible man.” “Right. Wearing a SpongeBob hat.” “Blends in. This operation’s watertight.” “Bill, you’re either a genius or doomed.” “That’s the spirit.”
id: "gen-1744656149-Ps1lIDBpSjdzu9TlhmNV"
upstream_id: "chatcmpl-BMJ3FUfXespjRQ2mlOw5cSzWrFsmN"
total_cost: 0.002734
cache_discount: (null)
provider_name: "OpenAI"
created_at: "2025-04-14T18:42:37.877036+00:00" (Monday, April 14th, 2025 at 6:42:37 PM GMT+00:00)
model: "openai/gpt-4.1-2025-04-14"
app_id: 182717
streamed: true
cancelled: false
latency: 610
moderation_latency: 192
generation_time: 7252
tokens_prompt: 68
tokens_completion: 324
native_tokens_prompt: 67
native_tokens_completion: 325
native_tokens_reasoning: 0
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.002734
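The cost and timing fields above can be cross-checked with a little arithmetic. The sketch below is illustrative only. It assumes prices of 2.00 and 8.00 USD per million input and output tokens, which are not stated on this page but do reproduce the reported total_cost when applied to the native token counts. It also assumes generation_time is in milliseconds. The function names are hypothetical.

```python
# Values copied from the run metadata above.
native_tokens_prompt = 67
native_tokens_completion = 325
generation_time_ms = 7252  # assumption: generation_time is reported in milliseconds
reported_cost = 0.002734   # total_cost / usage as shown on the page

# Assumption: USD prices per one million tokens (not stated on this page).
PRICE_IN_PER_M = 2.00
PRICE_OUT_PER_M = 8.00


def estimate_cost(prompt_tokens, completion_tokens,
                  price_in=PRICE_IN_PER_M, price_out=PRICE_OUT_PER_M):
    """Estimate request cost in USD from token counts and per-million prices."""
    return (prompt_tokens * price_in + completion_tokens * price_out) / 1_000_000


def tokens_per_second(completion_tokens, generation_ms):
    """Average completion throughput over the generation window."""
    return completion_tokens / (generation_ms / 1000)


estimated_cost = estimate_cost(native_tokens_prompt, native_tokens_completion)
throughput = tokens_per_second(native_tokens_completion, generation_time_ms)

print(f"estimated cost: {estimated_cost:.6f} USD (reported: {reported_cost})")
print(f"throughput: {throughput:.1f} tokens/s")
```

Under these assumptions the estimate matches the reported figure to six decimal places, which suggests billing is based on the native token counts rather than the normalized tokens_prompt and tokens_completion values.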
Evaluation details
Result | Evaluator | Details | Meta Data
98.7124% | Matches word count | n/a | neededClean: false, words: 194
0.0000% | Dialogue to Total Word Ratio | Ratio: 100.00%, Deviation: 50.00% | neededClean: false, wordsTotal: 194, wordsDialogue: 194
49.3562%
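The dialogue ratio result follows from the response consisting entirely of quoted speech: wordsDialogue equals wordsTotal, so the ratio is 100% against a 50% target. The sketch below shows one way such a ratio and deviation could be computed. It is not NC Bench's evaluator. It assumes that dialogue is any text inside curly or straight double quotes and that words are whitespace-separated tokens. The names dialogue_stats and DIALOGUE_RE are hypothetical, and the page does not say how a deviation is mapped to a percentage score, so that step is left out.

```python
import re

# Assumption: dialogue is text enclosed in curly or straight double quotes.
DIALOGUE_RE = re.compile(r'“[^”]*”|"[^"]*"')


def count_words(text):
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def dialogue_stats(text, target_ratio=0.5):
    """Return word counts, dialogue ratio, and absolute deviation from the target."""
    words_total = count_words(text)
    words_dialogue = sum(count_words(span) for span in DIALOGUE_RE.findall(text))
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


# A scene with narration mixes quoted and unquoted words.
mixed = dialogue_stats('“We have twelve minutes,” said Molly. “Hurry.”')

# A scene with no narration at all, like the response above, is 100% dialogue.
all_dialogue = dialogue_stats('“Deal?” “Deal.”')

# The final figure on the page equals the mean of the two evaluator results.
overall = (98.7124 + 0.0) / 2

print(mixed)
print(all_dialogue)
print(f"overall: {overall:.4f}%")
```

With a 50% target, an all-dialogue scene deviates by 0.5, which corresponds to the "Ratio: 100.00%, Deviation: 50.00%" detail reported for this run.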