NC Bench
Model
GPT-4.1 Nano
Test
Dialogue tags
Scenario
Write 200 words with 50% dialogue
Run ID
2025-04-14-01JRTSXABXQQAPKPTPHNMZ1EEW
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly peeked around a corner, clutching a rolled-up blueprint. “Bill, are you sure about this? Stealing from kids’ parties seems wrong, even if it’s hilarious.” Bill grinned mischievously. “Come on, Molly. Think of the chaos—balloons popping, broccoli-shaped cupcakes abandoned, tiny hands clutching massive stuffed animals. We could be heroes... or villains. Depends how you look at it.” Molly rolled her eyes. “Heroes don’t rob birthday parties. We’d get caught by the cake police. And what about the parents? They’d call the cops—or worse, the ‘No Fun’ squad.” Bill chuckled. “That’s the fun part! Imagine the looks on their faces when they realize their gifts are gone. Besides, we’re just borrowing the presents. We’ll return them after the party—think of it as a temporary art installation.” Molly crossed her arms. “Temporary art? You’re terrible. Okay, tell me the plan again. How do we get past the giant inflatable obstacle course?” Bill pointed toward the backyard. “We sneak through the hedge maze, dodging the jumping castles, then scale the candy waterfall. Simple.” Molly hesitated, then smirked. “All right. But if we get caught, I blame you for the salad dressing ambush.”
id
"gen-1744656313-ND4Ui782YYMEffhFvscx"
upstream_id
"chatcmpl-BMJ5tAkAULdrGfa23Zyee5Fk0dXKh"
total_cost
0.0001171
cache_discount
(null)
provider_name
"OpenAI"
created_at
"2025-04-14T18:45:17.36167+00:00"
model
"openai/gpt-4.1-nano-2025-04-14"
app_id
182717
streamed
true
cancelled
false
latency
379
moderation_latency
89
generation_time
2745
tokens_prompt
68
tokens_completion
275
native_tokens_prompt
67
native_tokens_completion
276
native_tokens_reasoning
0
num_media_prompt
(null)
num_media_completion
(null)
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0001171
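The cost and timing fields above can be sanity-checked with a little arithmetic. The sketch below is illustrative only: the per-million-token rates are assumptions (they are not stated on this page, though they happen to reproduce the reported total_cost), and generation_time is assumed to be in milliseconds. The helper names are hypothetical.

```python
# Values copied from the run metadata above.
native_tokens_prompt = 67
native_tokens_completion = 276
generation_time_ms = 2745   # assumption: generation_time is in milliseconds
total_cost = 0.0001171      # reported total_cost, assumed to be in USD

# Assumed USD rates per one million tokens; not stated on this page.
ASSUMED_INPUT_RATE = 0.10
ASSUMED_OUTPUT_RATE = 0.40


def estimate_cost(prompt_tokens, completion_tokens,
                  input_rate=ASSUMED_INPUT_RATE,
                  output_rate=ASSUMED_OUTPUT_RATE):
    """Estimate the request cost from token counts and per-million rates."""
    return (prompt_tokens * input_rate
            + completion_tokens * output_rate) / 1_000_000


def tokens_per_second(completion_tokens, generation_ms):
    """Completion throughput, given a generation time in milliseconds."""
    return completion_tokens / (generation_ms / 1000)


estimated = estimate_cost(native_tokens_prompt, native_tokens_completion)
throughput = tokens_per_second(native_tokens_completion, generation_time_ms)
print(f"estimated cost: {estimated:.7f} (reported: {total_cost})")
print(f"throughput: {throughput:.1f} tokens/s")
```

Using the native token counts rather than the normalized tokens_prompt and tokens_completion values is also an assumption; the two sets differ by one token each in this run.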
Evaluation details
Evaluator: Matches word count
Result: 98.7124%
Details: n/a
Meta Data: neededClean = false, words = 194

Evaluator: Dialogue to Total Word Ratio
Result: 0.0000%
Details: Ratio: 83.51%, Deviation: 33.51%
Meta Data: neededClean = false, wordsTotal = 194, wordsDialogue = 162

Overall result: 49.3562%
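The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator follow from the two word counts and the 50% target in the scenario. The sketch below shows one plausible way to derive them; it is not the benchmark's actual evaluator. The function names, whitespace tokenization, and the rule that dialogue is whatever sits between curly double quotes are all assumptions.

```python
import re


def dialogue_stats(text, target=0.5):
    """Count all words and the words inside curly double quotes.

    Assumption: words are whitespace-separated and dialogue is delimited
    by the characters “ and ”. The real evaluator may count differently.
    """
    words_total = len(text.split())
    spoken = re.findall(r"“([^”]*)”", text)
    words_dialogue = sum(len(chunk.split()) for chunk in spoken)
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target),
    }


def ratio_from_counts(words_dialogue, words_total, target=0.5):
    """Ratio and absolute deviation from the target, from raw counts."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts reported in the evaluation meta data above.
ratio, deviation = ratio_from_counts(162, 194)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 83.51%, Deviation: 33.51%"

# The final 49.3562% figure equals the mean of the two evaluator results;
# treating it as an overall score computed this way is an assumption.
overall = (98.7124 + 0.0) / 2
print(f"{overall:.4f}%")
```

How the 98.7124% word-count result and the 0.0000% ratio result are derived from their inputs is not shown on this page, so the sketch does not attempt to reproduce those scoring curves.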