NC Bench
Run: 2025-04-14-01JRTSZ154F8JEXKHPJR0KTBY1
Run Details
Model: GPT-4.1 Mini
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 3.0s
Cost: $0.000518
Provider: OpenAI
Evaluation summary
Matches word count: 97.63%
Dialogue to Total Word Ratio: 0.00%
Overall: 48.8138%
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind a row of inflatable pool toys, whispering, "So, the plan is to crash the kiddie party at Mrs. Jenkins’ place. Easy pickings—tons of wrapped loot just sitting there." Bill adjusted his ski mask, eyes darting nervously. "You sure these toys won’t squeak or something? Last thing we need is a rubber ducky alarm." Molly smirked, "Relax. The only noise they issue is giggles. We’ll time it during the cupcake chaos. That’s when everyone’s distracted." Bill scratched his chin. "And the parents? What if they catch us?" "We distract them with my legendary balloon-animal juggling. I made six last night," Molly said proudly. Bill raised an eyebrow. "You? Juggle? I thought you were allergic to latex." "Minor setback," she said unbothered. "Besides, nothing says 'party hero' like a rogue balloon poodle." Bill grinned. "Alright. What about escape?" Molly pointed to the backyard pool. "Diving in, swimming underneath the fence. I borrowed floaties from the toddlers for silent paddling." Bill laughed. "You’re insane. This is our best heist yet or our most embarrassing faceplant.” "Either way," Molly said, "we walk away with every last gift. Except the broccoli-shaped one. That belongs to the birthday kid, not us." Bill nodded, "Deal. Let’s make birthday misery history."
id: "gen-1744656369-PvdvDvndjWpXXVzUGMGd"
upstream_id: "chatcmpl-BMJ6njiL9wjYpsYEJ1FmNWlJNsr84"
total_cost: 0.000518
cache_discount: (null)
provider_name: "OpenAI"
created_at: "2025-04-14T18:46:12.584866+00:00"
model: "openai/gpt-4.1-mini-2025-04-14"
app_id: 182717
streamed: true
cancelled: false
latency: 479
moderation_latency: 123
generation_time: 1505
tokens_prompt: 68
tokens_completion: 306
native_tokens_prompt: 67
native_tokens_completion: 307
native_tokens_reasoning: 0
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.000518
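The reported total_cost can be sanity-checked against the native token counts above. The following is a minimal sketch, not the provider's billing logic. It assumes per-million-token rates of $0.40 for input and $1.60 for output, which are not stated on this page, and the function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

# Assumed USD rates per one million tokens; these are NOT stated on this page.
INPUT_RATE_PER_MILLION = Decimal("0.40")
OUTPUT_RATE_PER_MILLION = Decimal("1.60")


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the request cost in USD from token counts (hypothetical helper)."""
    million = Decimal(1_000_000)
    total = (
        prompt_tokens * INPUT_RATE_PER_MILLION
        + completion_tokens * OUTPUT_RATE_PER_MILLION
    )
    return total / million


# native_tokens_prompt = 67, native_tokens_completion = 307 (from the metadata)
cost = estimate_cost(67, 307)
print(cost)
```

Under these assumed rates the estimate equals the reported total_cost of 0.000518, which suggests billing is based on the native token counts (67 and 307) rather than the normalized ones (68 and 306).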
Evaluation details
Evaluator: Matches word count
Result: 97.63%
Details: n/a
Meta Data: neededClean=false, words=207
Evaluator: Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 77.03%, Deviation: 27.03%
Meta Data: neededClean=false, wordsTotal=209, wordsDialogue=161
Overall: 48.8138%
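The ratio and deviation reported for the dialogue evaluator can be reproduced from wordsDialogue and wordsTotal. The sketch below assumes the deviation is the absolute difference between the measured ratio and the 50% target from the scenario, and that the overall score is the unweighted mean of the two evaluator scores. Neither formula is stated on this page, and the function names are hypothetical.

```python
def dialogue_ratio_report(words_dialogue: int, words_total: int,
                          target_ratio: float = 0.5) -> tuple[float, float]:
    """Return (ratio, deviation) as fractions; deviation is |ratio - target|."""
    ratio = words_dialogue / words_total
    deviation = abs(ratio - target_ratio)
    return ratio, deviation


def overall_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores (assumed aggregation rule)."""
    return sum(scores) / len(scores)


# wordsDialogue = 161, wordsTotal = 209 (from the evaluator metadata)
ratio, deviation = dialogue_ratio_report(161, 209)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 77.03%, Deviation: 27.03%"

# Mean of the displayed (rounded) evaluator scores, in percent
print(overall_score([97.63, 0.00]))
```

The mean of the displayed scores is about 48.815%, within rounding of the reported 48.8138%. The small gap is consistent with the word count score being rounded to two decimals for display.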