NC Bench
Run: 2026-04-24-01KQ05XHF7N8YHXY2KC7E6JFDA
Run Details
Model: GPT-OSS 120B
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 31.7s
Cost: $0.000232
Provider: DeepInfra
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 29.24%
Overall: 14.6221%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The night before the party, Molly and Bill crouched behind the dumpster outside the community center, the smell of cheap pizza still clinging to the air. The building loomed like a glitter‑covered fortress, its windows festooned with balloon arches and a banner that read “Welcome to Timmy’s 7th Birthday Bash!” A stray stream of glitter fluttered like confetti in the breeze, reminding them just how ridiculous this job sounded. Molly tapped a finger against the dented metal lid, eyes scanning the crack in the side door. Bill adjusted his black beanie, his breath forming clouds that vanished in the cold. Their plan was simple: slip in, grab every gift, and disappear before any parent realized the missing toys. “Okay, Bill, you’re on lookout,” Molly whispered, crouching lower. “Got it. My eyes are like hawks—if hawks were terrified of balloons,” Bill replied, mimicking a dramatic swoop. “What’s the entry point again?” “The side vent. Small enough for me, big enough for a sack of toys.” “You sure that won’t set off the alarm? I heard they upgraded the security after last year’s clown fiasco.” “Only if I sneeze. And I’ve been taking antihistamines all week.” “Makes sense. I’ll handle the lock. Got the hairpin, right?” “Yeah, but it’s the kind of hairpin you’d find in a wig shop, not a locksmith’s toolbox. Still works on a toddler’s password pad, though.” “You’re full of confidence,” Molly said, rolling her eyes. “Confidence is free, unlike the glitter they used on the floor. That stuff sticks to everything, even my conscience.” “Speaking of glitter, how do we get past the candy‑corn guard? He’s got a thing for licorice.” “I’ll distract him with a decoy—maybe a squeaky rubber duck. Kids love those.” “Good plan. I’ll slip in behind the cake table. The big chocolate one is our beacon, right?” “Exactly. The frosting glow is like a lighthouse for thieves.” “What about the parents? 
They’ll be on the patio, sipping punch and gossiping about the daycare’s new policies.” “They’ll be too busy pretending they don’t know how to dance. We’ll be in and out before the first ‘Let’s take a group photo!’” “Group photo—perfect. We can hide in the backdrop and blend with the streamers.” “Just make sure you don’t get caught in the bubble machine. Last time I inhaled a bubble and coughed up a clown nose.” “Note to self: avoid bubble zones. Got it.” “How many presents are we talking about?” “Forty‑seven. Plus a mysterious wrapped box from the ‘Secret Santa’—probably an adult’s joke, but worth grabbing.” “And the piñata? That thing’s a gold mine of candy.” “Don’t think I haven’t considered it. I’ll bring the bat. But remember, we’re not stealing the candy. We’re stealing the presents. The kids already have sugar highs.” “Right, the bat! I’ve got a spare in my glove compartment. It’s actually a baseball bat, but that’s close enough.” “Perfect. We meet at midnight. One last thing—if you hear a kid crying ‘It’s my birthday!’ just pretend you’re the birthday clown and hand them a balloon.” “Deal. I’ll practice my clown voice. Honk honk, happy birthday!” “Molly, you’re going to be the mastermind on the Taco Bell run after this. We’ll need fuel for the getaway.” “Yes, but only if we can stop for a donut. I’ve never stolen a donut before.” “Then we’re set. Ready to become the most infamous present thieves in the neighborhood?” “Absolutely. Let’s make this party unforgettable—for all the wrong reasons.” Bill tapped his beanie, and the duo slipped silently into the shadows as the moonlight caught the remnants of confetti, heralding a night of mischief.
created_at
"2026-04-24T16:42:09.782Z"
4/24/2026, 4:42:09 PM
model
"openai/gpt-oss-120b"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
236
moderation_latency
(null)
generation_time
31533
tokens_prompt
77
tokens_completion
1462
native_tokens_prompt
114
native_tokens_completion
1197
native_tokens_completion_images
(null)
native_tokens_reasoning
409
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.000231876
router
(null)
provider_responses
0
endpoint_id
"85835306-dba3-4334-87bc-ab9ffa5c9c62"
id
"chatcmpl-RpyW6fScmsgM3SoCFYwj9JAa"
is_byok
false
latency
79
model_permaslug
"openai/gpt-oss-120b"
provider_name
"DeepInfra"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777048929-W0X30WBRTDdXjLlG08Fw"
session_id
(null)
api_type
"completions"
id
"gen-1777048929-sZVR1ppM7VyGoo4CWyOH"
upstream_id
"chatcmpl-RpyW6fScmsgM3SoCFYwj9JAa"
total_cost
0.000231876
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepInfra"
response_cache_source_id
(null)
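The timing and cost fields above can be related with a little arithmetic. The sketch below is illustrative only: it assumes throughput is best measured as native_tokens_completion divided by generation_time (in milliseconds), and it does not know whether that token count includes the 409 reasoning tokens. The variable names are hypothetical; the numbers are copied from the metadata.

```python
# Values copied from the run metadata above.
generation_time_ms = 31533
native_tokens_completion = 1197
total_cost_usd = 0.000231876

# Assumption: throughput = native completion tokens per second of generation time.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Rounding the cost to six decimals gives the figure shown in the run details.
cost_display = f"${total_cost_usd:.6f}"

print(f"{tokens_per_second:.2f} tokens/s")
print(cost_display)
```

Under these assumptions the run works out to roughly 38 tokens per second, and the rounded cost matches the $0.000232 shown in the run details.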
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
neededClean: false
words: 607
Dialogue to Total Word Ratio
Result: 29.24%
Details: Ratio: 74.96%, Deviation: 4.96%
neededClean: false
wordsTotal: 607
wordsDialogue: 455
Overall: 14.6221%
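The ratio and deviation figures above follow directly from the reported word counts. The sketch below shows one way such counts could be produced and how the percentages fall out. It is not the benchmark's evaluator: the quote-matching rule, the whitespace word split, and all function names are assumptions, and the mapping from a 4.96% deviation to the 29.24% score is not stated on this page, so it is not reproduced here.

```python
import re

TARGET_RATIO = 70.0  # percent of dialogue requested by the scenario


def count_words(text: str) -> int:
    # Assumption: a "word" is any whitespace-separated token.
    return len(text.split())


def dialogue_stats(text: str) -> tuple[int, int]:
    # Assumption: dialogue is whatever sits between curly double quotes.
    quoted = re.findall(r"“([^”]*)”", text)
    words_dialogue = sum(count_words(q) for q in quoted)
    return words_dialogue, count_words(text)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_RATIO) -> tuple[float, float]:
    # Ratio as a percentage, and its absolute distance from the target.
    ratio = 100.0 * words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts reported in the evaluation details: wordsDialogue and wordsTotal.
ratio, deviation = ratio_and_deviation(455, 607)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# The overall figure is consistent with a plain mean of the two evaluator
# scores (an observation from the numbers, not a documented formula).
overall = (0.00 + 29.24) / 2
print(f"Overall (from rounded scores): {overall:.2f}%")
```

With 455 dialogue words out of 607, the ratio is 74.96% and the deviation from the 70% target is 4.96%, matching the details row. The mean of the displayed scores is 14.62%, which agrees with the reported 14.6221% up to rounding of the 29.24% score. The 0.00% word-count result is likewise consistent with the response running to 607 words against a 500-word request, though the tolerance the evaluator applies is not shown.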