NC Bench
Run: 2026-04-24-01KQ07F5BVKZX3AF36T49RJN5E
Run Details
Model: GPT-OSS 120B
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 28.5s
Cost: $0.000224
Provider: DeepInfra
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.00%
Overall: 0.0000%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the inflatable bouncy castle, her sneakers squeaking against the plastic as she whispered to the shadow of a balloon arch. The backyard was a kaleidoscope of pastel streamers, glitter‑covered tables, and a mountain of brightly wrapped parcels that towered like a small, guilty mountain. She could smell the sugary perfume of cake and the faint, metallic tang of new toys still sealed in shrink wrap. Bill was already perched on the edge of the kiddie pool, his lanky silhouette reflected in the teal water, his grin as crooked as the ribbon on a birthday present. "All right, Bill," Molly hissed, "we've got exactly seven minutes before the clown shows up. The gate's still unlocked, but the neighbor's cat patrol is on schedule." She adjusted the tiny night‑vision goggles strapped to her head, their lenses glowing a soft green. "Seven minutes?" Bill replied, flicking a plastic cup toward a pile of candy. "That's barely enough time to count a single bow." Molly rolled her eyes. "Don't worry, I've mapped out the gift distribution. The big box under the tree is the jackpot. That's where they stashed the tablets and the deluxe Lego set. The rest are just plushies and crayons." Bill nudged the pool float with his foot. "Plushies are worth more than I thought. If we can get those, the resale market for 'vintage 2025 cuddles' is booming." She smirked, tapping the earpiece of his tiny walkie‑talkie. "Remember the plan: you distract with the balloon-animal parade, I'll slip into the gift tent and bag the loot. We both wear the same neon‑green night‑glow vests so the kid's parents won't suspect a thing." "Right, the vests," Bill repeated, checking his reflection in the metallic foil of the party lights. "If anyone asks, we're just… event staff. Professional balloon‑animal artists with a penchant for glow-in-the-dark fashion." Molly laughed quietly. "Exactly. And I've got the perfect line for the parents. 
'Sorry, kids, the clowns are running late—our glow‑in‑the‑dark mascot is here to entertain you!'" Bill raised an eyebrow. "A mascot? I thought we were the mascot." "We'll wear the dinosaur costume," she whispered, pointing to a towering inflatable T‑rex that loomed near the cake table. "Gives us cover, plus the kids love dinosaurs. While they're staring at the dinosaur, I'll be in the tent, cracking the gift code with my portable lock‑pick set." "A lock‑pick set?" Bill echoed, his voice barely audible over the soft whirr of a ceiling fan that was blowing the confetti into the night air. "Did you bring the magnetic screwdriver? I heard the gift boxes are sealed with magnetic strips." Molly reached into her satchel and pulled out a sleek, silver tool shaped like a wand. "Got it, and I also have a tiny EMP gadget to temporarily disable the motion sensors on the gift table. One zap and the alarms go mute for twenty seconds." Bill tapped his chin theatrically. "And the cat? We can't have Whiskers pouncing on us while we make a clean getaway with a mountain of toys." "That's when the inflatable dinosaur comes in," Molly said, her eyes sparkling with mischief. "We'll roll it past the fence, and Whiskers will think it's a new playmate. He'll be too busy chasing it to notice us slipping away with the presents." The distant sound of a giggle drifted from the next yard, and the smell of frosting grew stronger. The party music, a looping mix of children's songs, thumped softly in the background. Molly adjusted the strap of her backpack, where the empty sack waited like a black hole ready to swallow every wrapped surprise. "We're good to go?" she asked, her voice a low chuckle. Bill pulled the dinosaur costume's hood over his head, the fabric rubbing against his cheek. "All set. Let the party begin, and may the best thief win." Molly grinned. "And may the kids never know their presents were stolen by a glitter‑covered duo of glow‑in‑the‑dark thieves." 
She gave a final nod, and together they slipped into the neon night, the children's laughter echoing behind them like a mischievous soundtrack.
created_at
"2026-04-24T17:09:15.788Z"
4/24/2026, 5:09:15 PM
model
"openai/gpt-oss-120b"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
241
moderation_latency
(null)
generation_time
28348
tokens_prompt
77
tokens_completion
1354
native_tokens_prompt
114
native_tokens_completion
1158
native_tokens_completion_images
(null)
native_tokens_reasoning
296
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.000224466
router
(null)
provider_responses
0
endpoint_id
"85835306-dba3-4334-87bc-ab9ffa5c9c62"
id
"chatcmpl-RCQ6IlEsJpwBEBgL60vishGz"
is_byok
false
latency
99
model_permaslug
"openai/gpt-oss-120b"
provider_name
"DeepInfra"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777050555-7a16VuSdupIxRJTq8seT"
session_id
(null)
api_type
"completions"
id
"gen-1777050555-PjkQ6p17LpVcixm4YbTR"
upstream_id
"chatcmpl-RCQ6IlEsJpwBEBgL60vishGz"
total_cost
0.000224466
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepInfra"
response_cache_source_id
(null)
Evaluation details
Matches word count: 0.00%
Details: n/a
Meta data: neededClean: false; words: 684
Dialogue to Total Word Ratio: 0.00%
Details: Ratio: 51.45%, Deviation: 21.45%
Meta data: neededClean: false; wordsTotal: 688; wordsDialogue: 354
Overall: 0.0000%
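The reported ratio and deviation are consistent with dividing wordsDialogue by wordsTotal (354 / 688) and comparing the result with the 30% target from the scenario. The Python sketch below reproduces that arithmetic. It is an illustration only: NC Bench's actual tokenization and scoring code are not shown on this page, so the function names, the whitespace word split, and the rule that text inside double quotes counts as dialogue are all assumptions. The two evaluators report slightly different totals (684 and 688 words), which suggests they may count words differently.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a passage.

    Assumption: words are whitespace-separated, and dialogue is any text
    between straight or curly double quotes. This is not NC Bench's code.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'["“]([^"”]*)["”]', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio: float, target: float = 0.30) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


if __name__ == "__main__":
    # Counts taken from the evaluation details for this run.
    reported_ratio = 354 / 688
    print(f"Ratio: {reported_ratio:.2%}")                             # Ratio: 51.45%
    print(f"Deviation: {deviation_from_target(reported_ratio):.2%}")  # Deviation: 21.45%
```

How the deviation maps to the 0.00% score is not stated on the page, so the sketch stops at the deviation.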