NC Bench
Run: 2026-04-24-01KQ09PD6BQYZQPGD1TZ5DEZK6
Run Details
Model: GPT-OSS 120B
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 14m 56s
Cost: $0.003866
Provider: Novita
Evaluation summary
| Score | Evaluator | Details Link |
| --- | --- | --- |
| 0.08% | Matches word count | (details) |
| 51.40% | Dialogue to Total Word Ratio | (details) |
| 25.7422% | Overall | |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the oversized plastic dinosaur, its neon green spikes glittering in the late‑afternoon sun. She tightened night‑vision goggles that turned the backyard into an underwater rave. Across the lawn, Bill slipped a whoopee‑cushion into his pocket, the faint squeak echoing like a nervous giggle. "Entrance?" Bill whispered, eyes scanning the perimeter. "The slide is our gateway," Molly replied, tapping the metal curve. "Kids will be too busy sliding to notice us." "Pass me the clown suit; we’ll blend in," Bill said, reaching for the garish polka‑dot outfit on her duffel. "If anyone asks, we’re entertainment." Bill lifted a rubber chicken, chuckling. "Piñata—can we snag the candy without turning it into confetti fireworks?" Molly smirked. "A gentle nudge sends sweets skyward while we gather the loot underneath." "All right, I’ve got the perfect angle for the piñata," Molly whispered, pointing at the inflatable bull. "We’ll trigger it with the whoopee‑cushion." "The kitchen is a dead end," she muttered, glancing toward the pantry. "We'll use the pantry door; it’s unguarded." "All right, the presents," Bill said, nodding toward the gift table stacked with glitter‑wrapped boxes. Molly surveyed the towering castle cake. She clicked a tiny device, and the candle lights flickered off. "The cake can be our distraction. I’ll set off the fireworks." Bill raised an eyebrow. "Fireworks in a house full of toddlers?" "Exactly," Molly said, smiling. "Nothing says ‘look over there’ like a burst of sparkle." The two thieves exchanged a high‑five that sounded oddly like a rubber duck squeak. They slipped into the party as the first child zipped down the slide, giggling. "Adults are in the kitchen," Bill whispered, pointing to a cluster of parents. "We move through the pantry," Molly said, already slipping a sack over her shoulder. "Tip the boxes, roll them out the back door." The children’s laughter rose in a chorus of squeals and claps. 
In the chaos, Molly tugged the sack tighter; its interior rustled with the promise of stolen toys. "Ready?" Bill murmured, voice barely audible over the party music. "Remember the plan," Molly whispered. "Slide, cake, fireworks, grab, and vanish." Bill adjusted his clown nose and gave a mock salute. "Operation Birthday Bash is a go!" The first sparkler fizzed, sending a cascade of glitter into the sky as two silhouettes vanished behind a curtain of streamers, pockets bulging with mischief. Bill nudged the sack as the sparkler sputtered. "Quick, grab the last box before the kids notice the glitter trail." "Got it, the bag's full. Let's vanish before birthday song starts." "See you at the safe house." Outside, kids whispered about the mysterious lights over the yard, unaware their toys had vanished. Police found an empty cake stand, a toppled piñata, and a single clown nose on the grass. Molly and Bill, hidden in an abandoned shed attic, celebrated with soda pop, counting their loot. Night air smelled of burnt sugar, reminder of the chaos they created. By morning the organizers posted a plea online, asking anyone who saw oddly dressed adults to come forward. Little did they know the culprits were planning their next caper—perhaps a retirement‑home bingo night. The chase continues tomorrow again.
created_at: "2026-04-24T17:48:10.329Z" (4/24/2026, 5:48:10 PM)
model: "openai/gpt-oss-120b"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 622
moderation_latency: (null)
generation_time: 894835
tokens_prompt: 77
tokens_completion: 15353
native_tokens_prompt: 129
native_tokens_completion: 15437
native_tokens_completion_images: (null)
native_tokens_reasoning: 14477
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0038657
router: (null)
provider_responses:
  0:
    endpoint_id: "3dc95f88-3960-4c91-b2bb-bbd2f5069c48"
    id: "6bd876e78325b6a5abc356f81cd9e306"
    is_byok: false
    latency: 469
    model_permaslug: "openai/gpt-oss-120b"
    provider_name: "Novita"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777052890-KZfPmwDSrgGcBaKMQzrd"
session_id: (null)
api_type: "completions"
id: "gen-1777052890-XQvhMy32T8JPC4eMNJf7"
upstream_id: "6bd876e78325b6a5abc356f81cd9e306"
total_cost: 0.0038657
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Novita"
response_cache_source_id: (null)
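The metadata fields allow a few derived figures to be checked by hand. The sketch below is illustrative only: it assumes that `generation_time` is reported in milliseconds and that `native_tokens_completion` includes the reasoning tokens, neither of which the page states. The `summarize` helper and the "visible tokens" label are hypothetical names introduced for this example.

```python
# Values copied from the run metadata on this page.
meta = {
    "generation_time": 894835,          # assumed to be milliseconds
    "native_tokens_completion": 15437,  # assumed to include reasoning tokens
    "native_tokens_reasoning": 14477,
}

def summarize(meta: dict) -> dict:
    """Derive token and timing figures from the raw metadata (hypothetical helper)."""
    completion = meta["native_tokens_completion"]
    reasoning = meta["native_tokens_reasoning"]
    seconds = meta["generation_time"] / 1000
    minutes, rem = divmod(int(seconds), 60)
    return {
        # Tokens left over for the visible answer once reasoning is subtracted.
        "visible_tokens": completion - reasoning,
        "reasoning_share": reasoning / completion,
        "generation_minutes": minutes,
        "generation_seconds": rem,
        "tokens_per_second": completion / seconds,
    }

summary = summarize(meta)
print(summary["visible_tokens"])                 # 960
print(f"{summary['reasoning_share']:.1%}")       # 93.8%
print(summary["generation_minutes"], summary["generation_seconds"])  # 14 54
```

Under these assumptions, roughly 94% of the completion tokens were spent on reasoning. The derived generation time of about 14m 54s is close to the reported Duration of 14m 56s, which is consistent with the millisecond reading but does not confirm it.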
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 0.08% | Matches word count | n/a | neededClean: false, words: 529 |
| 51.40% | Dialogue to Total Word Ratio | Ratio: 33.65%, Deviation: 3.65% | neededClean: false, wordsTotal: 529, wordsDialogue: 178 |
| 25.7422% | Overall | | |
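The reported ratio and deviation can be reproduced from the word counts in the metadata. The sketch below is a minimal illustration, not the evaluator's implementation: the rule that dialogue is any text between double quotes, the whitespace tokenization, and the function names `dialogue_ratio` and `deviation_from_target` are all assumptions. How the deviation maps to the 51.40% score is not stated on the page and is not modeled here.

```python
import re

def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a prose passage."""
    # Assumption: dialogue is any text between a pair of double quotes,
    # straight or curly; words are whitespace-separated tokens.
    spoken = re.findall(r'["“]([^"”]*)["”]', text)
    dialogue_words = sum(len(s.split()) for s in spoken)
    total_words = len(text.split())
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio

def deviation_from_target(ratio: float, target: float = 0.30) -> float:
    """Absolute distance between the measured ratio and the 30% target."""
    return abs(ratio - target)

# Reproduce the reported figures from the counts in the table.
reported_ratio = 178 / 529
print(f"{reported_ratio:.2%}")                         # 33.65%
print(f"{deviation_from_target(reported_ratio):.2%}")  # 3.65%

# The Overall figure is consistent with an unweighted mean of the two
# displayed scores; the page does not state the aggregation rule.
print(round((0.08 + 51.40) / 2, 2))                    # 25.74
```

A quick usage check: `dialogue_ratio('Bill nodded. "Go now," she said.')` counts 2 dialogue words out of 6 total under these assumptions.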