NC Bench
Run: 2026-04-24-01KQ09QMM17Y6QZC25Q1A8NAVS
Run Details
Model: GPT-OSS 120B
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 6m 25s
Cost: $0.001141
Provider: Novita
Evaluation summary
Score | Evaluator | Details Link
0.01% | Matches word count | (details)
80.50% | Dialogue to Total Word Ratio | (details)
40.2574% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The warehouse was a cluttered mess of cardboard boxes, leftover costume shoes, and a grin that seemed to pulse from the flickering fluorescent lights. Molly leaned against a stack of alphabet blocks, her pink hair sticking out like a carnival tent. Bill, still wearing his faded superhero cape, was polishing a tiny toy flashlight he had stolen from a lost-and-found bin. The plan was simple: slip into the community center where a bright‑colored birthday party for five‑year‑olds was in full swing, grab every packaged gift, and disappear before the cake was cut. The room smelled of candy floss and the faint ozone of a misplaced confetti cannon. Molly tapped the instruction sheet with a plastic golden crown, her eyes glittering with mischief. Bill glanced at the schedule taped to the wall, a crayon‑drawn timeline of balloon releases and clown rehearsals. He whispered, “If we time it right, the kids will be distracted by the piñata, and we’re golden.” Molly: "Alright, Bill, we’ve got thirty minutes before the clown shows up. You think the door’s as easy as the brochure says?" Bill: "If the brochure says ‘Welcome!’ it’s probably a trap. But I’ve got the master key—borrowed from the janitor’s lost‑and‑found." Molly: "Lost‑and‑found? That’s where they keep the leftover clown noses and the half‑eaten cupcakes." Bill: "Exactly. Nothing says ‘secure’ like a pile of cotton candy wrappers." Molly: "So what’s the plan? Sneak in through the gift table or parachute in from the balloon arch?" Bill: "Parachutes are messy, and balloons will attract the kids. I vote for the gift table. We’ll pose as entertainers." Molly: "Entertainers? With what? My pink hair and a glittery cape?" Bill: "Your hair can be the ‘magical unicorn.’ I’ll be ‘Captain Distraction’ with this laser pointer." Molly: "Laser pointer? That’s from a TV remote, right?" Bill: "Exactly, we’ll point it at the kids, they’ll chase the red dot, and we’ll cart out the boxes." Molly: "And the cupcakes?" Bill: "We leave those. 
Kids love them. We’re thieves, not monsters." Molly: "Fine. What about the security guard? He looks like he’s auditioning for a ‘big brother’ drama." Bill: "He’s half‑asleep. I have a cup of decaf espresso in my pocket. One sip, and he’ll think he’s dreaming." Molly: "Decaf? That’s the best defense." Bill: "Trust me, the guard will be snoozing faster than a bedtime story." Molly: "Alright, I’ll grab the pink unicorn horn from the costume rack. You ready to jingle the bell?" Bill: "The bell? Oh, you mean the ‘shiny’ one on the cake table. I’ll ring it when we’re out. Gives us an alibi." Molly: "Brilliant. We’ll blame the noise on the clown’s squeaky shoes." Bill: "And if the kids notice missing presents?" Molly: "We’ll claim it’s a ‘gift redistribution program.’ The mayor’s idea." Bill: "The mayor’s idea? That’s bold." Molly: "Bold is our middle name. Remember, we’re the only adults who can get away with stealing candy." Bill: "And the presents. Don’t forget the presents." Molly: "One last thing—if we get caught, I’ll blame the clown." Bill: "Deal. And if the clown catches us, we’ll bargain for a free balloon animal." Molly: "Perfect. Now, let’s move before the first balloon pops."
created_at: "2026-04-24T17:48:50.699Z" (4/24/2026, 5:48:50 PM)
model: "openai/gpt-oss-120b"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 832
moderation_latency: (null)
generation_time: 384842
tokens_prompt: 77
tokens_completion: 4557
native_tokens_prompt: 129
native_tokens_completion: 4539
native_tokens_completion_images: (null)
native_tokens_reasoning: 3675
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0011412
router: (null)
provider_responses:
  0:
    endpoint_id: "3dc95f88-3960-4c91-b2bb-bbd2f5069c48"
    id: "bd086dfe59859fdf81adcfb2fcc7378c"
    is_byok: false
    latency: 591
    model_permaslug: "openai/gpt-oss-120b"
    provider_name: "Novita"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777052930-PQECITUl79aEwdDcl8oe"
session_id: (null)
api_type: "completions"
id: "gen-1777052930-Xf9PPRFLJ103kXSqzEOT"
upstream_id: "bd086dfe59859fdf81adcfb2fcc7378c"
total_cost: 0.0011412
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Novita"
response_cache_source_id: (null)
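The metadata above can be cross-checked against the run details with a little arithmetic. The sketch below is illustrative only and rests on two assumptions the page does not confirm: that generation_time is reported in milliseconds, and that native_tokens_reasoning is counted inside native_tokens_completion. The helper names format_duration and visible_tokens are hypothetical.

```python
def format_duration(milliseconds: int) -> str:
    """Render a millisecond count as 'Xm Ys', rounded to the nearest second.

    Assumption: generation_time is in milliseconds.
    """
    total_seconds = round(milliseconds / 1000)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


def visible_tokens(completion_tokens: int, reasoning_tokens: int) -> int:
    """Tokens left for the visible answer once reasoning tokens are subtracted.

    Assumption: reasoning tokens are included in the completion count.
    """
    return completion_tokens - reasoning_tokens


# Values copied from the metadata listing.
duration = format_duration(384842)
visible = visible_tokens(4539, 3675)
reasoning_share = 3675 / 4539

print(f"duration: {duration}")
print(f"visible completion tokens: {visible}")
print(f"reasoning share of completion: {reasoning_share:.1%}")
```

Under these assumptions the formatted duration agrees with the 6m 25s shown in the run details, and most of the completion tokens were spent on reasoning rather than on the scene itself.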
Evaluation details
Result | Evaluator | Details | Meta Data
0.01% | Matches word count | n/a | neededClean: false, words: 531
80.50% | Dialogue to Total Word Ratio | Ratio: 67.92%, Deviation: 2.08% | neededClean: false, wordsTotal: 533, wordsDialogue: 362
40.2574% | Overall
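The reported ratio and deviation can be reproduced from the word counts in the meta data. This is a minimal sketch, assuming the ratio is wordsDialogue divided by wordsTotal and the deviation is the absolute distance from the 70% target named in the scenario. It does not cover how the evaluator counts dialogue words or how it maps the deviation to the 80.50% score, since the page does not show either. The function names are hypothetical.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that are dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation_from_target(ratio: float, target: float = 0.70) -> float:
    """Absolute distance between the measured ratio and the target ratio.

    Assumption: the target is the 70% requested in the scenario.
    """
    return abs(ratio - target)


# Values copied from the evaluation details.
ratio = dialogue_ratio(words_dialogue=362, words_total=533)
deviation = deviation_from_target(ratio)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

With the counts above, the printed values match the "Ratio: 67.92%, Deviation: 2.08%" line in the details column.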