NC Bench
Run: 2026-04-24-01KQ07C0VCJ3ZYGGARXMPBVBZY
Run Details
Model:
GPT-OSS 120B
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
52.0s
Cost:
$0.000393
Provider:
DeepInfra
Evaluation summary
Score      Evaluator
0.00%      Matches word count
2.47%      Dialogue to Total Word Ratio
1.2355%    Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The night was a soft blanket of fog over the suburb, the only sounds the distant bark of a dog and the muffled giggle of a child from the house across the street. A single streetlamp flickered, casting a golden halo on the freshly cut grass of the Andersons' backyard, where a colorful banner read “Happy Birthday, Timmy!” Molly crouched behind a hedge, eyes scanning the scene, while Bill adjusted his backpack, the weight of the tools inside humming against his spine. "Molly, you see the two side doors?" Bill whispered, his voice thick with anticipation. "Yeah, the left one is unlocked—thanks to that careless nanny. The right one has a lock, but I brought a lockpick set. We’ll be in before they notice the cake." "Timing is everything," Molly replied, tapping her foot. "The clown leaves at twenty past eight, and the parents will be distracted with the piñata at nine. That two‑minute window is our sweet spot." "The kids' party is a gold mine of wrapped dreams," Bill chuckled, glancing at the mountain of gift boxes stacked near the gazebo. "Imagine the loot: action figures, video games, that limited‑edition dinosaur set—" "Molly, you got the black gloves?" "Right here," she said, slipping the leather pair on. "And the night‑vision goggles, just in case they turn the lights on early." "Good. I’ll rig the alarm bypass with this little jammer," Bill said, pulling out a sleek device. "It should keep the motion sensors asleep for ten minutes." "You think thirty seconds is enough to grab the biggest box?" Molly asked, eyes darting to a towering present wrapped in glittery blue paper. "Thirty seconds, maybe a minute if we move fast. We only need the top three, the rest can stay for the kids. That way they won’t notice the missing stuff until after the party." "The parents will think the kids just ate too much candy," Bill laughed. "We’ll be gone before the first ‘thank you’ speech." "Alright, plan recap," Molly said, pulling out a crumpled napkin. 
"Enter left door at 20:45, disable alarm by 20:46, collect the top three boxes by 20:48, exit right door at 20:49, vanish into the night. No trace." "Sounds flawless," Bill replied, nodding. "Just one thing—" "Molly, what if the clown decides to do a surprise magic trick and pulls a rabbit out of a hat right when we’re inside?" Bill joked, trying to keep the tension light. "Might as well be a magician with a wand," Molly shot back, smirking. "If he does, we’ll just pretend we’re part of the show. ‘Abracadabra, where’s the present?’" "They’ll never suspect a thing," Bill said, adjusting his cap. "Now, any last concerns?" "Molly stared at the sky, then whispered, "Only one—don’t let the kids notice the missing gifts. That would ruin the whole… festive mood." "Don’t worry," Bill said, tapping his pocket where the loot bag waited. "We’re the ultimate party crashers." With that, they slipped from their hiding spot, the night swallowing their silhouettes as they moved toward the glowing backyard, ready to rob a birthday celebration of its childhood treasures. The streetlights flickered one last time as they slipped away, leaving only a handful of tinsel and a faint scent of chocolate in the air.
created_at
"2026-04-24T17:07:32.856Z"
4/24/2026, 5:07:32 PM
model
"openai/gpt-oss-120b"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
240
moderation_latency
(null)
generation_time
51943
tokens_prompt
77
tokens_completion
2359
native_tokens_prompt
114
native_tokens_completion
2043
native_tokens_completion_images
(null)
native_tokens_reasoning
1520
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.000392616
router
(null)
provider_responses
0
endpoint_id
"85835306-dba3-4334-87bc-ab9ffa5c9c62"
id
"chatcmpl-RPQXIsP0p55Q3j8YKZOOub1y"
is_byok
false
latency
79
model_permaslug
"openai/gpt-oss-120b"
provider_name
"DeepInfra"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777050452-yV0YGlqzJ1VZxGO83f09"
session_id
(null)
api_type
"completions"
id
"gen-1777050452-KinOxIUPossiBazP2FyN"
upstream_id
"chatcmpl-RPQXIsP0p55Q3j8YKZOOub1y"
total_cost
0.000392616
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepInfra"
response_cache_source_id
(null)
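As a rough cross-check of the metadata above, the sketch below derives throughput and the reasoning share of the completion. It assumes generation_time is in milliseconds (51943 lines up with the 52.0s Duration shown under Run Details) and that native_tokens_reasoning is counted within native_tokens_completion. Neither assumption is confirmed on this page, and the function names are illustrative only.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Completion tokens produced per second of generation time."""
    return completion_tokens / (generation_time_ms / 1000.0)


def reasoning_share(reasoning_tokens: int, completion_tokens: int) -> float:
    """Fraction of completion tokens spent on reasoning.

    Assumes reasoning tokens are a subset of completion tokens.
    """
    return reasoning_tokens / completion_tokens


# Values copied from the Meta fields of this run.
throughput = tokens_per_second(2043, 51943)  # native_tokens_completion, generation_time
share = reasoning_share(1520, 2043)          # native_tokens_reasoning, native_tokens_completion

print(f"{throughput:.1f} tokens/s, {share:.1%} of completion tokens used for reasoning")
```

Under these assumptions the run generated roughly 39 tokens per second, with about three quarters of the completion tokens spent on reasoning rather than on the visible scene.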
Evaluation details
Result: 0.00%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean = false, words = 544

Result: 2.47%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 61.40%, Deviation: 8.60%
Meta Data: neededClean = false, wordsTotal = 544, wordsDialogue = 334

Overall: 1.2355%
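The reported ratio and deviation can be reproduced from the metadata. This is a minimal sketch, assuming the 70% target from the scenario, a deviation measured as the absolute difference in percentage points, and an Overall figure that is the plain mean of the two evaluator scores. The page does not state these formulas, and it does not show how the 2.47% score itself is derived from the deviation, so that step is left out.

```python
TARGET_DIALOGUE_PCT = 70.0  # from the scenario: "70% dialogue"


def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the whole text, as a percentage."""
    return 100.0 * words_dialogue / words_total


def mean_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores (an assumed aggregation rule)."""
    return sum(scores) / len(scores)


# wordsDialogue and wordsTotal from the evaluator metadata.
ratio = dialogue_ratio_pct(334, 544)
deviation = abs(TARGET_DIALOGUE_PCT - ratio)

# Evaluator scores as displayed on the page (already rounded).
overall = mean_score([0.00, 2.47])

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
print(f"Overall (from rounded scores): {overall:.3f}%")
```

The mean of the rounded scores comes to 1.235%, close to the displayed 1.2355%, which suggests the page may average unrounded values.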