NC Bench
Run: 2026-04-24-01KQ0A0AY8S9QPAWDJA82XZGEJ
Run Details
Model: GPT-OSS 120B
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 1m 50s
Cost: $0.000915
Provider: DeepInfra
Evaluation summary
Matches word count: 93.65%
Dialogue to Total Word Ratio: 99.44%
Overall: 96.5467%
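The Overall figure looks consistent with an unweighted mean of the two evaluator scores. That is an inference from the displayed numbers, not something the page states, so the sketch below only checks whether the mean of the rounded scores lands within rounding distance of the reported value. The helper name `unweighted_mean` is hypothetical.

```python
# Scores exactly as displayed on the page (already rounded to two decimals).
scores = {
    "Matches word count": 93.65,
    "Dialogue to Total Word Ratio": 99.44,
}
reported_overall = 96.5467


def unweighted_mean(values):
    """Plain arithmetic mean. Assumed aggregation, not confirmed by the page."""
    values = list(values)
    return sum(values) / len(values)


mean_score = unweighted_mean(scores.values())

# Each displayed score is rounded to 0.01, so the mean of the displayed
# values can differ from the mean of the unrounded values by up to 0.005.
gap = abs(mean_score - reported_overall)

print(f"mean of displayed scores: {mean_score:.4f}")
print(f"gap to reported overall:  {gap:.4f}")
```

The gap is smaller than the rounding tolerance, which supports the unweighted-mean reading but does not prove it.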
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Rain hammered the cracked windshield of the van as Molly adjusted the cracked mirror, trying to see the cul‑de‑sac where the children's party was set for seven. Bill lounged in the passenger seat, chewing a stale granola bar and flicking plastic confetti from his jacket. "Molly, you think a toddler's birthday party is worth it?" Bill asked, his voice muffled by engine hum. "We've hit paintball arenas, antique dolls, even a giraffe sanctuary. Gifts are treasure," Molly replied, tapping the red dot on the map. "And the guests? Kids and cupcake‑armed parents?" "They'll be busy smashing cake." "Okay, here's the plan." She crouched behind a dinosaur‑shaped cake stand. "We go in through the side door, grab all the boxes, and meet back at the van in ten minutes." "What about security? Mom's big red hair." "We'll wear clown masks." Bill lowered his voice, eyes scanning the room. "Bill, the kids will be glued to the streamers, we can slip by unnoticed." Molly grinned. "Molly, the presents are stacked high, we need a cart to move them fast." "Don't forget the security camera near the cake table, we have to loop it." The van rolled to a stop behind a hedge of overgrown ferns. They slipped out, shoes muffled by damp grass, and approached the pastel‑painted two‑story house. A banner swayed: Happy 4th Birthday, Timmy! Balloons bobbed like oversized dandelions, and a low‑volume pop song thumped from a DJ booth. Inside, plastic tables and tiny chairs formed a maze. A mountain of gift‑wrapped boxes rose like a colorful avalanche. The scent of frosting hung heavy in the air. Molly slipped through the side door, her clown mask hiding her grin. Bill followed, ducking behind a tower of stuffed animals. He slipped a toy steamroller into his pocket, his grin widening. "The kids' birthday song will give us exactly three minutes, then chaos," Bill whispered, eyes darting to the clock. 
"If we get caught, we can claim we were hired as magicians," Molly muttered, loading a box of Lego bricks into her sack. A sudden squeal erupted from the corner where a toddler attempted a somersault off the couch. Parents rushed over, shouting “Watch out!” and “Don’t do that!” The distraction was perfect. "Quick, the exit’s this way," Molly hissed, pointing to a kitchen doorway. They hurried past the refreshment table, Bill pocketing a handful of candy canes. One more thing, the neighbor's dog barks at anything with a squeaky toy, Molly warned, shoving a noisy toy apart as they passed. "We're out in a flash," Bill said, sealing the sack. The two thieves slipped out the back, the party’s laughter fading behind them as the night swallowed the stolen treasure. The van idled in the driveway, its interior now stuffed with plush toys, gadgets, and a mountain of candy‑wrapped presents. Molly laughed, feeling the weight of their loot, her heart thudding in time with the fading bass of the party music. Bill turned the key, and the engine roared to life, carrying them away from the chaos they’d left behind.
created_at: "2026-04-24T17:53:35.701Z" (4/24/2026, 5:53:35 PM)
model: "openai/gpt-oss-120b"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 223
moderation_latency: (null)
generation_time: 108900
tokens_prompt: 77
tokens_completion: 4914
native_tokens_prompt: 114
native_tokens_completion: 4795
native_tokens_completion_images: (null)
native_tokens_reasoning: 4108
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.000915496
router: (null)
provider_responses:
  0:
    endpoint_id: "85835306-dba3-4334-87bc-ab9ffa5c9c62"
    id: "chatcmpl-RZPxr8lU43odV2SUMjRkysuz"
    is_byok: false
    latency: 76
    model_permaslug: "openai/gpt-oss-120b"
    provider_name: "DeepInfra"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777053215-SE1H8HDA5DTYoLPOdByQ"
session_id: (null)
api_type: "completions"
id: "gen-1777053215-BKTXLk7Zoxow6JfDiRSM"
upstream_id: "chatcmpl-RZPxr8lU43odV2SUMjRkysuz"
total_cost: 0.000915496
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
response_cache_source_id: (null)
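The token and timing fields above can be combined into a few derived figures. The following Python sketch assumes that `generation_time` is in milliseconds (108900 ms is about 1m 49s, which agrees with the 1m 50s duration shown in the run details) and that `native_tokens_reasoning` is a subset of `native_tokens_completion`. Neither assumption is stated on the page, and the function name `derive_generation_stats` is hypothetical.

```python
def derive_generation_stats(meta):
    """Derive throughput and reasoning share from the run metadata.

    Assumes generation_time is in milliseconds and that reasoning tokens
    are counted inside native_tokens_completion.
    """
    seconds = meta["generation_time"] / 1000.0
    completion = meta["native_tokens_completion"]
    reasoning = meta["native_tokens_reasoning"]
    return {
        "generation_seconds": seconds,
        "tokens_per_second": completion / seconds,
        "reasoning_share_pct": 100.0 * reasoning / completion,
        "visible_tokens": completion - reasoning,
    }


# Values copied from the metadata listing.
run_meta = {
    "generation_time": 108900,
    "native_tokens_completion": 4795,
    "native_tokens_reasoning": 4108,
}

stats = derive_generation_stats(run_meta)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions, most of the completion budget went to reasoning tokens, and only a few hundred tokens make up the visible scene.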
Evaluation details
Matches word count: 93.65%
  Details: n/a
  neededClean: false
  words: 509
Dialogue to Total Word Ratio: 99.44%
  Details: Ratio: 29.67%, Deviation: 0.33%
  neededClean: false
  wordsTotal: 509
  wordsDialogue: 151
Overall: 96.5467%
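The reported ratio and deviation can be reproduced from the word counts in the evaluation metadata. This Python sketch takes the 500-word and 30% targets from the scenario description. The function names are hypothetical, and how the evaluators turn these deviations into the 93.65% and 99.44% scores is not shown on the page, so that step is left out.

```python
TARGET_WORDS = 500          # from the scenario: "Write 500 words"
TARGET_DIALOGUE_PCT = 30.0  # from the scenario: "30% dialogue"


def dialogue_ratio_pct(words_dialogue, words_total):
    """Share of words that are dialogue, as a percentage."""
    return 100.0 * words_dialogue / words_total


def abs_deviation(actual, target):
    """Absolute distance from the target, in the same unit as the inputs."""
    return abs(actual - target)


# Counts copied from the evaluation metadata.
words_total = 509
words_dialogue = 151

ratio = dialogue_ratio_pct(words_dialogue, words_total)
ratio_dev = abs_deviation(ratio, TARGET_DIALOGUE_PCT)
word_dev = abs_deviation(words_total, TARGET_WORDS)

print(f"Ratio: {ratio:.2f}%, Deviation: {ratio_dev:.2f}%")
print(f"Word count off by {word_dev} words "
      f"({100.0 * word_dev / TARGET_WORDS:.1f}% of target)")
```

The first printed line matches the Details entry for the dialogue evaluator, which suggests the deviation is measured in percentage points against the 30% target.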