NC Bench
Run: 2026-03-25-01KMJ7SGZT4ATARY7ESP6QQ98Y
Run Details
Model:
Qwen 3 32B
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
19.1s
Cost:
$0.000447
Provider:
DeepInfra
Evaluation summary
Score      Evaluator
0.08%      Matches word count
5.88%      Dialogue to Total Word Ratio
2.9824%    Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly leaned back in her creaky office chair, lollipops dangling from her ears like earrings, and spun slowly while reading a flyer for “Toddler Topsy-Turvy Birthday Bash: 50 Ft. Bounce House, 100% Edible Decor, Zero Chaperones.” Bill, meanwhile, paced like a caged weasel, his trainers squeaking against the linoleum of their rat-infested hideout. “So let me get this straight,” he said, clutching a stress ball shaped like a frowny alien. “We’re breaking into a *kids’ birthday party*? The target is *presents*? Like, wrapped in *dinosaur diapers*? What if they have *adults*?” “They’ll be busy chugging punch and debating whether ‘Baby Shark’ is a lullaby or a war cry,” Molly said, not looking up. “Besides, it’s the *presents* we want. Think of it like a mall heist, but with more glitter and less sad Santas.” Bill froze. “You’re comparing this to a *mall heist*. What if we get *recognized*?” “We’ll wear costumes. The party’s theme is ‘Magical Forest Adventure.’ There’s a *mascot*, Bill. We’ll be the *ones* in the mascot suits. Easy in, easy out.” Bill gaped. “A *mascot*? You mean that… that *seven-foot unicorn* they hired? What if I suffocate? What if I accidentally inhale a balloon? I’ve read about people *dying*, Molly!” “Relax,” Molly said, tossing him a lollipop. “The real unicorn’s taking a lunch break at 3 PM. We’ll swap in, collect the loot, and vanish before the cake toss. Simple.” Bill sucked on the candy, his face twitching. “And the *security*? What if they scan the presents? What if they have *tracking devices*? What if they’ve got a *clueless toddler* who sees us and screams, ‘MOMMY, THOSE UNICORNS ARE STEALING MY STUFF!’?” Molly blinked. “First off, tracking devices in *kids’ toys*? That’s *creepy*. Second, if a kid shouts, we’ll just… *hum the party anthem*.” She whistled a jaunty tune. “‘Rainbow Rump-Rozzie-Roo!’ It’s a lullaby. Everyone zones out.” Bill hyperventilated into the stress ball. “And the *loot*? 
We can’t exactly waltz out of a bounce house wearing a *pony suit* full of toys!” “We use the *distraction*. The piñata. We ‘accidentally’ whack it open, chaos ensues, parents charge in with candy-filled jaws, and we slip out the fire exit. Classic.” Bill stared. “You’re telling me we’re going to punch a piñata for *fun*. With *kids*. In a room full of *witnesses*. And you call that *classic*?” Molly smiled. “Bill, I’ve called a lot of things ‘classic.’ Trust me. By 3:15, we’ll be sipping champagne in a *different city*. And by 3:30, we’ll be opening our *new* stockpile of organic sippy cups and board games. Technically, we’re just… *recycling*.” Bill glanced at the unicorn costume in the corner of the room. It looked back. “What if I *pee* in the thing?” “Then we’ll have *double* the loot. Go on,” Molly said, shoving him toward the suit. “Change into Sparkles. The kids will *love* you.” “And if I *don’t*?” Molly’s smile didn’t waver. “Then we’re just two grown adults eating lollipops in a room full of Legos. *That’s* the plan.” Bill groaned. “Fine. But if I die, I want my corpse buried in a *Bender the Alien* stress ball.” “Deal. Now let’s go make some *childish dreams come true*.”
created_at
"2026-03-25T10:14:09.662Z"
3/25/2026, 10:14:09 AM
model
"qwen/qwen3-32b-04-28"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
104
moderation_latency
(null)
generation_time
19029
tokens_prompt
93
tokens_completion
1679
native_tokens_prompt
73
native_tokens_completion
1574
native_tokens_completion_images
(null)
native_tokens_reasoning
778
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00044656
router
(null)
provider_responses
0
endpoint_id
"6b8c829d-3094-45e7-8139-0a67e09060c3"
id
"cmpl-4f283002127d414199faf576e2ceecfb"
is_byok
false
latency
47
model_permaslug
"qwen/qwen3-32b-04-28"
provider_name
"DeepInfra"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1774433649-R4luwcNWDrSvkOm7jZ3b"
upstream_id
"cmpl-4f283002127d414199faf576e2ceecfb"
total_cost
0.00044656
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepInfra"
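The token and timing fields above can be combined into two rough derived figures: generation throughput and the share of completion tokens spent on reasoning. The sketch below is illustrative only. It assumes that generation_time is in milliseconds (which is consistent with the 19.1s duration reported for the run) and that native_tokens_reasoning is counted inside native_tokens_completion; neither assumption is confirmed by the page itself.

```python
# Values copied from the run metadata above.
generation_time_ms = 19029        # assumed to be milliseconds
native_tokens_completion = 1574
native_tokens_reasoning = 778     # assumed to be a subset of completion tokens

# Throughput: completion tokens produced per second of generation time.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Share of completion tokens used for reasoning rather than the visible answer.
reasoning_share = native_tokens_reasoning / native_tokens_completion

print(f"Throughput: {tokens_per_second:.1f} tokens/s")
print(f"Reasoning share: {reasoning_share:.1%}")
```

Under these assumptions, roughly half of the completion tokens went to reasoning rather than to the scene text that the evaluators scored.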
Evaluation details
Result: 0.08%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean = false, words = 529

Result: 5.88%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 77.53%, Deviation: 7.53%
Meta Data: neededClean = false, wordsTotal = 534, wordsDialogue = 414

Overall: 2.9824%
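The ratio and deviation reported by the Dialogue to Total Word Ratio evaluator can be reproduced from its two word counts. The sketch below is a minimal reconstruction of that arithmetic, not the evaluator's actual code: the function names are hypothetical, the 70% target is taken from the scenario description, and the deviation is assumed to be the absolute difference from that target. How the deviation is turned into the 5.88% score is not shown on the page, so it is not modelled here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that fall inside dialogue (hypothetical helper)."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation_from_target(ratio: float, target: float = 0.70) -> float:
    """Absolute distance from the target ratio (assumed definition)."""
    return abs(ratio - target)


# Counts taken from the evaluator's metadata for this run.
ratio = dialogue_ratio(words_dialogue=414, words_total=534)
deviation = deviation_from_target(ratio)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

With these inputs the printed values agree with the reported details line (77.53% and 7.53%). The two evaluators report different totals (529 and 534 words), which suggests they tokenize the response slightly differently.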