NC Bench
Run: 2026-03-25-01KMJ8J0VC347RFDNZXQ1VD6CW
Run Details
Model: Qwen3 235B A22B Instruct 2507
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 37.5s
Cost: $0.000487
Provider: Parasail
Evaluation summary
| Score | Evaluator |
| --- | --- |
| 2.01% | Matches word count |
| 0.00% | Dialogue to Total Word Ratio |
| 1.0058% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The alley behind Bounce ‘n’ Gloom Party Emporium smelled faintly of stale cotton candy and existential dread. Molly crouched behind a dumpster, squinting at the building’s back window. Bill adjusted his neon-green party hat and chewed on a deflated balloon. “You’re wearing the hat *now*?” Molly whispered. “It’s camouflage,” Bill whispered back, puffing up the balloon again with a wheeze. “We’ll blend in. We’re party professionals.” “We’re thieves pretending to be clowns.” “Well, one of us is. You look like a disgruntled hall monitor.” Molly tugged at her black turtleneck. “This is practical. You look like a piñata that got rejected for being too depressing.” “You’re just mad because I found the glittery cape.” “You can’t wear a glittery cape. You’ll set off motion sensors. Or worse—kids will want to hug you.” “So? I’m great with kids. I taught my cousin’s hamster to high-five.” “That’s not a skill, Bill. That’s a cry for help.” “Point is, we go in during the cake chant. That’s when everyone shouts, eyes closed. We’ll be ghosts in a sea of sugar rage.” “You’re assuming they close their eyes. Kids don’t blink during cake. They’re like tiny, sugar-fueled hawks.” “Then we wear masks.” “We can’t wear masks at a children’s party. We’ll traumatize them. Then we’ll have angry parents *and* child psychologists after us.” “So… no masks. But I *did* bring noise-canceling earmuffs. In case they do that shriek thing.” “The shriek thing? Bill, we’re robbing a birthday, not infiltrating a banshee colony.” “They’re unpredictable, Molly! One minute they’re blowing on a kazoo, the next they’re sobbing because the cupcakes aren’t blue enough. We need contingency plans.” “Contingency plan: don’t make eye contact, move fast, and for God’s sake, if a game of ‘Pin the Tail on the Donkey’ starts, *run*.” “Agreed. Also, I’ve studied the layout. Presents are piled under the ‘Joy Arch,’ near the ball pit.” “Which is code for biohazard zone.” “I brought hand sanitizer. 
In a clown squirt flower.” “You’re a menace.” “I’m prepared.” “How many presents are we talking?” “At least twenty. Cash, gift cards, one of those robotic dogs that growls when you scold it—solid resale.” “And if the parents are still there?” “Then you do your ‘worried aunt’ routine. Works every time.” “I’m not doing the ‘worried aunt’ routine. Last time, I got roped into explaining the birds and the bees to a seven-year-old.” “Tell them the stork has union benefits now. Keep it brief.” “We stick to the plan. In and out before ‘The Wobble Worm.’ That’s our window.” “Got it. Cake chant, grab gifts, exit before interpretive parachute play.” “And if a magician shows up?” “Magicians are our allies. They misdirect for a living. I’ll toss him a twenty, we’re golden.” “You’re bribing a children’s magician?” “Priorities, Molly.” A high-pitched voice suddenly echoed from inside: “And—NOW—EVERYONE SING!” Molly ducked lower. “That’s it. They’re cutting the cake!” Bill adjusted his flower and cracked his knuckles. “Time to be party heroes.” “You mean party *villains*.” “Semantics. You ready?” Molly took a deep breath. “Only if you promise not to start a conga line.” “I can’t promise that. Joy is my kryptonite.”
created_at: "2026-03-25T10:27:32.336Z" (3/25/2026, 10:27:32 AM)
model: "qwen/qwen3-235b-a22b-07-25"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 157
moderation_latency: (null)
generation_time: 37377
tokens_prompt: 77
tokens_completion: 978
native_tokens_prompt: 72
native_tokens_completion: 800
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0004872
router: (null)
provider_responses:
  0:
    endpoint_id: "043eb28f-fc28-4b0d-9800-b249fcfcfbf9"
    id: "chatcmpl-9d44bafcc4f84526"
    is_byok: false
    latency: 157
    model_permaslug: "qwen/qwen3-235b-a22b-07-25"
    provider_name: "Parasail"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1774434452-j4Mk58wgqq0W1OpC3jOh"
upstream_id: "chatcmpl-9d44bafcc4f84526"
total_cost: 0.0004872
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Parasail"
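The timing fields above can be related to the reported 37.5s duration with a little arithmetic. The sketch below is illustrative only: it assumes that `latency` and `generation_time` are both in milliseconds and that the displayed duration is roughly their sum, neither of which is stated on this page. The helper names are hypothetical.

```python
# Values copied from the run metadata above.
# ASSUMPTION: both timing fields are in milliseconds.
latency_ms = 157
generation_time_ms = 37377
native_tokens_completion = 800


def wall_clock_seconds(latency_ms: int, generation_time_ms: int) -> float:
    """Approximate end-to-end time, assuming duration = latency + generation time."""
    return (latency_ms + generation_time_ms) / 1000


def tokens_per_second(tokens: int, generation_time_ms: int) -> float:
    """Completion throughput over the generation phase only."""
    return tokens / (generation_time_ms / 1000)


duration_s = wall_clock_seconds(latency_ms, generation_time_ms)
throughput = tokens_per_second(native_tokens_completion, generation_time_ms)

print(f"{duration_s:.1f}s")           # 37.5s
print(f"{throughput:.1f} tokens/s")   # 21.4 tokens/s
```

Under these assumptions the sum reproduces the 37.5s shown in the run details, and the native completion count works out to a little over 21 tokens per second.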
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 2.01% | Matches word count | n/a | neededClean: false, words: 525 |
| 0.00% | Dialogue to Total Word Ratio | Ratio: 86.65%, Deviation: 16.65% | neededClean: false, wordsTotal: 532, wordsDialogue: 461 |
| 1.0058% | Overall | | |
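The ratio and deviation reported for the dialogue evaluator follow directly from the two word counts. The sketch below reproduces them, taking the 70% target from the scenario. The function name is hypothetical, and the last step assumes the overall score is the plain mean of the two evaluator scores. That is not confirmed by this page, which also does not show how either evaluator turns its measurement into a percentage score.

```python
TARGET_DIALOGUE_RATIO = 0.70  # from the scenario: "70% dialogue"


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that fall inside dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


# Counts copied from the evaluator metadata above.
ratio = dialogue_ratio(words_dialogue=461, words_total=532)
deviation = abs(ratio - TARGET_DIALOGUE_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 86.65%, Deviation: 16.65%

# ASSUMPTION: the overall score is the unweighted mean of the evaluator scores.
evaluator_scores = [2.01, 0.00]  # displayed (rounded) percentages
overall = sum(evaluator_scores) / len(evaluator_scores)
print(f"Overall (assumed mean): {overall:.4f}%")
```

With the displayed scores the assumed mean comes to about 1.005%, close to the reported 1.0058%. The small gap would be consistent with the 2.01% figure having been rounded for display.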