NC Bench
Run: 2026-03-25-01KMJ8YMPYCYF7QSRR4VZFWHRC
Run Details
Model: Qwen3 235B A22B Instruct 2507
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 39.3s
Cost: $0.000475
Provider: Novita
Evaluation summary
Score | Evaluator
0.00% | Matches word count
28.65% | Dialogue to Total Word Ratio
14.3252% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The alley behind Chuckles & Balloons Party Palace reeked of expired cake frosting and existential dread. Molly crouched behind a dumpster, sketching on a napkin with a sparkly pink pen she’d stolen from a five-year-old’s birthday goody bag. “Okay,” she whispered. “Front door’s guarded by a clown with a suspiciously stiff walk. I’m calling him Rubberneck.” Bill adjusted his fake mustache, which instantly peeled off and stuck to his chin like wet gum. “You think he’s ex-military?” “He’s wearing glitter suspenders, Bill. His tactical advantage is a honking nose.” “So we go in through the back? Climbing up the cake slide?” “The cake slide collapses under anything over sixty pounds. Which excludes you, by approximately forty pounds.” “I’m in peak condition!” Bill huffed. “I did twelve push-ups last Tuesday.” “And then ate a family-sized lasagna.” “That’s cardio!” he snapped. “Look, we’re not getting through the kitchen—those tiny hands open soda cans like grenades. We need a distraction.” Molly tapped her chin. “What if… I go in as a party entertainer?” “You can’t juggle. You dropped a tennis ball once and broke a museum.” “I’ll be the ‘Sad Princess.’ They love Sad Princess. She cries softly while singing about her missing pony. Kids eat it up.” “Adults call the police.” “Then you be the ‘Fun Uncle.’ You’ve got that aura.” “What aura?” “The one that says, ‘I probably shouldn’t be near children, but here I am anyway.’” Bill straightened. “I resemble that remark.” “Good. You’ll bring a giant gift basket—full of fake presents rigged with expanding foam. When the kids tear them open, pop—everything’s stuck. While they’re screaming about their ruined Barbies, I swipe the real loot.” “Why not just grab the presents during the foam chaos?” “Because,” Molly said pointedly, “a three-year-old with a glue gun once took down Interpol.” “Fair.” “And no eating the cupcakes this time.” “I wasn’t eating them! I was… taste-testing for poison!” “That was three parties ago, Bill. 
Let it go.” A muffled kazoo rendition of “Happy Birthday” erupted from inside. “We’re running out of time,” Molly muttered. “The piñata’s at three, and after that, naptime. No naps. No thievery.” “What if I distract them with magic tricks?” “You tried that at the last one. You made a rabbit disappear—permanently.” “He was asking for it.” “Stick to the plan. You enter at 2:47, right after the limbo contest. I’ll be waiting near the treasure chest full of dollar-store prizes.” “And the foam?” “In the bowling trophy under the fake fern. Don’t touch the fern. It squeaks.” Bill frowned. “Why does a fern squeak?” “It’s fake, Bill. And haunted.” “Of course it is.” They bumped fists—Molly’s gloved, Bill’s covered in what looked like chocolate but was probably regret. “One more thing,” Molly said. “What?” “If you see a kid in a T-Rex costume… run.” “Why?” “Because last time, that kid bit through a steel dog tag.” Bill paused. “Was it the same kid?” “No. But the costume’s cursed.” Bill nodded, solemn. “Then no T-Rex. Got it.” Another burst of squeaky music. The clown stepped outside, lit a tiny cigarette, and exhaled sparkles. Molly checked her watch. “Showtime.” They crept forward, Bill struggling to reattach his mustache with cake icing. “Remind me again why we rob kids’ parties?” “Because,” Molly said, “no one ever suspects the Sad Princess.”
created_at: "2026-03-25T10:34:25.89Z" (3/25/2026, 10:34:25 AM)
model: "qwen/qwen3-235b-a22b-07-25"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 563
moderation_latency: (null)
generation_time: 39293
tokens_prompt: 77
tokens_completion: 1029
native_tokens_prompt: 72
native_tokens_completion: 808
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00047512
router: (null)
provider_responses:
  0:
    endpoint_id: "cf186489-d252-4fb7-aca9-87ef7b557eff"
    id: "2a63082f4cf51a35c81cba1ef8ae3365"
    is_byok: false
    latency: 563
    model_permaslug: "qwen/qwen3-235b-a22b-07-25"
    provider_name: "Novita"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1774434865-7dZjT3sv0KM9ebmhutRZ"
upstream_id: "2a63082f4cf51a35c81cba1ef8ae3365"
total_cost: 0.00047512
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Novita"
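The Duration shown under Run Details (39.3s) lines up with the generation_time value (39293), which suggests that field is reported in milliseconds. A minimal sketch, under that assumption, of deriving the duration and an approximate output throughput from the metadata values; the function name `summarize_generation` is hypothetical and is not part of NC Bench or any provider API.

```python
def summarize_generation(generation_time_ms: int, completion_tokens: int) -> dict:
    """Derive duration in seconds and throughput in tokens per second.

    Assumption: generation_time is reported in milliseconds.
    """
    duration_s = generation_time_ms / 1000
    return {
        "duration_s": round(duration_s, 1),
        "tokens_per_s": round(completion_tokens / duration_s, 2),
    }


# Values taken from the metadata: generation_time and native_tokens_completion.
summary = summarize_generation(generation_time_ms=39293, completion_tokens=808)
print(summary)
```

The native completion token count (808) is used here rather than tokens_completion (1029); which of the two is the better basis for throughput depends on the tokenizer one wants to compare against.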
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 556
28.65% | Dialogue to Total Word Ratio | Ratio: 75.00%, Deviation: 5.00% | neededClean: false; wordsTotal: 564; wordsDialogue: 423
14.3252% | Overall
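The figures in the evaluation details can be cross-checked by hand: 423 dialogue words out of 564 total is 75.00%, which is 5.00 points from the 70% target in the scenario, and the overall figure is consistent with the mean of the two evaluator scores. A minimal sketch of that arithmetic follows. The function names are hypothetical, the 0.70 target is taken from the scenario text, and the mapping from a 5.00% deviation to a 28.65% score is not stated on this page, so it is not reproduced here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that appear inside dialogue."""
    return words_dialogue / words_total


def deviation_from_target(ratio: float, target: float = 0.70) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


def overall_score(scores: list[float]) -> float:
    """Assumption: the overall score is the plain mean of evaluator scores."""
    return sum(scores) / len(scores)


ratio = dialogue_ratio(words_dialogue=423, words_total=564)
deviation = deviation_from_target(ratio)
overall = overall_score([0.00, 28.65])

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
print(f"Overall: {overall:.3f}%")
```

The mean of the rounded scores is 14.325%, while the page reports 14.3252%; the small difference is what one would expect if the ratio evaluator's score is stored with more precision than the two decimals displayed.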