NC Bench
Run: 2026-02-16-01KHK3C9Q6YRZR0JCCVCXD9NSZ
Run Details
Model: Qwen 3.5 397B A17B
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 3m 56s
Cost: $0.028896
Provider: Alibaba
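The Duration above can be cross-checked against the raw metadata recorded for this run (generation_time 236028, native_tokens_completion 8014, native_tokens_reasoning 7286). The sketch below is only a sanity check, not part of the benchmark: it assumes generation_time is in milliseconds and that the reasoning tokens are counted inside native_tokens_completion. The helper name format_duration is hypothetical.

```python
# Values copied from this run's metadata fields.
generation_time_ms = 236028        # assumed to be milliseconds
native_tokens_completion = 8014
native_tokens_reasoning = 7286     # assumed to be a subset of the completion tokens


def format_duration(ms: int) -> str:
    """Render a millisecond count as 'Xm Ys', dropping the fractional second."""
    minutes, seconds = divmod(ms // 1000, 60)
    return f"{minutes}m {seconds}s"


duration = format_duration(generation_time_ms)
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(duration)                    # matches the Duration field of the run
print(visible_tokens)              # completion tokens left after reasoning
print(f"{reasoning_share:.1%}")    # share of completion tokens spent on reasoning
print(f"{tokens_per_second:.1f}")  # rough throughput in tokens per second
```

Under these assumptions, roughly nine in ten completion tokens went to reasoning, which would explain why a scene of about 500 words took almost four minutes to generate.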
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the high-powered binoculars, squinting intently at the inflatable castle glowing ominously in the suburban dusk. The plastic turret squeaked softly in the wind. "Target acquired," she whispered, voice tight with professional gravitas. Bill chewed on a wooden toothpick, leaning back in the cramped van seat. The engine ticked as it cooled down nearby. "That's a bouncy house, Molly. Not Fort Knox." "Intel suggests high-value loot. LEGO sets, branded sneakers, thick cash envelopes." "You want us to rob a six-year-old's birthday party?" "I want us to retire. Look at that security perimeter." Bill laughed, a dry, rasping sound. "It's a clown. The entire security detail is a single clown." "Clowns are unpredictable. They have balloons. Distraction potential is statistically high." "This is insane. We hit banks. We hit jewelry stores. We don't hit toddlers." "Toddlers don't have alarm systems connected directly to police dispatch," Molly countered, tapping the dirty dashboard. She wore tactical gloves despite the warmth. "They have parents. Angry, sleep-deprived parents. Those are worse than cops." "Weakness identified. Parents will be distracted by cheap wine and awkward small talk." Bill sighed heavily, wiping grease from his hands onto his tight black turtleneck. His reflection stared back from the rearview mirror. "What's the extraction plan?" "We blend in. You wear the mascot costume. I work the cake table." "I am absolutely not wearing a giant duck suit." "It's a dragon. And it's bulky. Perfect for concealing PlayStation 5s." "I look ridiculous in spandex, Mol. You know my thighs chafe something terrible." "Focus, Bill. Think of the retirement fund. Think of the white sand beaches." "I'm thinking of prison. Specifically, the yard time with people who hate child thieves." "Technically, we are liberating assets from unused inventory." "They unwrapped them. They own them now legally." "Possession is nine-tenths of the law. The other tenth is brute force." 
Bill looked out the dirty window at a mother chasing a small child with a napkin. The suburban street remained quiet otherwise. "That woman looks fierce." "She's exhausted. Her reflexes are significantly compromised." "And the clown is juggling actual chainsaws." "Wait, what?" Bill pointed a calloused finger. "See? Fire breathing too." Molly lowered the binoculars, frowning deeply. "That violates safety code four-oh-four." "It violates the sanity code. Let's go rob a casino instead." "Casinos have cameras everywhere. Kids have sticky fingers. We can lose them in the crowd." "I'm not fighting a ten-year-old for a Nerf gun." "You have tactical training." "I have a bad knee and a functioning conscience." "Fine. We split the loot seventy-thirty." "I want sixty. I'm wearing the dragon suit." "Deal. Gear up. Party starts in ten minutes." Bill groaned, reaching for the large spandex bag. The zipper sounded loud in the silence. "I hate children." "Love the money, Bill. Just love the money." Molly checked her watch, smiling as the castle lights flickered. Shadows lengthened across the empty lawn.
created_at
"2026-02-16T11:33:07.668518+00:00"
2/16/2026, 11:33:07 AM
model
"qwen/qwen3.5-397b-a17b-20260216"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
847
moderation_latency
(null)
generation_time
236028
tokens_prompt
77
tokens_completion
7700
native_tokens_prompt
76
native_tokens_completion
8014
native_tokens_completion_images
(null)
native_tokens_reasoning
7286
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.028896
router
(null)
provider_responses
0
id
"chatcmpl-e0aaf74e-f5b6-9d3e-8354-a345d304e8e1"
status
200
is_byok
false
latency
847
endpoint_id
"180484a2-e24b-495a-ba41-b34b680d5a9f"
provider_name
"Alibaba"
model_permaslug
"qwen/qwen3.5-397b-a17b-20260216"
api_type
"completions"
id
"gen-1771241350-r4ZPe98fpELLX2HtLWMf"
upstream_id
"chatcmpl-e0aaf74e-f5b6-9d3e-8354-a345d304e8e1"
total_cost
0.028896
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 97.6276% | Matches word count | n/a | neededClean: false; words: 493 |
| 81.0702% | Dialogue to Total Word Ratio | Ratio: 67.95%, Deviation: 2.05% | neededClean: false; wordsTotal: 493; wordsDialogue: 335 |
89.3489%
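The dialogue figures reported here (335 dialogue words out of 493, ratio 67.95%, deviation 2.05% from the 70% target) can be reproduced with a few lines of arithmetic. The sketch below is an illustration, not the evaluator's actual code. It assumes words are whitespace-separated tokens and that dialogue is any text between straight double quotes. All function names are hypothetical. How the evaluator converts the deviation into its 81.0702% score is not stated on this page, so that step is not modelled.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue marker)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_pct: float = 70.0) -> tuple[float, float]:
    """Return (dialogue ratio in percent, absolute deviation from target)."""
    ratio_pct = 100.0 * words_dialogue / words_total
    return ratio_pct, abs(target_pct - ratio_pct)


# Small check on one exchange taken from the model response.
sample = 'Bill chewed on a toothpick. "That\'s a bouncy house, Molly. Not Fort Knox."'
print(count_dialogue_words(sample), count_words(sample))

# Figures from the evaluation metadata of this run.
ratio, deviation = ratio_and_deviation(words_dialogue=335, words_total=493)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# The final 89.3489% equals the arithmetic mean of the two evaluator results.
overall = (97.6276 + 81.0702) / 2
print(f"{overall:.4f}%")
```

A real evaluator may tokenize differently, for example by handling curly quotes or contractions in its own way, so counts on other texts could differ from this sketch.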