NC Bench
Run: 2026-04-27-01KQ7A42Z261JWF30W4NMXVT8E
Run Details
Model: Qwen 3.6 Flash
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 53.7s
Cost: $0.015622
Provider: Alibaba
Evaluation summary
Score | Evaluator
3.62% | Matches word count
64.59% | Dialogue to Total Word Ratio
34.1071% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"You're sure the balloon penguin isn't a snitch?" Molly whispered, adjusting her invisible mask. "Babe, he's rubber. If he were a snitch, he'd hold a notepad, not a duck. Check the manifest. Forty-two boxes under the oak. No lasers, just gluten-free cupcakes. Our jam." "I don't care about the cupcakes. What's the exit? We can't walk out with a mountain bike and a robot dog. The host mom is juggling keys near the charcuterie. We need a distraction. The cake?" "Right. Cake distraction. You walk in with a novelty slice, trip, and release the confetti cloud. While they sneeze, I scoop the loot into the vortex bag. We leave through the doggy door." "The doggy door? One of us carries the trampoline ride-on. It's wide. We'll get wedged in the kennel. Two thieves stuck in a dog run waiting for a vet with a butter dish is not a heist; it's a plumbing error." "Fine. Balloon animal exit. You deflate, I deflate, we roll like sausages of justice." Molly wiped sweat from her sequined collar. The wind whipped her tattered cape around her knees as she peered over the low garden wall. Beyond the chain-link fence, the thumping bass of the children's party vibrated through the soles of her stolen Crocs. She shifted the weight of the grappling hook launcher on her shoulder, wincing as the metal dug into her shoulder blade. Bill crouched beside her, frantically tapping on a tablet taped to a hollowed-out sandwich. He paused, squinting at a line of code scrolling across the cracked screen, before shoving the device into his pocket and scanning the perimeter with binoculars that were clearly just toilet paper rolls painted silver. They exchanged a glance that communicated their shared lack of professional preparation and mutual belief that this would, miraculously, go according to plan. Molly checked her watch, noting the second hand twitching like a nervous spider, and gave a sharp nod to her partner. "Rolling is undignified. What about the rabbit?" "What rabbit? No decoys. 
We use the inflatable T-Rex. You ride the dinosaur, I drive the helicopter. We bolt while the kids think the entertainment paid an early visit." "The van is too small. We wanted the hearse. Mom's hearse has coffin legroom. We stuff the presents in the back, pretend we're delivering a fatality of extreme joy." "The hearse has a siren. Cops show up. Cops want paperwork. You hate paperwork. You filled the ice cream form wrong last time and held us in a cell for three hours because the warden thought you were a rogue dairy farmer." "I said dairy manager! And the hearse is perfect. Social contract law. No one stops a hearse; they move for death. Stealth mode." "Can you lock the siren?" "I hot-wired it to play 'Happy Birthday' backwards. Disorienting but non-police. Psychological warfare for toddlers. Just say the word, partner." Bill grinned, revealing a cavity shaped like a dollar sign. "Bill, don't drop the bag!" "Only if I'm the magician." "Deal. You're the magician. I'll draw mustaches until they flee. That's the real magic. Parents are toasting. Cue. One. Two." "Wait, three?" "Three! Roll!"
created_at
"2026-04-27T11:10:19.624Z"
4/27/2026, 11:10:19 AM
model
"qwen/qwen3.6-flash"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
617
moderation_latency
(null)
generation_time
53646
tokens_prompt
77
tokens_completion
9755
native_tokens_prompt
76
native_tokens_completion
10402
native_tokens_completion_images
(null)
native_tokens_reasoning
9654
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.015622
router
(null)
provider_responses
0
endpoint_id
"2435bec9-c808-4f27-9fc8-12f3460d7b66"
id
"chatcmpl-56386d44-9de3-9b8b-bddd-97548899bc0e"
is_byok
false
latency
617
model_permaslug
"qwen/qwen3.6-flash"
provider_name
"Alibaba"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777288219-OCtKwn3r4XPG5np9t8HO"
session_id
(null)
api_type
"completions"
id
"gen-1777288219-V5DJcOZ4FLhBuqLkhOX8"
upstream_id
"chatcmpl-56386d44-9de3-9b8b-bddd-97548899bc0e"
total_cost
0.015622
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
response_cache_source_id
(null)
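The token and timing fields above can be combined into a few derived figures. The following is a minimal sketch, not something the benchmark itself reports. It assumes that `generation_time` is in milliseconds (53646 lines up with the 53.7s duration in Run Details) and that `native_tokens_reasoning` is counted inside `native_tokens_completion`. The variable names are illustrative.

```python
# Values copied from the metadata above.
native_tokens_completion = 10402
native_tokens_reasoning = 9654
generation_time_ms = 53646  # assumption: milliseconds

# Assumption: reasoning tokens are a subset of the completion tokens,
# so the remainder is the visible answer.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Share of the completion spent on reasoning rather than on the scene itself.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Overall generation speed in native tokens per second.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumptions, most of the completion budget went to reasoning and only a small fraction to the 500-word scene.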
Evaluation details
Result | Evaluator | Details | Meta Data
3.62% | Matches word count | n/a | neededClean: false, words: 524
64.59% | Dialogue to Total Word Ratio | Ratio: 67.04%, Deviation: 2.96% | neededClean: false, wordsTotal: 531, wordsDialogue: 356
34.1071% | Overall
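The reported ratio and deviation follow from the word counts in the meta data: 356 dialogue words out of 531 total, measured against the 70% target in the scenario. The sketch below reproduces that arithmetic. The helper `count_dialogue_words` is hypothetical and assumes dialogue is any text between straight double quotes, which may not be how the evaluator actually tokenizes. The mapping from deviation to the 64.59% score is not shown on this page and is not modelled here.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Hypothetical counter: words inside straight double quotes."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)


def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Dialogue words as a percentage of all words."""
    return 100.0 * words_dialogue / words_total


def deviation_pct(ratio_pct: float, target_pct: float = 70.0) -> float:
    """Absolute distance from the target ratio, in percentage points."""
    return abs(target_pct - ratio_pct)


# Counts taken from the evaluation details above.
ratio = dialogue_ratio_pct(356, 531)
dev = deviation_pct(ratio)
print(f"Ratio: {ratio:.2f}%, Deviation: {dev:.2f}%")
# prints "Ratio: 67.04%, Deviation: 2.96%"
```

The overall figure of 34.1071% is close to the plain average of the two evaluator scores (3.62% and 64.59%), which suggests an unweighted mean of unrounded values, though the page does not state this.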