NC Bench
Run: 2026-04-27-01KQ74N116CJET5HNN4M52198D
Run Details
Model: Qwen 3.6 27B
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 2m 28s
Cost: $0.027967
Provider: Alibaba
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.00%
Overall: 0.0000%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly tightened the strap on her tactical visor, the rubber digging into temples slick with nervous sweat. She knelt beside a sprinkler shaped like a grinning rubber duck, her fingers deftly adjusting the hydraulic pressure on a grapple gun disguised as a toy rocket. The device whirred uneasily, the motor choking on a crumb from a morning bagel before the suction cup extended with a wet pop, latching onto a garden hose. Bill crouched ten feet away, using a plastic flamingo to stabilize a tablet strapped to his knee. His matte-black stealth suit was ruined by a neon yellow sash reading Cool Aunt Bill that was rapidly peeling off due to poor adhesive, revealing a Worlds Okayest Uncle sticker underneath that he frantically concealed with a smudge of mud. Check your six, B, Molly hissed, pointing a gloved finger at the sliding glass door. Perimeter sweep complete, but threat level is spiking. That's a golden retriever wagging its tail near the fruit punch station, and it's eyeing my boots like I'm a walking steak. Bill tapped his earpiece, his voice dropping to a gravelly whisper. Copy. Canine asset noted. I have freeze-dried liver on the belt, but we hold fire unless he breaches. Right now, he's fixated on a toddler in a superhero cape who is flapping around the lawn like a broken lawn ornament. Distraction is effective, for now. Molly nodded, her gaze drifting to a massive bouncy castle looming over the fence, vibrating with the muffled shrieks of delighted horror. She checked her watch, the face painted with cartoon stars that obscured the seconds hand. A robotic unicorn on Bill's tablet screen flashed a battery warning, which he swiped away to reveal a live feed of the interior. A child in a princess costume currently sat atop the gift table, inspecting a box with intense suspicion. Bill zoomed in, his thumb sweating as the threat assessment updated on the display. 
We have a forty-five-minute window, Molly whispered, her breath fogging the visor until she blinked the anti-fog condenser off to save power. The balloon animal station opens at four, drawing the small targets away from the loot. If we miss that mark, we're wrestling over a cake. Nobody wins a cake fight, Bill. Bill grimaced, wrestling the sash back into place as it slipped, exposing the mud smudge. Agreed. I've mapped the stack. The LEGO sets are too heavy and scream high-profile risk. We focus on plush toys and board games. Soft targets. Low noise. High resale value on the orphan auction block. Molly raised a hand, her expression souring as she studied the reflection in the window. Red bows clustered near the juice box fortress like a minefield of disappointment. I'm seeing red bows. Mark my words, Bill. Red bow is parent code for hand-knitted socks or educational vegetables. Touch a red bow, and we walk back empty-handed with a moral lecture etched into our souls. Bill shuddered, unclipping a bungee cord from his waist. Red bows are contraband. I'd rather swallow a glow stick than unpack woolen leg warmers. What about the big box? The silver one with the rocket ship antenna? That's the decoy, Molly replied, leaning in until her goggles nearly touched the glass. It's filled with bubble wrap and a pizza coupon. Parents use that to test greed levels. Grab that, and the smoke detector triggers. Stick to the small loot. Bill coiled the bungee with practiced ease, his eyes flicking to the tablet. Operation Sock Dodge is a go. On your signal, I roll the distraction drone. The kids will swarm the robotic unicorn, creating a vacuum at the table. We slip in, grab the plushies, and vanish before the piñata makes its first kill. Molly offered a rare, toothy grin, testing the tension on her grapple line. A small hand smudged the glass from the inside, drawing a circle in the condensation. For the glitz, Bill. And may the odds never favor the socks. 
For the glitz, Bill echoed, checking the edge of his novelty scissors.
created_at
"2026-04-27T09:34:43.246Z"
4/27/2026, 9:34:43 AM
model
"qwen/qwen3.6-27b-20260422"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
784
moderation_latency
(null)
generation_time
147549
tokens_prompt
77
tokens_completion
8300
native_tokens_prompt
76
native_tokens_completion
7756
native_tokens_completion_images
(null)
native_tokens_reasoning
6859
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0279672
router
(null)
provider_responses
0
endpoint_id
"82097366-3a73-4790-ac36-f96ba9f6f577"
id
"chatcmpl-b36e2000-9787-9b93-9b41-0559650df26a"
is_byok
false
latency
784
model_permaslug
"qwen/qwen3.6-27b-20260422"
provider_name
"Alibaba"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777282483-PBAsqpyRyij25vJ6Lw6f"
session_id
(null)
api_type
"completions"
id
"gen-1777282483-VssP33liRsYtlLeozNRY"
upstream_id
"chatcmpl-b36e2000-9787-9b93-9b41-0559650df26a"
total_cost
0.0279672
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
response_cache_source_id
(null)
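The metadata fields above can be cross-checked against the run header with a little arithmetic. The sketch below is illustrative only and rests on two assumptions that the page does not state: that `generation_time` is in milliseconds, and that `native_tokens_reasoning` is already included in `native_tokens_completion`. The `summarize` helper is a hypothetical name, not part of NC Bench.

```python
# Values copied from the run metadata above.
GENERATION_TIME_MS = 147549       # assumed to be milliseconds
NATIVE_TOKENS_COMPLETION = 7756
NATIVE_TOKENS_REASONING = 6859    # assumed to be part of the completion count


def summarize(generation_time_ms, completion_tokens, reasoning_tokens):
    """Derive a few convenience figures from the raw metadata fields."""
    seconds = generation_time_ms / 1000
    minutes, rem = divmod(round(seconds), 60)
    return {
        "duration": f"{minutes}m {rem}s",
        # Tokens left over for the visible answer once reasoning is subtracted.
        "visible_tokens": completion_tokens - reasoning_tokens,
        "reasoning_share": reasoning_tokens / completion_tokens,
        "tokens_per_second": completion_tokens / seconds,
    }


summary = summarize(
    GENERATION_TIME_MS, NATIVE_TOKENS_COMPLETION, NATIVE_TOKENS_REASONING
)
for key, value in summary.items():
    print(key, value)
```

Under these assumptions the derived duration agrees with the 2m 28s shown in the run details, and most of the completion tokens were spent on reasoning rather than on the visible scene.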
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
neededClean: false
words: 675

Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 0.00%, Deviation: 50.00%
neededClean: false
wordsTotal: 682
wordsDialogue: 0

Overall: 0.0000%
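The reported ratio and deviation can be reproduced with a simple quote-based counter. This is a minimal sketch, not the evaluator's actual code: it assumes dialogue is detected as text inside straight or curly double quotation marks, and that the deviation is the absolute distance from the 50% target. The names `dialogue_ratio` and `deviation` are hypothetical. Under that assumption, a response written without any quotation marks, like the one above, counts zero dialogue words.

```python
import re

# Matches text inside straight ("...") or curly (“...”) double quotes.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def dialogue_ratio(text):
    """Return (dialogue_words, total_words, ratio) using whitespace word counts."""
    total = len(text.split())
    dialogue = sum(len((a or b).split()) for a, b in QUOTED.findall(text))
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation(ratio, target=0.5):
    """Absolute distance between the measured ratio and the requested one."""
    return abs(ratio - target)


quoted = 'Molly said, "Check your six, B." Bill nodded.'
unquoted = "Check your six, B, Molly hissed."

print(dialogue_ratio(quoted))
print(dialogue_ratio(unquoted))
print(deviation(dialogue_ratio(unquoted)[2]))
```

A whitespace split is only one way to count words; the two evaluators above report 675 and 682 words for the same response, so their tokenization evidently differs from each other and possibly from this sketch.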