NC Bench
Run: 2026-04-27-01KQ7398KZYPTYYG0JB877PDWY
Run Details
Model: Qwen 3.6 27B
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 2m 15s
Cost: $0.034177
Provider: Alibaba
Evaluation summary
Matches word count: 43.38%
Dialogue to Total Word Ratio: 0.00%
Overall: 21.6892%
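The overall figure appears to be the unweighted mean of the two evaluator scores. That is an inference from the numbers shown, not something the page states, and the name `overall_score` is hypothetical. A minimal sketch of the arithmetic under that assumption:

```python
# Scores copied from the evaluation summary, in percent as displayed.
evaluator_scores = {
    "Matches word count": 43.38,
    "Dialogue to Total Word Ratio": 0.00,
}


def overall_score(scores):
    """Unweighted mean of the evaluator scores (assumed aggregation rule)."""
    return sum(scores.values()) / len(scores)


overall = overall_score(evaluator_scores)
print(f"{overall:.2f}%")
```

The result agrees with the reported 21.6892% to two decimal places. The extra digits on the page suggest the word count score is stored with more precision than the 43.38% that is displayed.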
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
The rain drummed against the grimy window of the cramped apartment. Molly paced the warped floorboards, snapping a plastic party hat until it broke. Bill sat on a milk crate, sharpening a butter knife on denim, watching her nervous rhythm with mild amusement. A crumpled flyer lay between them, stained with coffee and marked in red ink. This is the big one. Molly leaned over the crate. We are talking forty wrapped boxes. Minimum. Some bigger than my head. Do you know what retail value a tricycle holds in the black market. Think about it. Goldfish crackers alone fund our retirement. Bill did not look up. I counted three gift bags and a pile under the dining table. The parents hired a clown. Security will be distracted by slapstick. The apartment smelled of stale takeout and nervous sweat. A single hanging bulb flickered overhead, against peeling yellow wallpaper. Slapstick is unpredictable. Molly tapped the flyer. What if the clown notices us loading the toy chest into the van. Those balloons have surprising tensile strength. Bill sheathed the blade. We dress as early birthday guests. I found oversized polka dot suspenders in a thrift bin. You wear the glitter bow. Nobody questions a celebrating aunt. No one suspects a woman holding a juice box. Bill wiped a smudge from his glasses. He had planned this route for weeks. He knew how the clown would juggle before attempting the mandatory cup routine. I am not wearing glitter. Molly scattered confetti across the linoleum. This is a precision operation. We need shadows. We need silence. We extract the loot before cake. Kids around frosting are like caffeinated squirrels. Bill pointed at a floor plan taped to the wall. The layout is perfect. Front door guarded by a mother. Side garage open for deliveries. We pose as balloon delivery. Roll in with helium tanks. Slip under the table. Slide presents into the casing. It works. Just like the bakery job, only with more cake. It sounds like a cartoon. Molly crossed her arms. 
We are professionals. Professionals do not wear suspenders. Professionals use lock picks and thermal paste and calibrated distraction devices. Bill slid a velvet pouch across the crate. Inside lay novelty glasses and a drooping fake mustache. The device is a whoopee cushion rigged into a high chair. When the clown sits, the noise triggers panic. Parents rush the commotion. We grab the presents. We leave out the sliding door. We get paid. Zero risk. Mostly. Molly poked the fake mustache. It smelled like mothballs. Fine. But if anyone asks, we were never here. If the clown jingles, I am running fast. I will not die for a rubber duck army. Bill rolled duct tape around his wrists. Deal. Keep your elbows tucked. Do not unwrap the piñata. That is non negotiable. Molly checked her cracked phone screen. Three hours. Let us go rob kindergarteners.
created_at: "2026-04-27T09:10:49.223Z" (4/27/2026, 9:10:49 AM)
model: "qwen/qwen3.6-27b-20260422"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 686
moderation_latency: (null)
generation_time: 135140
tokens_prompt: 77
tokens_completion: 10032
native_tokens_prompt: 76
native_tokens_completion: 9481
native_tokens_completion_images: (null)
native_tokens_reasoning: 8852
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0341772
router: (null)
provider_responses: 0
endpoint_id: "82097366-3a73-4790-ac36-f96ba9f6f577"
id: "chatcmpl-eccbf4f2-c3f4-9aa1-87f6-43d87ea0c637"
is_byok: false
latency: 686
model_permaslug: "qwen/qwen3.6-27b-20260422"
provider_name: "Alibaba"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777281049-10s4QzjJG7PG9NM1CdON"
session_id: (null)
api_type: "completions"
id: "gen-1777281049-jMNHubaXYtM8oyKHHpK6"
upstream_id: "chatcmpl-eccbf4f2-c3f4-9aa1-87f6-43d87ea0c637"
total_cost: 0.0341772
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
response_cache_source_id: (null)
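A few derived figures help when reading the metadata above. The sketch below assumes `generation_time` is in milliseconds, which is consistent with the 2m 15s duration in the run details. It also assumes throughput is measured over all native completion tokens, reasoning included. The helper `format_duration` is a hypothetical name for illustration.

```python
# Values copied from the run metadata.
meta = {
    "generation_time": 135140,        # assumed to be milliseconds
    "native_tokens_completion": 9481,
    "native_tokens_reasoning": 8852,
}


def format_duration(ms):
    """Render a millisecond count as 'Xm Ys', dropping the fractional second."""
    minutes, seconds = divmod(ms // 1000, 60)
    return f"{minutes}m {seconds}s"


duration = format_duration(meta["generation_time"])

# Share of completion tokens spent on reasoning rather than visible output.
reasoning_share = meta["native_tokens_reasoning"] / meta["native_tokens_completion"]

# Completion tokens left over for the visible response.
visible_tokens = meta["native_tokens_completion"] - meta["native_tokens_reasoning"]

# Completion tokens generated per second of generation time.
tokens_per_second = meta["native_tokens_completion"] / (meta["generation_time"] / 1000)

print(duration)
print(f"reasoning share: {reasoning_share:.1%}")
print(f"visible tokens: {visible_tokens}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumptions, roughly 93% of the completion tokens went to reasoning, leaving 629 tokens for the scene itself.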
Evaluation details
Matches word count
Result: 43.38%
Details: n/a
Meta Data: neededClean: false; words: 483
Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 0.00%, Deviation: 50.00%
Meta Data: neededClean: false; wordsTotal: 483; wordsDialogue: 0
Overall: 21.6892%
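The dialogue evaluator reports a ratio and its deviation from the 50% target. How it detects dialogue is not documented on this page. The sketch below assumes that only text inside double quotation marks counts as dialogue and that words are split on whitespace. That would explain a `wordsDialogue` of 0 for a response that contains no quotation marks. The function names are hypothetical.

```python
import re


def count_words(text):
    """Count whitespace-separated tokens."""
    return len(text.split())


def dialogue_ratio(text):
    """Fraction of words that fall inside double quotes (assumed dialogue rule)."""
    # Matches straight quotes and curly quotes; anything unquoted is narration.
    spoken = re.findall(r'["\u201c]([^"\u201d]*)["\u201d]', text)
    words_dialogue = sum(count_words(span) for span in spoken)
    words_total = count_words(text)
    return words_dialogue / words_total if words_total else 0.0


def deviation(ratio, target=0.5):
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


quoted = 'Molly paced. "This is the big one," she said.'
unquoted = "This is the big one. Molly leaned over the crate."

print(f"{dialogue_ratio(quoted):.2%}")
print(f"{dialogue_ratio(unquoted):.2%}, deviation {deviation(dialogue_ratio(unquoted)):.2%}")
```

With this rule, speech written without quotation marks scores a ratio of 0.00% and a deviation of 50.00%, the same values shown in the details above.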