NC Bench
Run: 2026-04-27-01KQ76VJSHNA20030TY2769NRW
Run Details
Model: Qwen3.6 Max Preview
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 4m 11s
Cost: $0.068871
Provider: Alibaba
Evaluation summary
Score | Evaluator
6.09% | Matches word count
0.00% | Dialogue to Total Word Ratio
3.0453% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The blueprint was spread across the damp kitchen table, anchored by a jar of pickles and a stress ball shaped like a grenade. Molly traced a red marker along the dotted line leading to the backyard patio. Bill adjusted his reading glasses and squinted at the crayon-drawn floor plan. He tapped a finger against the sketch of a bouncy castle. The paper was stained with juice and glitter. It smelled faintly of vanilla frosting and poor security. Molly clicked her pen and waited for him to speak. The refrigerator hummed. A distant siren wailed. Bill sighed and pushed his glasses up his nose. He looked tired. He looked like a man who had spent too many years rappelling down elevator shafts and not enough time sleeping. This job was beneath them. This job was also paying rent. Target window opens at fourteen hundred, Molly said. The birthday boy unlocks the gate for the clown. That is our entry point. The clown, Bill said. You mean the balloon specialist. I mean the distraction. Focus. The presents are staged in the sunroom. Wrapped, tagged, and completely unguarded. Except for the mothers. Mothers are compromised by potato salad and gossip. They will be deaf and blind. What about the dog? A corgi named Mr. Waffles. Neutralized with premium cheddar. And the children? Chaos engines. We strike during the piñata breach. While they swing at the donkey, we extract the goods. You mean stuffed animals and building bricks. I mean untraceable assets. Do you know what a vintage action figure fetches online? We used to crack bank vaults, Molly. We used to sleep in drainage pipes. This is low risk and zero lasers. Just crepe paper. The static electricity worries me. It could fry the earpieces. Then do not bounce. You handle the gift table. I will neutralize the magician. There is a magician? Amateur. I will swap his dove for a smoke pellet. Classic misdirection. And the exit route? Service gate behind the azaleas. Van idling. We vanish before the cake song. 
What about the piñata stick? Bill asked. Blunt force trauma. We avoid it. And the goodie bags? Leave them. We are professionals, not monsters. Fine. I will bring the extra duffel. Bring two. LEGO sets are heavier than gold bars. Check the zip ties. Shave your face. We move at dawn. Bill rubbed his temples. The marker had bled through the paper, leaving a crimson stain over a drawing of a lopsided cake. He pictured himself in a black turtleneck, crawling past discarded juice boxes, hoisting a canvas sack of board games over his shoulder like a seasonal burglar. It was undignified. It was reckless. It was also the only contract they had seen in eight months. Molly rolled the schematic and tucked it under her arm. She checked her watch. The second hand swept forward with ruthless efficiency. Bill stared at the pickle jar. He wondered if the corgi preferred sharp or mild cheese. He supposed he would discover the answer tomorrow. He stood up and stretched his aching back. The floorboards creaked beneath his boots. He grabbed his coat from the hook.
created_at: "2026-04-27T10:13:15.195Z" (4/27/2026, 10:13:15 AM)
model: "qwen/qwen3.6-max-preview-20260420"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1136
moderation_latency: (null)
generation_time: 250642
tokens_prompt: 77
tokens_completion: 8212
native_tokens_prompt: 76
native_tokens_completion: 8817
native_tokens_completion_images: (null)
native_tokens_reasoning: 8110
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0688714
router: (null)
provider_responses
0
endpoint_id: "ca7fc0c7-f23d-4ac5-b2e0-945be5b61787"
id: "chatcmpl-0e3b61e0-18ae-944c-bb6e-f632b9519b67"
is_byok: false
latency: 1136
model_permaslug: "qwen/qwen3.6-max-preview-20260420"
provider_name: "Alibaba"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777284795-GC3R31sqK2Y138s3fJtx"
session_id: (null)
api_type: "completions"
id: "gen-1777284795-4xHRaS17IwuCdYlOO6Fw"
upstream_id: "chatcmpl-0e3b61e0-18ae-944c-bb6e-f632b9519b67"
total_cost: 0.0688714
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
response_cache_source_id: (null)
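The token and timing fields above can be combined into a few derived figures. The sketch below is illustrative only and rests on two assumptions the page does not state: that generation_time is in milliseconds (250642 ms is about 4m 11s, which agrees with the reported Duration), and that native_tokens_completion includes the reasoning tokens.

```python
# Values copied from the run metadata above.
meta = {
    "generation_time": 250642,
    "native_tokens_completion": 8817,
    "native_tokens_reasoning": 8110,
}

# Assumption: generation_time is reported in milliseconds.
generation_seconds = meta["generation_time"] / 1000

# Assumption: native_tokens_completion already includes reasoning tokens,
# so the remainder is the visible scene text.
visible_tokens = meta["native_tokens_completion"] - meta["native_tokens_reasoning"]
reasoning_share = meta["native_tokens_reasoning"] / meta["native_tokens_completion"]

# Overall generation throughput, reasoning tokens included.
tokens_per_second = meta["native_tokens_completion"] / generation_seconds

print(f"generation time: {generation_seconds:.1f} s")
print(f"visible tokens:  {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput:      {tokens_per_second:.1f} tokens/s")
```

Under those assumptions, roughly nine in ten completion tokens went to reasoning rather than to the scene itself.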
Evaluation details
Result | Evaluator | Details | Meta Data
6.09% | Matches word count | n/a | neededClean: false; words: 523
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 50.00% | neededClean: false; wordsTotal: 524; wordsDialogue: 0
3.0453% | Overall
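The wordsDialogue value of 0 is what a quote-based counter would report for this response, since the model wrote every spoken line without quotation marks. The sketch below is not the NC Bench evaluator, whose implementation is not shown on this page. It assumes, hypothetically, that dialogue is defined as text enclosed in straight or curly double quotes, and the names `QUOTED`, `count_words`, and `dialogue_ratio` are illustrative. It also checks that the Overall score is consistent with an unweighted mean of the two evaluator scores, which is likewise an assumption about the aggregation rule.

```python
import re
from statistics import mean

# Assumption: dialogue means text inside straight ("...") or curly (“...”) double quotes.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_ratio(text: str) -> float:
    """Return the fraction of words that fall inside double quotes."""
    total = count_words(text)
    if total == 0:
        return 0.0
    dialogue = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    return dialogue / total


# A line from the response as written (no quotation marks) and a quoted variant.
unquoted = "Target window opens at fourteen hundred, Molly said."
quoted = '"Target window opens at fourteen hundred," Molly said.'

print(dialogue_ratio(unquoted))  # no quoted spans are found
print(dialogue_ratio(quoted))    # 6 of 8 words are inside quotes

# Assumption: Overall is the unweighted mean of the two displayed scores.
overall = mean([6.09, 0.00])
print(f"{overall:.3f}%")
```

With a counter like this, the same sentence scores differently depending only on punctuation, which would explain a 0.00% ratio for a scene that is largely spoken lines.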