NC Bench
Run: 2026-04-27-01KQ78CFNZEKQ9W4GG554FA54A
Run Details
Model: Qwen 3.6 Flash
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 54.4s
Cost: $0.015720
Provider: Alibaba
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
39.50% | Dialogue to Total Word Ratio | (details)
19.7483% | Overall
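The Overall figure is consistent with an unweighted mean of the two evaluator scores. That is an inference from the numbers on this page, not a documented NC Bench formula, and the function name `overall_score` below is hypothetical. A minimal sketch under that assumption:

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores in percent (assumed aggregation)."""
    if not scores:
        raise ValueError("need at least one evaluator score")
    return sum(scores) / len(scores)


# Evaluator scores as displayed for this run (already rounded by the page).
displayed_scores = [0.00, 39.50]
print(f"{overall_score(displayed_scores):.2f}%")  # prints 19.75%
```

The page reports 19.7483% rather than 19.75%. Under the same assumption, that would correspond to an unrounded ratio score of about 39.4966%, which the page rounds to 39.50%.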
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her utility belt, loaded with tools, a grappling hook, and a half-eaten bag of gummy worms. Bill, wearing a teal balaclava, peered over the fence. "Target acquired," Bill whispered. "The cake is uncut. Molly, this is the heist of the century." "Focus," Molly hissed. "Four minutes until the host cuts the sponge. That's our window. Did you memorize the guard rotation?" "Guard rotation? I can infiltrate that yard like a ninja wearing a clown nose. The acoustics are fine." Bill lowered his binoculars and winked. "Let's be honest, most people would call us villains. "Most people don't have a debt hole big enough to swallow our rent," Molly recited, priming a smoke canister disguised as a juice box. "The gifts are just gathering dust anyway. We're recycling." "Aggressively," Bill echoed, checking his velcro gloves. "Distraction confirmed? If Henderson asks about a man in a tutu behind the hedge, I need a cover story." "Tell him you're a retired aerobics instructor demonstrating senior hula-hooping," Molly said. "Move. The balloon vendor is drifting too close. If we float away on a pony, I'm leaving you." "The dog likes me," Bill insisted. Bill scrambled over the fence, his shoes quacking like a flock of ducks. He landed near the patio, scrambling toward a mountain of presents. "Stealth mode active," he hissed. "Lego sets. Dollbones. Socks. I love socks." Molly dropped onto the dewy grass, checking her watch. "High value first. Electronics. Bikes. Strip the frame and leave the toys behind. We need delayed detection. If they open a box and see it's empty, we blow cover." Bill began stuffing gift bags into his duffel. "What about the big box? The one with the dinosaur wrapping?" "Leave it," Molly said, yanking a remote-control car from a stack. "Henderson is turning. Chop." "Running!" Bill shouted, clutching a bicycle under one arm and a plush unicorn under the other. "We're rich, Molly! We're buying an island of chocolate milk!" "We're buying freedom!" 
Molly yelled, shoving the RC car into her vest. "Get in the van before the clown notices!"
created_at
"2026-04-27T10:39:57.641Z"
4/27/2026, 10:39:57 AM
model
"qwen/qwen3.6-flash"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
619
moderation_latency
(null)
generation_time
54181
tokens_prompt
77
tokens_completion
10028
native_tokens_prompt
76
native_tokens_completion
10467
native_tokens_completion_images
(null)
native_tokens_reasoning
9957
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0157195
router
(null)
provider_responses
0
endpoint_id
"2435bec9-c808-4f27-9fc8-12f3460d7b66"
id
"chatcmpl-ae81b0d9-f0f8-9d98-9044-2d2b65fee1b4"
is_byok
false
latency
619
model_permaslug
"qwen/qwen3.6-flash"
provider_name
"Alibaba"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777286397-BieF2xfOqdjGY2htjIn3"
session_id
(null)
api_type
"completions"
id
"gen-1777286397-jkNd2SBQLzmtNt2msgLJ"
upstream_id
"chatcmpl-ae81b0d9-f0f8-9d98-9044-2d2b65fee1b4"
total_cost
0.0157195
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
response_cache_source_id
(null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 345
39.50% | Dialogue to Total Word Ratio | Ratio: 54.31%, Deviation: 4.31% | neededClean: false, wordsTotal: 348, wordsDialogue: 189
19.7483%
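The reported ratio and deviation can be reproduced from the word counts in the meta data. The sketch below assumes the deviation is the absolute distance from the 50% target named in the scenario. The function names are hypothetical, and the page does not show how the deviation is mapped to the 39.50% score, so that step is not reproduced here.

```python
TARGET_RATIO = 0.50  # the scenario asks for 50% dialogue


def dialogue_ratio(words_dialogue, words_total):
    """Share of all words that fall inside dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation_from_target(ratio, target=TARGET_RATIO):
    """Absolute distance between the measured ratio and the target (assumed)."""
    return abs(ratio - target)


# Counts taken from the evaluation meta data of this run.
ratio = dialogue_ratio(words_dialogue=189, words_total=348)
deviation = deviation_from_target(ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints: Ratio: 54.31%, Deviation: 4.31%
```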