NC Bench
Run: 2026-04-27-01KQ79M8E2DKSB21G0VP8CTFMC
Run Details
Model: Qwen3.6 Max Preview
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 3m 59s
Cost: $0.064067
Provider: Alibaba
Evaluation summary
| Score | Evaluator |
| --- | --- |
| 93.65% | Matches word count |
| 0.00% | Dialogue to Total Word Ratio |
| 46.8248% | Overall |
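The page does not state how the overall score is derived. The figures are consistent with an unweighted mean of the two evaluator scores, which is an assumption here and not a documented formula. A minimal sketch of that arithmetic, where `overall_score` is a hypothetical helper name:

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores, in percent.

    Assumption: the benchmark averages its evaluators equally.
    The page itself does not confirm this.
    """
    return sum(scores) / len(scores)


# Scores as displayed on the page (already rounded to two decimals).
word_count_score = 93.65
dialogue_ratio_score = 0.00

overall = overall_score([word_count_score, dialogue_ratio_score])
print(f"{overall:.1f}%")
```

The displayed overall of 46.8248% differs from the mean of the rounded inputs only in the fourth decimal place, which would be explained by the word count score being rounded for display.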
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The blueprints covered the kitchen table, anchored by a stale pretzel bag and a sticky juice box. Molly adjusted her glasses and tapped a red marker against the laminated backyard diagram. Bill paced behind her, wearing a black turtleneck that was far too warm for July while practicing his stealth walk on the linoleum. He looked less like a seasoned cat burglar and more like a man carefully avoiding scattered toys. We enter at sixteen hundred hours, Molly said, drawing a thick arrow toward the inflatable castle. The merchandise is clustered near the snack table. Surveillance confirms eighteen wrapped boxes and two large gift bags. Check your shoelaces before we depart. Bill stopped pacing and leaned over her shoulder, squinting at the paper. He traced the proposed route with a calloused finger, nodding slowly as if evaluating a bank vault instead of a suburban birthday gathering. Sweat beaded on his forehead, but he refused to remove the turtleneck. Professional standards mattered greatly. He adjusted his leather gloves and cracked his knuckles while studying the escape route through the side gate. What about the entertainer, Bill asked, dropping his voice to a dramatic whisper. Reports mention a professional clown with balloon swords and aggressive face paint. That variable concerns me greatly. I cannot work near oversized shoes. Molly capped her marker with a sharp click. She expected this hesitation. Bill always worried about the performers. Last week he panicked over a magician at a corporate picnic. She slid a second page from her folder, revealing a color-coded timeline. The laminated sheets gleamed under the fluorescent kitchen lights, representing months of dedicated reconnaissance and careful note-taking. She absolutely refused to let a birthday entertainer derail the mission. The clown rests every forty minutes, she replied, tapping a yellow rectangle marked hydration break. We move while he sleeps behind the recycling bins. He will be completely unaware. 
You watch the fence line. I grab the packages using the heavy canvas sacks. Bill rubbed his jaw and studied the window. A plastic tricycle sat crooked on the driveway. A garden sprinkler clicked against the wooden fence. It was ideal cover. Nobody expects experts working in daylight among paper plates and crepe paper. He faced the table again, jaw set with determination. The neighborhood remained blissfully quiet, completely unaware of the impending catastrophe. Mail carriers delivered bills. Birds sang in the oak trees. I claim the square box with the metallic ribbon, he declared, folding his arms. Do not argue with me on this matter. My professional pride demands it. We divide the restaurant vouchers evenly. Molly sighed and scribbled on her notepad. She checked her watch and loaded the equipment, which included headphones, wire cutters, and sour candies for bribes. The scheme was straightforward. The rewards would consist entirely of building blocks and bookstore vouchers. Still, as she shouldered her bag and offered Bill a nod, she felt that classic adrenaline surge. They were experts. They had a strategy. They were ready to devastate a second grader. The sedan engine coughed to life.
created_at: "2026-04-27T11:01:40.937Z" (4/27/2026, 11:01:40 AM)
model: "qwen/qwen3.6-max-preview-20260420"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1916
moderation_latency: (null)
generation_time: 238965
tokens_prompt: 77
tokens_completion: 8678
native_tokens_prompt: 76
native_tokens_completion: 8201
native_tokens_completion_images: (null)
native_tokens_reasoning: 7567
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0640666
router: (null)
provider_responses:
  0:
    endpoint_id: "ca7fc0c7-f23d-4ac5-b2e0-945be5b61787"
    id: "chatcmpl-deeaa79f-4982-904e-b912-ba6dc57a90b8"
    is_byok: false
    latency: 1916
    model_permaslug: "qwen/qwen3.6-max-preview-20260420"
    provider_name: "Alibaba"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777287700-WHaw6gPtTtfdQi6X6VoV"
session_id: (null)
api_type: "completions"
id: "gen-1777287700-rEwZDqcOr9CAmxhefSlH"
upstream_id: "chatcmpl-deeaa79f-4982-904e-b912-ba6dc57a90b8"
total_cost: 0.0640666
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
response_cache_source_id: (null)
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 93.65% | Matches word count | n/a | neededClean: false; words: 509 |
| 0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 30.00% | neededClean: false; wordsTotal: 511; wordsDialogue: 0 |
| 46.8248% | Overall | | |
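The evaluator reports wordsDialogue as 0 even though the scene contains spoken lines. One plausible explanation, stated here as an assumption, is that dialogue is detected by quotation marks, and the response contains none. The sketch below illustrates that kind of check. `dialogue_stats` and `TARGET_RATIO` are hypothetical names, and the tokenization (whitespace splitting) is also an assumption, so the counts need not match the evaluator's exactly.

```python
import re

TARGET_RATIO = 0.30  # the scenario asks for 30% dialogue

# Matches text inside straight double quotes or curly double quotes.
QUOTED = re.compile('"([^"]*)"|\u201c([^\u201d]*)\u201d')


def dialogue_stats(text, target=TARGET_RATIO):
    """Return (words_total, words_dialogue, ratio, deviation).

    Assumption: a word counts as dialogue only if it sits inside
    double quotation marks. This is an illustration, not the
    benchmark's actual evaluator.
    """
    words_total = len(text.split())
    words_dialogue = sum(
        len((straight or curly).split())
        for straight, curly in QUOTED.findall(text)
    )
    ratio = words_dialogue / words_total if words_total else 0.0
    deviation = abs(ratio - target)
    return words_total, words_dialogue, ratio, deviation


# The same sentence with and without quotation marks.
unquoted = "We enter at sixteen hundred hours, Molly said."
quoted = '"We enter at sixteen hundred hours," Molly said.'

print(dialogue_stats(unquoted))
print(dialogue_stats(quoted))
```

Under this assumption, a response written without quotation marks yields a ratio of 0 and a deviation equal to the full 30% target, which matches the "Ratio: 0.00%, Deviation: 30.00%" detail reported above.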