NC Bench
Run: 2026-04-27-01KQ79160A4N9Q4STD583AARC4
Run Details
Model: Qwen3.6 Max Preview
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 4m 37s
Cost: $0.081414
Provider: Alibaba
Evaluation summary
Score | Evaluator | Details Link
100.00% | Matches word count | (details)
82.13% | Dialogue to Total Word Ratio | (details)
91.0662% | Overall
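The Overall figure looks consistent with an unweighted mean of the two evaluator scores. That is an inference from the numbers shown here, not a documented rule, and the snippet below is only a sketch under that assumption. It also assumes that 82.13% is a rounded display of a slightly higher unrounded score, which would explain the small gap to 91.0662%.

```python
from statistics import mean

# Evaluator scores as displayed in the summary (rounded to two decimals).
scores = {
    "Matches word count": 100.00,
    "Dialogue to Total Word Ratio": 82.13,
}

# Assumption: the overall score is a plain, unweighted average.
overall = mean(list(scores.values()))

reported_overall = 91.0662
print(f"computed from rounded scores: {overall:.4f}%")
print(f"reported on the page:         {reported_overall:.4f}%")
print(f"difference:                   {abs(overall - reported_overall):.4f}")
```

The difference is on the order of a thousandth of a percentage point, which is what rounding one input to two decimals would produce.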
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The blueprints covered the kitchen table, pinned by a pickle jar and a ceramic mug declaring World’s Okayest Criminal. Molly dragged a yellow highlighter across the dotted perimeter surrounding the inflatable castle. Bill adjusted his glasses and studied the crayon schematic. They had cracked vaults and hijacked armored transports, yet this operation demanded unprecedented discipline. The location was a fenced suburban yard. The payload consisted of twenty-three wrapped boxes, one unicorn pinata, and a plastic cooler stacked with apple juice. Security involved three tired chaperones and a golden retriever vulnerable to dried liver treats. Rain drummed against the windowpane while the coffee maker gurgled its final bitter drops. A city map hung crookedly on the wallpaper, marked with red string and ambition. “We breach during musical chairs,” Molly said, tapping the paper. “Maximum distraction hits at three forty-five. That is our window.” “What about the entertainer?” Bill asked, leaning closer. “Clowns are wildly unpredictable. I lost an eyebrow to a poodle balloon in ninety-eight.” “He stays anchored to the snack table until the cake appears. We use the plastic slide for extraction. You grab the boxes. I will neutralize the pinata.” “Neutralize it how?” “With a butter knife and absolute focus. We cannot allow a sugar avalanche to compromise our escape route.” “Understood. Are the cover identities ready?” “Clipboard volunteers. Lanyards, walkie-talkies, and deep generational sighs. You will vanish into the crowd.” “I still want to inspect the bounce structure first. Load distribution matters.” “It is pressurized vinyl, Bill. Just remove your sneakers before entering the loot zone. We maintain standards.” “What happens if a child spots us?” “Hand them a foil star and claim it is an official mission. Kids obey shiny bureaucracy.” “And the hound?” “Buster surrenders for cheddar. I packed the good blocks. 
We strike fast, we stay silent, and we vanish before the party favors drop.” “Do we need a decoy?” Bill asked. “The magician handles that. He pulls rabbits from hats. We pull electronics from wrapping paper.” “What about the parents?” “They will be arguing about sunscreen and gluten. Perfect cover.” “I feel prepared.” “Good. Check your zip ties.” “Checked.” “Sync your watch.” “Synced.” “Remember the rule.” “No tears, no tantrums, no leaving fingerprints on the frosting.” Bill folded the diagram and slid it into his jacket pocket. The refrigerator motor clicked off, leaving only the steady tick of the wall clock. Molly capped her marker and pushed the pickle jar toward the sink. They had spent weeks tracking delivery trucks, memorizing nap schedules, and practicing quiet footsteps on hardwood floors. Now the moment arrived. The quiet street waited beyond the blinds. Two veteran operators prepared to infiltrate a backyard fortress of streamers and paper plates. They adjusted their lanyards in perfect synchronization. The operation would launch at three thirty. There was absolutely zero room for mistakes. Only clean hands and a full trunk. Rain tapped the window while distant traffic hummed. Steam curled upward. Black gloves waited patiently on the counter. Time marched forward.
created_at: "2026-04-27T10:51:15.857Z" (4/27/2026, 10:51:15 AM)
model: "qwen/qwen3.6-max-preview-20260420"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2672
moderation_latency: (null)
generation_time: 276705
tokens_prompt: 77
tokens_completion: 9415
native_tokens_prompt: 76
native_tokens_completion: 10425
native_tokens_completion_images: (null)
native_tokens_reasoning: 9743
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0814138
router: (null)
provider_responses
0
endpoint_id: "ca7fc0c7-f23d-4ac5-b2e0-945be5b61787"
id: "chatcmpl-ab5a36ff-0f54-9201-8845-c6424506d2e6"
is_byok: false
latency: 2672
model_permaslug: "qwen/qwen3.6-max-preview-20260420"
provider_name: "Alibaba"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777287075-SsohJzhq9TB1xb1HgcgY"
session_id: (null)
api_type: "completions"
id: "gen-1777287075-uNKSPwC3bAcyrChTYZ2Q"
upstream_id: "chatcmpl-ab5a36ff-0f54-9201-8845-c6424506d2e6"
total_cost: 0.0814138
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
response_cache_source_id: (null)
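The metadata above can be related back to the run details with a little arithmetic. The sketch below assumes that generation_time is in milliseconds and that the page's Duration is that value rounded to whole seconds; neither is stated on the page. The helper name format_duration is hypothetical.

```python
def format_duration(ms: int) -> str:
    """Render a millisecond count as 'Xm Ys', rounding to whole seconds."""
    total_seconds = round(ms / 1000)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


# Values copied from the metadata above.
generation_time_ms = 276705
native_tokens_completion = 10425
native_tokens_reasoning = 9743

duration = format_duration(generation_time_ms)

# Share of completion tokens spent on reasoning rather than visible output.
reasoning_share = native_tokens_reasoning / native_tokens_completion
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Rough throughput over the whole generation (assumes milliseconds).
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(duration)
print(f"reasoning share: {reasoning_share:.1%}")
print(f"visible completion tokens: {visible_tokens}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumptions the formatted duration matches the 4m 37s shown in the run details, and most of the completion tokens are reasoning tokens rather than the scene itself.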
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false; words: 500
82.13% | Dialogue to Total Word Ratio | Ratio: 48.02%, Deviation: 1.98% | neededClean: false; wordsTotal: 504; wordsDialogue: 242
91.0662%
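The reported Ratio and Deviation can be reproduced from the wordsDialogue and wordsTotal counts. The sketch below also shows one way such counts could be obtained, assuming that words are whitespace-separated tokens and that dialogue is whatever sits between curly double quotes. Those counting rules are assumptions (the two evaluators report 500 and 504 words for the same text, so the real rules evidently differ between them), and the function names are hypothetical. How the 82.13% score is derived from the deviation is not shown on the page and is not modeled here.

```python
import re


def count_dialogue(text: str) -> tuple[int, int]:
    """Return (total_words, dialogue_words) under simple assumed rules."""
    # Assumption: a word is any whitespace-separated token.
    total_words = len(text.split())
    # Assumption: dialogue is the text between curly double quotes.
    spoken_spans = re.findall(r"“([^”]*)”", text)
    dialogue_words = sum(len(span.split()) for span in spoken_spans)
    return total_words, dialogue_words


def ratio_and_deviation(dialogue: int, total: int, target: float = 0.5):
    """Dialogue share of all words and its absolute distance from the target."""
    ratio = dialogue / total
    return ratio, abs(ratio - target)


# Counts reported by the evaluator for this run.
ratio, deviation = ratio_and_deviation(dialogue=242, total=504)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Tiny usage example for the counter on a made-up sentence.
sample = "“Go now,” she said. “Fine.”"
print(count_dialogue(sample))
```

With the reported counts this prints the same ratio and deviation as the Details column, which suggests the deviation is measured in percentage points from the 50% target.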