NC Bench
Run: 2026-04-27-01KQ77DG5HB1AZXCQQG91T3D2F
Run Details
Model: Qwen3.6 Max Preview
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 5m 11s
Cost: $0.078192
Provider: Alibaba
Evaluation summary
Score      Evaluator
81.27%     Matches word count
0.00%      Dialogue to Total Word Ratio
40.6364%   Overall
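The overall figure is consistent with an unweighted mean of the two evaluator scores. The sketch below checks that arithmetic; the averaging rule is an assumption inferred from the displayed numbers, not something the page states.

```python
from statistics import mean

# Per-evaluator scores as displayed in the summary, in percent.
scores = {
    "Matches word count": 81.27,
    "Dialogue to Total Word Ratio": 0.00,
}

# Assumption: the overall score is the unweighted mean of the evaluator scores.
overall = mean(list(scores.values()))
print(f"{overall:.4f}%")  # close to, but not exactly, the displayed 40.6364%
```

The result differs from the displayed 40.6364% in the third decimal place, which would be consistent with the word-count score being stored at higher precision than the two decimals shown.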
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The laminated floor plan resembled a tactical nightmare illustrated with glitter glue. Molly flattened the paper across the dented hood, tracing a path past the lemonade station with a grease pencil. Bill adjusted his tactical vest, stuffed with bubble wrap and spare zip ties, and studied the diagram through smudged binoculars. The objective was a sixth birthday celebration for a child named Timothy, whose grandparents possessed expensive toys and zero perimeter security. Molly circled a towering stack of wrapped boxes near the sliding doors. The payload was substantial. The operational risks were deeply humiliating. She checked her watch. Bill tapped the bouncy house. We breach through the inflatable structure. I will trigger a deflation on the left flank. You bypass the entertainer and secure the packages. Molly rubbed her forehead. The evening breeze carried the sound of screaming children and a speaker playing nursery rhymes. She watched a cat stalk a cupcake wrapper near the fence. You actually want to attack a bouncy castle. Bill adjusted his gloves with theatrical precision. I want to neutralize a structural hazard. The clown remains the primary threat. Those shoes enable silent maneuvering. Molly stared at the diagram. The house featured pastel shutters and a frog welcome mat. He is twisting balloon dogs, Bill. Not loading tactical weaponry. Bill ignored her skepticism and opened the trunk. Misdirection works regardless. I will wear the badger costume. Children adore mammals. They will surround me. You extract the loot while I am trapped. Molly pinched the bridge of her nose. The badger outfit reeked of cedar chips and past failures. Six year olds do not swarm. They bite and kick. Bill tossed a duffel bag onto the pavement. I installed kevlar lining. Focus on the corridor. The wagon sits behind the compost. We load the merchandise and disappear. Molly examined the heavy canvas bag. It contained thick rope, a flashlight, and three pouches of fruit snacks. 
What about the adults guarding the patio doors. Bill checked his reflection in the side mirror. Adults collapse under social pressure. I mailed fake invitations to a fence meeting. They will argue for an hour. Molly zipped the bag and listened to distant laughter. Streetlights flickered on, casting long quiet shadows across the driveway. You actually weaponized local bureaucracy. Bill nodded solemnly and handed her a pair of driving gloves. I weaponized tedium. Check the ignition. The hunt begins soon. Molly slid behind the wheel and turned the key. The engine sputtered before catching with a ragged roar. She watched Bill pull the badger mask over his head, transforming a seasoned felon into a deranged forest creature. The cul-de-sac remained still, lined with parked sedans and discarded tricycles. Beyond the hedgerow, children fought over plastic chairs. Molly shifted into gear and eased forward. The operation was active. The marks were elementary students. The bounty featured cartoon dinosaurs. The escape vehicle required new brakes. Nothing about this felt dignified, but the landlord demanded payment by Friday. Crime rarely offered glamour, yet it delivered results. She pressed the accelerator and rolled toward the inflatable perimeter.
created_at: "2026-04-27T10:23:02.327Z" (4/27/2026, 10:23:02 AM)
model: "qwen/qwen3.6-max-preview-20260420"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1195
moderation_latency: (null)
generation_time: 311222
tokens_prompt: 77
tokens_completion: 9640
native_tokens_prompt: 76
native_tokens_completion: 10012
native_tokens_completion_images: (null)
native_tokens_reasoning: 9338
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0781924
router: (null)
provider_responses:
  0:
    endpoint_id: "ca7fc0c7-f23d-4ac5-b2e0-945be5b61787"
    id: "chatcmpl-4f0a4c52-9e3d-9cfb-8204-f78e45b6eb75"
    is_byok: false
    latency: 1195
    model_permaslug: "qwen/qwen3.6-max-preview-20260420"
    provider_name: "Alibaba"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777285382-IIGFAPZUfVzQzxVNvhvS"
session_id: (null)
api_type: "completions"
id: "gen-1777285382-GFLnmuXxPLAiD34C6pKt"
upstream_id: "chatcmpl-4f0a4c52-9e3d-9cfb-8204-f78e45b6eb75"
total_cost: 0.0781924
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
response_cache_source_id: (null)
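A few derived figures make the metadata easier to read. The sketch below assumes that generation_time is in milliseconds (311222 ms matches the reported duration of 5m 11s) and that native_tokens_completion includes the reasoning tokens; neither unit nor inclusion is stated on the page.

```python
# Values copied from the run metadata.
generation_time_ms = 311_222
native_tokens_completion = 10_012
native_tokens_reasoning = 9_338

# Assumption: generation_time is in milliseconds.
minutes, seconds = divmod(generation_time_ms // 1000, 60)
duration = f"{minutes}m {seconds}s"

# Assumption: completion tokens include the reasoning tokens, so the
# difference approximates the tokens spent on the visible scene.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(duration)        # 5m 11s
print(visible_tokens)  # 674
print(f"{reasoning_share:.1%} of completion tokens were reasoning")
print(f"{tokens_per_second:.1f} tokens/s")
```

Under these assumptions, most of the completion budget went to reasoning rather than to the roughly 500-word scene itself.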
Evaluation details
Result     Evaluator                      Details                           Meta Data
81.27%     Matches word count             n/a                               neededClean: false, words: 512
0.00%      Dialogue to Total Word Ratio   Ratio: 0.00%, Deviation: 30.00%   neededClean: false, wordsTotal: 514, wordsDialogue: 0
40.6364%   Overall
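The model response contains spoken lines but no quotation marks, and the evaluator reports wordsDialogue as 0. That outcome is what a quote-based dialogue counter would produce. The sketch below is a hypothetical counter of that kind, not the benchmark's actual evaluator: the function name dialogue_ratio and the rule that dialogue is whatever sits between double quotes are assumptions made for illustration.

```python
import re

def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for quote-delimited dialogue."""
    total_words = len(text.split())
    # Assumption: dialogue is any span between straight or curly double quotes.
    quoted_spans = re.findall(r'["“]([^"”]*)["”]', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio

# Two sentences from the response, first as written, then with quotes added.
unquoted = "Bill tapped the bouncy house. We breach through the inflatable structure."
quoted = 'Bill tapped the bouncy house. "We breach through the inflatable structure."'

print(dialogue_ratio(unquoted))  # no quoted spans, so zero dialogue words
print(dialogue_ratio(quoted))    # the quoted sentence now counts as dialogue
```

With a counter like this, unquoted speech scores 0% regardless of how much of the scene is conversation, which would explain the 30.00% deviation from the 30% target.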