NC Bench
Run: 2026-04-27-01KQ78HYKNH7806NW151BNYS55
Run Details
Model: Qwen3.6 Max Preview
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 5m 35s
Cost: $0.094947
Provider: Alibaba
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly smoothed the crayon sketch across the dented hood of the sedan. Bill adjusted his black turtleneck and checked his tactical wristwatch. The afternoon sun glared off the polished metal. A warm breeze carried the distant sound of kazoos and screaming toddlers. Molly circled the bouncy castle zone with a red pen. The ink bled slightly into the cheap paper. A stray balloon drifted past the driveway. Molly crushed a leftover juice box under her boot. Walk me through the perimeter again, Molly said. Chain link fence, four feet high, topped with plastic birthday flags, Bill replied. I can clear it if I avoid the balloon arch. Do not trip on the arch. We are seasoned professionals. What about the guard dog? Golden retriever named Biscuit. I brought premium beef jerky. He will be completely neutralized before we reach the patio stones. Excellent. Primary entry point? Sliding glass door. Left unlocked because Brenda trusts humanity. I slip inside, bypass the inflatable castle, and hit the gift table. The table is the main objective. How many wrapped packages? Twenty-three boxes, six tissue bags, and one heavy cardboard tube that I sincerely pray holds a telescope. Weight distribution matters. Bring the reinforced duffel. What about civilian presence? Twelve children, four parents on wine duty, and a stage magician who is definitely Dave from accounting wearing a cheap velvet cape. Dave cannot stop us. What is the extraction window? Cake time. When they sing, every adult holds a lighter and every kid stares at frosting. We get exactly ninety seconds. Ninety seconds to clear the table, dodge the piñata zone, and retreat. Tight. What about the noise factor? Bubble wrap lines the duffel. Silent as fresh snowfall. Also, I purchased tap shoes. Why would you purchase tap shoes? The invitation requested festive attire. I am attending as a theatrical cat burglar. It provides social cover. It provides ankle fractures. Switch to rubber sneakers. 
We are stealing wrapped toys, not performing on Broadway. Fine. Sneakers. But I keep the jazz hands for the final sprint. Keep your hands on the zipper. What happens if Timmy wakes early? I deploy the decoy. Wind-up chattering teeth. Toddlers love mechanical distractions. They investigate the plastic while we vanish. Acceptable. One last major obstacle. The piñata. What about the paper dragon? It hangs directly over our exit route. One stray swing and we drown in hard candy and sticky plastic rings. I will slice the cord during extraction. Controlled drop. We leave a sugar trail like seasoned professionals. Perfect. Pack the jerky, ditch the tap shoes, and meet me at the fence at four. We are crashing this party. Do we keep the goodie bags? Only if they contain cash. Move out. Bill zipped his tactical vest and grabbed the reinforced duffel. The canvas felt heavy and promising. Should I bring the bolt cutters, Bill asked. Leave them. The fence gate uses a combination lock, Molly said. The code is Timmy birth year. Two zero one eight. Brilliant. Parents never change the defaults. They never do. Remember, we grab the big boxes first. Electronics resell faster than wooden puzzles. Understood. I will prioritize anything with batteries. Good. Check your watch. Synchronize in three, two, one. Synchronized. Let us go bankrupt a birthday.
created_at
"2026-04-27T10:42:56.763Z"
4/27/2026, 10:42:56 AM
model
"qwen/qwen3.6-max-preview-20260420"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1109
moderation_latency
(null)
generation_time
335026
tokens_prompt
77
tokens_completion
12877
native_tokens_prompt
76
native_tokens_completion
12160
native_tokens_completion_images
(null)
native_tokens_reasoning
11411
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0949468
router
(null)
provider_responses
0
endpoint_id
"ca7fc0c7-f23d-4ac5-b2e0-945be5b61787"
id
"chatcmpl-99699600-3774-95dd-8bfe-2968bf7a8e0d"
is_byok
false
latency
1109
model_permaslug
"qwen/qwen3.6-max-preview-20260420"
provider_name
"Alibaba"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777286576-8AexAAtZ9Ay3NIl8TesF"
session_id
(null)
api_type
"completions"
id
"gen-1777286576-yAFJSkEY5zJDEjNICmfp"
upstream_id
"chatcmpl-99699600-3774-95dd-8bfe-2968bf7a8e0d"
total_cost
0.0949468
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
response_cache_source_id
(null)
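The metadata above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of NC Bench: the helper names format_duration and reasoning_share are hypothetical, and it assumes that native_tokens_reasoning is counted inside native_tokens_completion, which the page does not state.

```python
def format_duration(ms: int) -> str:
    """Convert a millisecond duration into an 'Xm Ys' string (whole seconds)."""
    total_seconds = ms // 1000
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


def reasoning_share(reasoning_tokens: int, completion_tokens: int) -> float:
    """Fraction of completion tokens spent on reasoning.

    Assumption: reasoning tokens are a subset of completion tokens.
    """
    if completion_tokens == 0:
        return 0.0
    return reasoning_tokens / completion_tokens


# Values copied from the metadata above.
print(format_duration(335026))                   # generation_time in ms
print(f"{reasoning_share(11411, 12160):.2%}")    # reasoning vs. native completion tokens
```

Under that assumption, the generation_time of 335026 ms corresponds to the 5m 35s duration shown in the run details, and most of the native completion tokens were spent on reasoning rather than on the visible scene.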
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 544
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 70.00% | neededClean: false, wordsTotal: 546, wordsDialogue: 0
0.0000%
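The dialogue evaluator reports wordsDialogue 0 for a response that contains no quotation marks. A minimal sketch of how such a ratio could be computed follows, assuming that dialogue is detected as text between double quotes (straight or curly). This is an illustration only, not NC Bench's actual evaluator; dialogue_stats and its target_ratio parameter are hypothetical names, and the whitespace-based word count is not expected to reproduce the 544 and 546 totals reported above.

```python
import re

# Assumption: dialogue is any text enclosed in double quotes, straight or curly.
QUOTED = re.compile(r'["\u201c]([^"\u201d]*)["\u201d]')


def dialogue_stats(text: str, target_ratio: float = 0.70) -> dict:
    """Count total words, quoted words, their ratio, and the deviation from a target."""
    words_total = len(text.split())
    words_dialogue = sum(len(span.split()) for span in QUOTED.findall(text))
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


# Same sentence with and without quotation marks.
unquoted = "Walk me through the perimeter again, Molly said."
quoted = '"Walk me through the perimeter again," Molly said.'
print(dialogue_stats(unquoted))
print(dialogue_stats(quoted))
```

With a quote-based detector like this one, unquoted speech counts as zero dialogue words however conversational the text reads, which would be consistent with the 0.00% ratio and 70.00% deviation recorded for this run.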