NC Bench
Run: 2026-04-27-01KQ7977HB1NWZAD2VS5XT9WPA
Run Details
Model:
Qwen 3.6 Flash
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
45.8s
Cost:
$0.014311
Provider:
Alibaba
Evaluation summary
Score | Evaluator
68.10% | Matches word count
82.41% | Dialogue to Total Word Ratio
75.2548% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"You realize we require tactical precision and pastel camouflage," Molly stated, adjusting her neon visor. She measured the carpeted hallway with deliberate steps, checking door clearance. The dim storage room smelled faintly of sugar and desperation. Dust motes danced in the single overhead bulb as they reviewed the floor plan. Bill smoothed grease-stained blueprints across a folding chair. "I have two options. We infiltrate as adult supervision or repackage as the inflatable bouncy castle." "The castle features excellent sightlines but terrible ventilation," she observed, cataloging party favor bags. "They hired a professional magician. Not some guy with rubber chickens." "We go as cupcakes. I secured the frosting kit." "Bill, we are professional thieves. We do not become baked goods." "Molly, we sleep behind a taffy shop. Priorities shift." "Cupcakes it is. I demand red sprinkles. Black makes me look like a felon, though I am one." She unrolled a house schematic over the worktable. Bill peeled adhesive backing from masking tape, aligning exit routes. The tactical team operates on invisible frequencies and synchronized wrist movements. Every second counts when children hold unlimited attention spans. "Noted. Who guards the wrapping paper?" "A Golden Retriever named Snickers. Treats work every time." "We possess three dental chews and a bruised grape. Inadequate." "Desperate times require desperate snacks. What is the prize?" "The entire goodie pile. Tablets, a plush dinosaur, and a LEGO cathedral." "Plastic bricks and salvation. Theology or investment?" "Both. The cake drops at seven. Parents drink wine and give speeches. We slip through the mudroom. Clear the baseboards. Bag the loot." Molly pressed her palm against the front door, monitoring street traffic. Bill fastened a carabiner to his belt, verifying sack tension. Rain began percussion against the windowpane, masking the subtle vibrations of the approaching neighborhood. "What if a child spots us?" 
"We activate the distraction protocol." "Which is?" "I fall near the piñata." "You execute a controlled stumble into paper while I collect merchandise. Do not compromise my exit vector." "My fall is a masterpiece of misdirection. Parents applaud accidental comedy." "Correct. We exit through the window. We rendezvous at the van." "We park three blocks away. We avoid the carpool." "Obviously. We divide assets. I claim the dinosaur." "Why?" "It matches my scarf. Sentimental attachment." "I take the cathedral. I favor plastic architecture." Molly compressed her duffel bag while Bill inspected trash bag clips. She checked her reflection in the dim hallway mirror. Bill adjusted his powder-blue latex gloves. "Sentimentality complicates inventory management." "Moving day becomes difficult. Dinosaur for me. Cathedral for you. Tablets for me." "Lithium cells?" "Naturally. I require phone power to document the operation." "Do not upload footage. Law enforcement monitors social media. Party organizers track digital footprints." "We go offline. Destroy the disguises. Abandon the frosting. Forget the dog." "Seven o'clock. Cupcake formation. Strike after the final toast." "Molly?" "Yes?" "The red sprinkles complement your gear." "Check your grip tape, Bill."
created_at
"2026-04-27T10:54:34.033Z"
4/27/2026, 10:54:34 AM
model
"qwen/qwen3.6-flash"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
593
moderation_latency
(null)
generation_time
45714
tokens_prompt
77
tokens_completion
9613
native_tokens_prompt
76
native_tokens_completion
9528
native_tokens_completion_images
(null)
native_tokens_reasoning
8813
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.014311
router
(null)
provider_responses
0
endpoint_id
"2435bec9-c808-4f27-9fc8-12f3460d7b66"
id
"chatcmpl-f932a0e9-1387-9ae0-84b8-06332567f301"
is_byok
false
latency
593
model_permaslug
"qwen/qwen3.6-flash"
provider_name
"Alibaba"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777287274-dkrq0l0z0uWKG9Hs9Luj"
session_id
(null)
api_type
"completions"
id
"gen-1777287274-dEmueCckFxUtrDQPhezs"
upstream_id
"chatcmpl-f932a0e9-1387-9ae0-84b8-06332567f301"
total_cost
0.014311
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
response_cache_source_id
(null)
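A few derived figures follow from the metadata above. The snippet below is a minimal sketch, not something the page itself reports: it assumes `generation_time` is expressed in milliseconds and that `native_tokens_reasoning` is counted inside `native_tokens_completion`. The variable names are illustrative.

```python
# Values copied from the run metadata above.
native_tokens_completion = 9528
native_tokens_reasoning = 8813
generation_time_ms = 45714  # assumption: the unit is milliseconds

# Assumption: reasoning tokens are a subset of the completion tokens,
# so the remainder is the text actually shown as the model response.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

If those assumptions hold, only a small fraction of the completion tokens make up the visible scene, with the rest spent on reasoning.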
Evaluation details
Result | Evaluator | Details | Meta Data
68.10% | Matches word count | n/a | neededClean: false, words: 486
82.41% | Dialogue to Total Word Ratio | Ratio: 68.03%, Deviation: 1.97% | neededClean: false, wordsTotal: 488, wordsDialogue: 332
75.2548% | Overall | |
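The reported ratio and deviation can be reproduced from the word counts above. The sketch below assumes the ratio is `wordsDialogue / wordsTotal`, the deviation is its absolute distance from the 70% target in the scenario, and the overall score is the unweighted mean of the two evaluator scores. The page does not state these formulas, and it does not show how a deviation is converted into the 82.41% score, so that step is left out.

```python
# Values copied from the evaluation details above.
words_dialogue = 332
words_total = 488
target_ratio_pct = 70.0  # from the scenario: "70% dialogue"

# Assumption: ratio is dialogue words over total words, as a percentage.
ratio_pct = 100 * words_dialogue / words_total
# Assumption: deviation is the absolute distance from the target.
deviation_pct = abs(target_ratio_pct - ratio_pct)

# Assumption: the overall score is an unweighted mean of the evaluator
# scores. The rounded inputs give about 75.255, close to the reported
# 75.2548%, which suggests the page averages unrounded values.
evaluator_scores = [68.10, 82.41]
overall_pct = sum(evaluator_scores) / len(evaluator_scores)

print(f"ratio: {ratio_pct:.2f}%")
print(f"deviation: {deviation_pct:.2f}%")
print(f"overall: {overall_pct:.3f}%")
```

Note that the two evaluators report slightly different totals (486 versus 488 words), so they presumably tokenize the response differently.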