NC Bench
Run: 2026-04-27-01KQ7727QP7FP0MXJ6EVZS7TC0
Run Details
Model: Qwen 3.6 Flash
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 48.7s
Cost: $0.016302
Provider: Alibaba
Evaluation summary
Score       Evaluator
100.00%     Matches word count
98.94%      Dialogue to Total Word Ratio
99.4712%    Overall
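The overall figure is consistent with an unweighted mean of the two evaluator scores. The sketch below checks that arithmetic. The averaging rule is an assumption inferred from the numbers on this page, not documented behavior of the benchmark, and `overall_score` is a hypothetical helper name.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores in percent (assumed aggregation rule)."""
    if not scores:
        raise ValueError("at least one score is required")
    return sum(scores) / len(scores)


# Scores as displayed in the evaluation summary (rounded to two decimals).
displayed = [100.00, 98.94]
overall = overall_score(displayed)
print(f"{overall:.2f}%")
```

Under the same assumption, the displayed overall of 99.4712% implies that the second score is roughly 98.9424% before rounding.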
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The vinyl tablecloth crinkled under Bill’s elbow as he unfolded the laminated blueprint of the suburban backyard. Molly adjusted her neon headband, checking her reflection near a plastic sword. She traced a fingertip over the cluster of cardboard boxes arranged beneath the pine tree. The summer atmosphere pressed down on the cracked patio stones. Grass clippings and crushed chip bags littered the perimeter. Every detail of the upcoming celebration was mapped in meticulous handwriting. The inflatable dinosaur waited by the west gate. The mother of the birthday boy stood guard near the snack table. Standard juvenile defense grid. Bill tapped a grease pencil against the laminated paper and began pacing the narrow walkway. He explained the tactical sequence with the precision of a bomb disposal expert. Molly pulled a roll of wide packing tape from her pocket and unspooled it across the table. She examined the makeshift grappling hook prototype while the afternoon light shifted through the oak branches. Her partner adjusted the calibration dial on his bulky backpack and checked the inflation hose. The entire operation depended on synchronized timing and flawless execution. A beagle named Pickles snoozed near the flower beds. His lethargy provided a critical opening. Molly tested the plastic clamps on her wrists. The vacuum siphon hummed quietly as it warmed up. Her companion reviewed the neighborhood camera angles on a handheld monitor. They calculated the exact moment the string beans would run out. The entire scheme rested on patience, stealth, and impeccable comedic timing. They checked their boots. They verified the zip ties. They synchronized their digital watches. The suburban lawn stretched out like a tactical training ground, complete with sprinkler heads and suspiciously placed wading pools. They adjusted their collars. The neighborhood kids played soccer nearby. Every detail mattered. Molly smoothed the edge of the blueprint. 
Bill aligned his calibration gauge. The afternoon sun glared off the glass sliding doors. Every second brought them closer to the target zone. They adjusted their gloves. They inspected their footwear. The plan required absolute discipline. Check the wind direction first. "Check the perimeter again," she said. "The inflatable dinosaur guards the west gate. Standard juvenile defense grid." "The neighbor’s camera blinks red at dusk." "So we just walk in, grab the loot, and leave before the cake hits the table?" "Precision timing is everything. We enter at six forty two. That gives us a three minute window. We cannot hesitate. We cannot make eye contact." "What about the dog?" "A beagle named Pickles. We bypass him with cheese." "The vacuum bag siphon is ready?" "Fully charged. We inflate it, descend, attach it to the branches, and hoist the cardboard mountain into the linen closet. The presents vanish. Pack the rope just in case." "This is either brilliant or deeply illegal." "It is both. Assist me with the tension calibration. We need the gentle strength of a mother lifting her sleeping toddler." "Set for gentle lifting," she reported. "Excellent. Gather the empty grocery sacks. The heist begins."
created_at: "2026-04-27T10:16:53.248Z" (4/27/2026, 10:16:53 AM)
model: "qwen/qwen3.6-flash"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 612
moderation_latency: (null)
generation_time: 48511
tokens_prompt: 77
tokens_completion: 10665
native_tokens_prompt: 76
native_tokens_completion: 10855
native_tokens_completion_images: (null)
native_tokens_reasoning: 10221
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0163015
router: (null)
provider_responses:
  0:
    endpoint_id: "2435bec9-c808-4f27-9fc8-12f3460d7b66"
    id: "chatcmpl-9bc09daa-bea6-9556-8a4c-fec045ba230c"
    is_byok: false
    latency: 612
    model_permaslug: "qwen/qwen3.6-flash"
    provider_name: "Alibaba"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777285013-6hb5MA4JNUxSU9ilQnpQ"
session_id: (null)
api_type: "completions"
id: "gen-1777285013-odtc4ZKWF2ILl1Vp2ViX"
upstream_id: "chatcmpl-9bc09daa-bea6-9556-8a4c-fec045ba230c"
total_cost: 0.0163015
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
response_cache_source_id: (null)
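A few derived figures help when reading this metadata, such as how much of the completion was reasoning and how fast tokens were generated. The sketch below computes them from the values above. It assumes that `generation_time` is in milliseconds (it lines up with the 48.7s duration) and that `native_tokens_completion` includes the reasoning tokens; neither is stated on this page, and the variable names for the derived values are illustrative.

```python
# Values copied from the run metadata.
generation_time_ms = 48511        # assumed unit: milliseconds
native_tokens_completion = 10855  # assumed to include reasoning tokens
native_tokens_reasoning = 10221

# Tokens left over for the visible answer once reasoning is subtracted.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Fraction of the completion spent on reasoning.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Average generation speed over the whole completion.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

If these assumptions hold, most of the completion budget went to reasoning rather than to the roughly 500-word scene itself.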
Evaluation details
Result      Evaluator                       Details                            Meta Data
100.00%     Matches word count              n/a                                neededClean: false, words: 499
98.94%      Dialogue to Total Word Ratio    Ratio: 30.46%, Deviation: 0.46%    neededClean: false, wordsTotal: 499, wordsDialogue: 152
99.4712%    Overall
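The ratio and deviation reported above follow directly from the two word counts. The sketch below reproduces them and shows one plausible way to count dialogue words. The counting rules (whitespace-separated words, dialogue delimited by straight double quotes) and all function names are assumptions for illustration; the evaluator's actual implementation is not shown on this page, and neither is the formula that maps the deviation to the 98.94% score.

```python
import re


def count_words(text):
    """Count whitespace-separated tokens (assumed word definition)."""
    return len(text.split())


def count_dialogue_words(text):
    """Count words inside straight double quotes (assumed dialogue definition)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue, words_total, target=0.30):
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts taken from the evaluation details.
ratio, deviation = ratio_and_deviation(152, 499)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Small made-up sample to exercise the counters.
sample = 'Bill nodded. "Check the perimeter again," she said.'
print(count_words(sample), count_dialogue_words(sample))
```

With 152 dialogue words out of 499, the ratio lands within half a percentage point of the 30% target in the scenario.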