NC Bench
Run: 2026-04-27-01KQ7837XSF40QDQ1PY8KH80N2
Run Details
Model: Qwen3.6 Max Preview
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 4m 53s
Cost: $0.081234
Provider: Alibaba
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The architectural schematics covered the kitchen island, anchored by a stale donut and a crumpled invitation featuring a cartoon unicorn. Molly adjusted her tactical vest and traced a highlighter across the suburban floor plan. Bill hovered nearby, inspecting a plastic party hat with profound professional skepticism. He turned the cardboard cone over in his calloused hands, weighing its structural integrity against their upcoming operation. The refrigerator hummed steadily while a distant lawnmower droned through the open window, completely unaware of the criminal conspiracy unfolding over checkered linoleum. Morning light caught the dust motes dancing above the stolen blueprints. A calendar on the wall displayed circled dates marking previous successful jobs across three counties. We enter during musical chairs, Molly whispered, marking the foyer. Parental vigilance collapses when children shove for seats. Bill dropped the hat and crossed his arms. He studied the highlighted zones with genuine tactical concern, mentally mapping sightlines and potential choke points near the snack bowls. He adjusted his collar and exhaled slowly. I worry about the clown. Professionals recognize professionals. If he spots us, he blows a whistle. Our timeline dissolves. The entertainer is strictly hired help, Molly countered. He arrives at three to perform mediocre puppetry. We utilize that distraction. We sweep the presents into tote bags and exit through the gate. Bill rubbed his temples, clearly calculating the psychological toll of the mission. He paced across the tile floor, avoiding a scattered pile of stolen hotel keycards and expired coupons. The ceiling fan wobbled overhead, casting rotating shadows across his worried expression. What about the pinata? Suspended objects are liabilities. One swing ruined my knee in Brussels. I refuse to navigate a stampede for plastic watches. I already neutralized that variable, Molly said. The child demands an early strike. 
The candy avalanche creates a diversion near the patio. We grab the merchandise and disappear. Bill sighed and picked up a rolled map. Acceptable. But if a kid tackles me, I keep the helicopter. I have boundaries. Molly snapped her highlighter cap shut and gathered the documents into a weathered leather satchel. She checked her wristwatch, noting the synchronized countdown with practiced efficiency. The afternoon shadows stretched across the linoleum, marking the final moments of peaceful preparation. A neighbor dog barked twice, signaling the approaching hour. She tightened her bootlaces and checked the pocket knife resting against her hip. The kitchen clock ticked forward, marking the irreversible commitment to the afternoon agenda. She zipped the bag and straightened her posture. Wear the oversized sneakers and memorize the escape route. We deploy in fifteen minutes. Do not engage with the magic show.
created_at: "2026-04-27T10:34:54.784Z" (4/27/2026, 10:34:54 AM)
model: "qwen/qwen3.6-max-preview-20260420"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1441
moderation_latency: (null)
generation_time: 292092
tokens_prompt: 77
tokens_completion: 10483
native_tokens_prompt: 76
native_tokens_completion: 10402
native_tokens_completion_images: (null)
native_tokens_reasoning: 9846
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0812344
router: (null)
provider_responses: 0
endpoint_id: "ca7fc0c7-f23d-4ac5-b2e0-945be5b61787"
id: "chatcmpl-b84484e7-c2e0-9281-a447-2608d64615f9"
is_byok: false
latency: 1441
model_permaslug: "qwen/qwen3.6-max-preview-20260420"
provider_name: "Alibaba"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777286094-tGUfHjYwH2H7kHsZtVVb"
session_id: (null)
api_type: "completions"
id: "gen-1777286094-83T8zMgSL52nir55Elrm"
upstream_id: "chatcmpl-b84484e7-c2e0-9281-a447-2608d64615f9"
total_cost: 0.0812344
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
response_cache_source_id: (null)
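The token and timing fields above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes that latency and generation_time are reported in milliseconds and that the reasoning tokens are a subset of the native completion tokens, neither of which the page states. The function names are hypothetical.

```python
# Values copied from the metadata above.
NATIVE_TOKENS_COMPLETION = 10402
NATIVE_TOKENS_REASONING = 9846
GENERATION_TIME = 292092  # assumed to be milliseconds
LATENCY = 1441            # assumed to be milliseconds


def visible_tokens(completion: int, reasoning: int) -> int:
    """Completion tokens left over after subtracting reasoning tokens.

    Assumes reasoning tokens are counted inside the completion total.
    """
    return completion - reasoning


def reasoning_share(completion: int, reasoning: int) -> float:
    """Fraction of completion tokens spent on reasoning."""
    return reasoning / completion if completion else 0.0


def as_minutes_seconds(total_ms: int) -> tuple[int, int]:
    """Convert milliseconds to whole (minutes, seconds), truncating."""
    return divmod(total_ms // 1000, 60)


if __name__ == "__main__":
    print(visible_tokens(NATIVE_TOKENS_COMPLETION, NATIVE_TOKENS_REASONING))
    print(round(reasoning_share(NATIVE_TOKENS_COMPLETION, NATIVE_TOKENS_REASONING), 3))
    print(as_minutes_seconds(GENERATION_TIME + LATENCY))
```

Under these assumptions, roughly 95% of the completion tokens went to reasoning, and latency plus generation time comes to 4m 53s, which matches the reported Duration. Whether the site actually derives Duration this way is not confirmed by the page.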
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 436
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 30.00% | neededClean: false, wordsTotal: 436, wordsDialogue: 0
0.0000% | Overall
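The ratio and deviation figures above can be illustrated with a short sketch. This is not the NC Bench evaluator, whose implementation the page does not show. It assumes that dialogue is detected by double quotation marks (straight or curly), that words are whitespace-separated, and that deviation is the absolute difference from the 30% target. The names dialogue_ratio and deviation are hypothetical.

```python
import re

# Matches text enclosed in straight or curly double quotes (an assumption
# about how dialogue is detected; the real evaluator may differ).
QUOTED = re.compile(r'["\u201c]([^"\u201d]*)["\u201d]')


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a passage."""
    total = len(text.split())
    dialogue = sum(len(span.split()) for span in QUOTED.findall(text))
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation(ratio: float, target: float = 0.30) -> float:
    """Absolute distance between the measured ratio and the target."""
    return abs(ratio - target)


if __name__ == "__main__":
    quoted = 'Molly nodded. "We go at three," she said.'
    unquoted = "We go at three, Molly whispered."
    print(dialogue_ratio(quoted))    # dialogue counted inside the quotes
    print(dialogue_ratio(unquoted))  # no quotes, so no dialogue is counted
    print(deviation(dialogue_ratio(unquoted)[2]))
```

A quote-based detector like this one would count zero dialogue words in a response that marks speech only with tags such as "Molly whispered" and no quotation marks, which is consistent with the reported wordsDialogue of 0 and Deviation of 30.00%.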