NC Bench
Run: 2026-04-27-01KQ7089A9047RT3BMJ6BY8R8Y
Run Details
Model: Qwen 3.5 Plus (2026-04-20)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 2m 28s
Cost: $0.021748
Provider: Alibaba
Evaluation summary
Score | Evaluator
75.16% | Matches word count
0.03% | Dialogue to Total Word Ratio
37.5917% | Overall
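The Overall figure looks consistent with an unweighted mean of the two evaluator scores, although the page does not state the formula. The following is a minimal Python sketch under that assumption. The function name `overall_score` is hypothetical, and because the displayed scores are rounded, the result only approximates the reported 37.5917%.

```python
def overall_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores in percent (assumed aggregation)."""
    if not scores:
        raise ValueError("need at least one score")
    return sum(scores) / len(scores)


# Rounded scores as displayed for this run.
approx = overall_score([75.16, 0.03])
```

The small gap between this value and the reported Overall would be explained by the evaluators keeping more decimal places internally than the page displays.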
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly unrolled the blueprint across the sticky laminate of the diner booth, using a ketchup packet and a lukewarm mozzarella stick to anchor the corners against the draft from the kitchen door. The schematic depicted the Johnson residence with obsessive detail, complete with red arrows tracing a route through the overgrown azaleas to the patio, and a frantic doodle of a lightning bolt marking the outdoor lighting fuse box. Colored pins punctured the paper to denote zones of high risk; a yellow pin marked the bouncy castle, a hazard due to static electricity buildup, while a stark black pin indicated the location of the grandfather clock, known to chime with the subtlety of a church bell. Bill shifted in the vinyl seat, the plastic buckles of his tactical vest clicking nervously. The vest was a monstrosity of repurposed gear, straining against his floral shirt and loaded with gadgets including a grappling hook fashioned from bent coat hangers, a laser pointer that emitted a sad red dot, binoculars that leaked grease, and three packets of instant oatmeal pilfered from a breakfast buffet heist three years prior. Each oatmeal packet was labeled with a scrawled note regarding caloric density and morale boosting properties. Molly watched him fiddle with a carabiner attached to a plastic spade, her eyes narrowing as she calculated the odds of his incompetence compromising the extraction phase. She adjusted her own equipment, a nondescript tote bag containing the decoy presents constructed from weighted socks and a roll of duct tape that had seen better decades. The diner around them hummed with the mundane rhythm of late-afternoon indifference, a waitress refilling coffee cups and a man in the corner arguing with a crossword puzzle, oblivious to the high-stakes planning occurring in booth four. "The package is primed," Bill whispered. "Target is the Super-Doodle-3000." Molly scraped gummy bear stickers off her water glass, her expression flat. "Buster is a pug. 
Thirty-five-minute window." Bill produced binoculars and peered through them upside down. "We weaponize the piñata. I distract, you loot." "We failed the magician audition," Molly countered, snatching the binoculars and shoving them into his utility belt. "We are catering. Swap presents, exit the gate. Do you have the decoys?" "Decoys are ready," Bill replied, tapping his chest rig. "Sock weights." Bill slumped, the vest groaning under the weight of his disappointment. "Hairnet lacks ballistic protection. I have pickles." "Pickles stay jarred," Molly hissed, checking her watch, its face scratched but the second hand sweeping with ruthless precision. "Kick the gate, grab the box. Stammer if asked." Bill nodded, adjusting his night-vision goggles in the bright diner light until the world turned a grainy green and he immediately bumped his forehead against the sugar dispenser. Molly groaned, signaling for the check with a sharp wave that nearly knocked over the salt shaker. "Move quietly. No wrappers. Grab the Doodle. Leave snacks. If you fail for juice, I leave you."
created_at
"2026-04-27T08:18:02.688Z"
4/27/2026, 8:18:02 AM
model
"qwen/qwen3.5-plus-20260420"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
741
moderation_latency
(null)
generation_time
136422
tokens_prompt
77
tokens_completion
9598
native_tokens_prompt
76
native_tokens_completion
9049
native_tokens_completion_images
(null)
native_tokens_reasoning
8381
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.021748
router
(null)
provider_responses
0
endpoint_id
"8b4a0fbb-6705-4f8d-a7f1-ad18ecc0edbc"
id
"chatcmpl-9237e72b-7eeb-9a5a-a519-7afe9f2cb30c"
is_byok
false
latency
736
model_permaslug
"qwen/qwen3.5-plus-20260420"
provider_name
"Alibaba"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777277882-pKkRwjNQpfBE0r766bfk"
session_id
(null)
api_type
"completions"
id
"gen-1777277882-91wOSjph3SMK6LAJgb8u"
upstream_id
"chatcmpl-9237e72b-7eeb-9a5a-a519-7afe9f2cb30c"
total_cost
0.021748
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
response_cache_source_id
(null)
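The token fields above can be combined to see how much of the completion was spent on reasoning rather than on the visible scene. The sketch below is illustrative only and rests on two assumptions that the page does not confirm: that `native_tokens_completion` includes the reasoning tokens, and that `generation_time` is reported in milliseconds.

```python
# Values copied from the run metadata above.
native_tokens_completion = 9049
native_tokens_reasoning = 8381
generation_time_ms = 136422  # assumed unit: milliseconds

# Tokens left for the visible response, assuming reasoning is counted
# inside the completion total.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Share of the completion spent on reasoning, in percent.
reasoning_share = 100.0 * native_tokens_reasoning / native_tokens_completion

# Rough throughput over the whole generation.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000.0)

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1f}%")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumptions, most of the completion budget went to reasoning, with a few hundred tokens left for the roughly 500-word scene.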
Evaluation details
Result | Evaluator | Details | Meta Data
75.16% | Matches word count | n/a | neededClean: false, words: 487
0.03% | Dialogue to Total Word Ratio | Ratio: 17.21%, Deviation: 12.79% | neededClean: false, wordsTotal: 494, wordsDialogue: 85
37.5917% | Overall
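The reported ratio and deviation follow from the word counts: 85 dialogue words out of 494 total is about 17.21%, which is 12.79 percentage points short of the 30% target. The Python sketch below reproduces that arithmetic. It also shows one plausible way to count dialogue words, by treating text inside straight double quotes as dialogue. The function `dialogue_stats` and its counting rule are assumptions for illustration, not the benchmark's actual evaluator.

```python
import re


def dialogue_stats(text: str, target_pct: float = 30.0) -> dict:
    """Count words inside straight double quotes versus all words.

    Hypothetical helper: the real evaluator's tokenization is not documented
    on this page and may differ.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = 100.0 * dialogue / total if total else 0.0
    return {
        "wordsTotal": total,
        "wordsDialogue": dialogue,
        "ratio": ratio,
        "deviation": abs(target_pct - ratio),
    }


# Check the figures reported for this run (85 dialogue words of 494).
reported_ratio = 100.0 * 85 / 494
reported_deviation = 30.0 - reported_ratio

# Small usage example on a two-sentence sample.
sample = 'Molly sighed. "Move quietly," she said. "No wrappers."'
stats = dialogue_stats(sample)
```

A quote-based counter like this one would miss dialogue written with curly quotes or without quotation marks, so a production evaluator would likely need to normalize the text first.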