NC Bench
Run: 2026-04-27-01KQ6ZM1X3AR9ME15MSDVMT6W3
Run Details
Model: Qwen 3.5 Plus (2026-04-20)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 3m 18s
Cost: $0.027875
Provider: Alibaba
Evaluation summary
Score      Evaluator
98.71%     Matches word count
0.00%      Dialogue to Total Word Ratio
49.3562%   Overall
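The overall score is consistent with an unweighted mean of the two evaluator scores. The sketch below checks that arithmetic; the averaging rule is an assumption inferred from the numbers on this page, not a documented NC Bench formula, and the function name `overall` is hypothetical.

```python
# Evaluator scores as displayed in the summary, in percent.
scores = {
    "Matches word count": 98.71,
    "Dialogue to Total Word Ratio": 0.00,
}


def overall(scores):
    """Unweighted mean of evaluator scores (assumed aggregation rule)."""
    return sum(scores.values()) / len(scores)


reported_overall = 49.3562
estimate = overall(scores)

# The displayed inputs are rounded to two decimals, so compare with a tolerance.
print(f"estimated overall: {estimate:.4f}%, reported: {reported_overall}%")
print("consistent within rounding:", abs(estimate - reported_overall) < 0.01)
```

The small gap between the estimate and the reported value is what one would expect if the word-count score is stored with more precision than the two decimals shown.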
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The blueprints were sketched in wax crayon on a grease stained napkin. Molly traced the fence line with a sticky index finger. Bill leaned against a humming vending machine, tossing colorful balloons into the air. He caught them without looking. The operation demanded absolute focus. It also demanded a profound disregard for conventional morality. They stood near a cracked asphalt lot behind a strip mall. Shadows stretched long across the parking spaces. A distant siren wailed and faded. Molly smoothed the napkin against her thigh. She stared at the crude circle marking the gift table. Bill finally stopped juggling. He stuffed the balloons into a duffel bag. He adjusted his borrowed waiter jacket. The fabric smelled like cheap perfume and desperation. They knew they had one shot. The event started at ten. They would arrive at eleven. The children would be distracted by a clown named Zucchini. That was the perfect cover. We grab the toys first, Molly explained, pointing at the drawing. Focus on the electronics and the premium building sets. Ignore the books. Ignore the clothing bundles. The parents will check those receipts. We avoid tracing back to us. Bill rubbed his chin. What about the wrapping paper. Do we keep the shiny foil or strip it. Strip it immediately. Paper cuts are amateur mistakes. We use plain brown bags. They look like moving boxes. Nobody questions moving boxes. Moving boxes at a birthday party. That seems suspicious. Not if we carry them out the service door. The caterers will be packing up trash. We just blend into their routine. We become invisible background staff. Background staff with twenty thousand dollars in merchandise. Precisely. Low risk if we maintain eye contact and walk briskly. Panic triggers cameras. Calmness defeats sensors. You make it sound simple. It is simple. We enter through the kitchen. We bypass the ring light photographer. We swap the real gifts with weighted dummies. We exit before the cake cutting ceremony. 
Weighted dummies. You actually brought bricks. Foam padding wrapped in duct tape. Sounds heavy. Feels heavy. Deceives the eye. Works every single time. Bill nodded slowly. And the children. They will wake up tomorrow to empty tables. They will have cake. They will have ice cream. They will forget the presents by Tuesday. We will not forget the payout. Business is business. They left the parking lot behind, leaving only tire marks on the cracked pavement. The afternoon sun dipped lower, casting long silhouettes across the empty storefronts. Molly checked her reflection in a darkened window. Her disguise looked flawless. Bill adjusted his collar and smoothed his hair. He gripped the steering wheel with steady hands. The engine purred like a satisfied cat. They turned onto the main avenue, navigating past crowded crosswalks and bustling sidewalks. The destination appeared on the horizon, a sprawling house wrapped in balloons and flags. The air grew heavy with anticipation. Every second counted. The route is clear. Traffic flows smoothly. Check your pockets. Verify the exit map. Follow me closely. Stay quiet. She ordered.
created_at: "2026-04-27T08:07:11.604Z" (4/27/2026, 8:07:11 AM)
model: "qwen/qwen3.5-plus-20260420"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 683
moderation_latency: (null)
generation_time: 174679
tokens_prompt: 77
tokens_completion: 9475
native_tokens_prompt: 76
native_tokens_completion: 11602
native_tokens_completion_images: (null)
native_tokens_reasoning: 10953
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0278752
router: (null)
provider_responses:
  0:
    endpoint_id: "8b4a0fbb-6705-4f8d-a7f1-ad18ecc0edbc"
    id: "chatcmpl-37fc9a4e-4fe7-9631-af5a-eefc0b0afa58"
    is_byok: false
    latency: 683
    model_permaslug: "qwen/qwen3.5-plus-20260420"
    provider_name: "Alibaba"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777277231-lKs48rRKeycQOjCfrG3O"
session_id: (null)
api_type: "completions"
id: "gen-1777277231-KZ0VdH8y0PUKO7fq319W"
upstream_id: "chatcmpl-37fc9a4e-4fe7-9631-af5a-eefc0b0afa58"
total_cost: 0.0278752
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
response_cache_source_id: (null)
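The token counts in the metadata above can be combined into a few derived figures. This is a rough sketch under two assumptions that the page does not state: that `generation_time` is in milliseconds, and that `native_tokens_reasoning` is included in `native_tokens_completion`.

```python
# Values copied from the run metadata.
native_tokens_completion = 11602
native_tokens_reasoning = 10953
generation_time_ms = 174679  # assumed to be milliseconds

# Tokens left for the visible answer, if reasoning is part of the completion count.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Fraction of the completion spent on reasoning.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Average generation speed over the whole completion.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumptions, most of the completion tokens went to reasoning rather than to the roughly 500-word scene.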
Evaluation details
Matches word count
  Result: 98.71%
  Details: n/a
  Meta Data: neededClean: false; words: 506
Dialogue to Total Word Ratio
  Result: 0.00%
  Details: Ratio: 0.00%, Deviation: 50.00%
  Meta Data: neededClean: false; wordsTotal: 506; wordsDialogue: 0
Overall: 49.3562%
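The model response contains no quotation marks, which would explain a dialogue count of 0 if the evaluator identifies dialogue by quoted spans. The sketch below illustrates that idea; the quote-based rule, the function names `dialogue_ratio` and `deviation`, and the whitespace word split are all assumptions for illustration, not the actual NC Bench evaluator.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue words, total words, ratio), counting quoted spans as dialogue."""
    total = len(text.split())
    # Capture text between straight or curly double quotes (assumed dialogue marker).
    quoted = re.findall(r'["“]([^"”]*)["”]', text)
    dialogue = sum(len(span.split()) for span in quoted)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation(ratio: float, target: float = 0.5) -> float:
    """Absolute distance from the target dialogue share (50% in this scenario)."""
    return abs(ratio - target)


# A sentence from the response as written, without quotation marks.
unquoted = "We grab the toys first, Molly explained, pointing at the drawing."
# The same sentence with the spoken part quoted.
quoted = '"We grab the toys first," Molly explained, pointing at the drawing.'

print(dialogue_ratio(unquoted))  # 0 dialogue words out of 11
print(dialogue_ratio(quoted))    # 5 dialogue words out of 11
print(deviation(dialogue_ratio(unquoted)[2]))  # 0.5, matching the 50.00% deviation
```

With a rule like this, unquoted speech is counted entirely as narration, so the reported ratio of 0.00% and deviation of 50.00% follow directly from `wordsDialogue` being 0.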