NC Bench
Run: 2026-04-27-01KQ6YZXMPTDWC6Q7EMDBZ337T
Run Details
Model: Qwen 3.5 Plus (2026-04-20)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 3m 29s
Cost: $0.033350
Provider: Alibaba
Evaluation summary
Matches word count: 99.38%
Dialogue to Total Word Ratio: 0.00%
Overall: 49.6885%
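The Overall figure appears to be the unweighted mean of the two evaluator scores. The sketch below assumes equal weights (the page does not state the weighting) and uses the rounded scores as displayed, so it can differ from the reported 49.6885% in the third decimal place.

```python
# Evaluator scores as displayed on this page (rounded to two decimals).
scores = {
    "Matches word count": 99.38,
    "Dialogue to Total Word Ratio": 0.00,
}

# Assumption: Overall is the plain average of the evaluator scores.
overall = sum(scores.values()) / len(scores)

print(f"Overall (estimated): {overall:.2f}%")
```

Under this assumption, a zero on either evaluator caps the overall score at 50%, no matter how close the other evaluator is to perfect.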
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The damp garage smelled of old motor oil. Molly unrolled a crisp blueprint on a wobbly table. Bill adjusted his cargo pants and examined a plastic mask. The operation required absolute precision. The team ignored childhood joy. They shared a quiet nod. Two professionals stood ready. The heist began now. We enter when the bouncy castle inflates, Molly said, tracing the fence line. Parents will be watching the toddlers. The gifts sit exposed near the kitchen doors. Bill nodded slowly. The cake cuts happen at two. That gives us ninety quiet minutes. We avoid the lemonade stand. It is a hazard. She smoothed the paper and tapped the grill coordinates with a dry marker. The host father loved smoking ribs and hated unsolicited advice. Bill leaned closer, his shadow stretching across the routes. They had rehearsed this sequence exactly three times. Mistakes would ruin everything. Perfect execution was mandatory. Second chances vanished. We distract him, she replied, pointing at the backyard barbecue. You will drag the red wagon through the gate. Keep it low. Keep it quiet. He frowned at the diagram. The wagon squeaks. The axle failed. Bring the wrench. Forget the noise. We use lubricant. We use patience. We use panic. Panic leaves fingerprints and raises questions among suburban neighbors. Molly packed the blueprints into a waterproof satchel. She adjusted her wire frame glasses. Bill tested the zipper on a heavy canvas duffel twice. The afternoon sun filtered through dusty windows. Light painted shapes on the concrete floor. Time was ticking. What about the piñata, he asked, eyes suddenly lighting like Christmas bulbs. There is always candy inside. Could we siphon some off the top for morale. She slammed her marker down. The children swing wooden spoons with terrifying accuracy. They possess zero regard for collateral damage. Stick to the table. Quickly. The table held the real assets and minimal risk factors. Bill crossed his arms and stared at the overhead bulb.
He nodded slowly, accepting her cold professional logic. The heavy canvas duffel sat ready by the workbench. Every tool was accounted for. Every escape route memorized. They stood in silence. We abandon the site immediately, she replied, rolling the cylinder tight. We never run. We simply vanish. Like smoke. Like bad dreams. He zipped the final compartment and checked the latch. I prefer running. Running feels faster. She shook her head and tucked the marker away. Running leaves traces. Always. Traces led back to them, and work mattered more than sleep. The garage door rattled open, admitting pale afternoon light. Bill slung the heavy duffel over his shoulder. Molly grabbed the blueprints and marched toward the exit. Two seasoned professionals prepared to rob a childhood celebration. They moved. Without hesitation. Ready, Molly whispered, stepping onto the cracked driveway. Bill cracked his neck and adjusted his bow tie. Let us steal the magic. He grinned, revealing a chipped incisor. The neighborhood sleeps soundly tonight. We leave before sunrise. Just grab what fits. Do not linger. Do not feel guilt. We are professionals.
created_at: "2026-04-27T07:55:48.895Z" (4/27/2026, 7:55:48 AM)
model: "qwen/qwen3.5-plus-20260420"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 870
moderation_latency: (null)
generation_time: 209110
tokens_prompt: 77
tokens_completion: 13386
native_tokens_prompt: 76
native_tokens_completion: 13883
native_tokens_completion_images: (null)
native_tokens_reasoning: 13227
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0333496
router: (null)
provider_responses
0
endpoint_id: "8b4a0fbb-6705-4f8d-a7f1-ad18ecc0edbc"
id: "chatcmpl-0490f8f3-f278-9012-a38a-81a6c62b2a85"
is_byok: false
latency: 870
model_permaslug: "qwen/qwen3.5-plus-20260420"
provider_name: "Alibaba"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777276548-hAR4Oz9IFSlO4gTUYpev"
session_id: (null)
api_type: "completions"
id: "gen-1777276548-qmeRVdHSQaY8kMC3eVY2"
upstream_id: "chatcmpl-0490f8f3-f278-9012-a38a-81a6c62b2a85"
total_cost: 0.0333496
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
response_cache_source_id: (null)
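The token, timing, and cost fields above can be cross-checked against each other. The sketch below assumes `generation_time` is in milliseconds and that reasoning tokens are billed as completion tokens. The two per-million-token rates are hypothetical values chosen because they reproduce `total_cost`; they are not quoted from the provider or from this page.

```python
# Values copied from the metadata fields above.
native_tokens_prompt = 76
native_tokens_completion = 13883
native_tokens_reasoning = 13227
generation_time_ms = 209110  # assumed to be milliseconds
total_cost = 0.0333496

# Tokens left for the visible answer once reasoning tokens are subtracted.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

# 209110 ms truncates to 209 s, which matches the "3m 29s" duration.
minutes, seconds = divmod(generation_time_ms // 1000, 60)

# Hypothetical USD rates per million tokens, inferred from the totals only.
INPUT_USD_PER_M = 0.40
OUTPUT_USD_PER_M = 2.40
estimated_cost = (
    native_tokens_prompt * INPUT_USD_PER_M
    + native_tokens_completion * OUTPUT_USD_PER_M
) / 1_000_000

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share of completion: {reasoning_share:.1%}")
print(f"duration: {minutes}m {seconds}s")
print(f"estimated cost: ${estimated_cost:.7f} (reported: ${total_cost})")
```

Read this way, the large majority of the completion tokens, and therefore of the cost, went to reasoning and not to the roughly 500-word scene itself.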
Evaluation details
Evaluator: Matches word count
Result: 99.38%
Details: n/a
Meta Data: neededClean: false, words: 505
Evaluator: Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 0.00%, Deviation: 50.00%
Meta Data: neededClean: false, wordsTotal: 505, wordsDialogue: 0
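A `wordsDialogue` value of 0 is consistent with the response containing no quotation marks: the spoken lines are written as bare prose, so an evaluator that looks for quoted text would find none. The sketch below assumes the evaluator counts whitespace-separated words inside straight or curly double quotes. The actual rules are not shown on this page, and `dialogue_ratio` is a hypothetical helper name.

```python
import re

TARGET_RATIO = 0.50  # the scenario asks for 50% dialogue


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue words, total words, ratio) for a passage."""
    total = len(text.split())
    # Assumption: dialogue is whatever sits between straight or curly double quotes.
    quoted = re.findall(r'"([^"]*)"|“([^”]*)”', text)
    dialogue = sum(len((a or b).split()) for a, b in quoted)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


# A sentence from the response as written: no quotation marks.
unquoted = "We enter when the bouncy castle inflates, Molly said, tracing the fence line."
# The same sentence with the spoken part quoted.
quoted = '"We enter when the bouncy castle inflates," Molly said, tracing the fence line.'

for sample in (unquoted, quoted):
    dialogue, total, ratio = dialogue_ratio(sample)
    deviation = abs(ratio - TARGET_RATIO)
    print(f"{dialogue}/{total} words of dialogue, ratio {ratio:.2%}, deviation {deviation:.2%}")
```

With no quoted words, the ratio is 0% and the deviation from the 50% target is 50 percentage points, matching the Details line for this evaluator.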
Overall: 49.6885%