NC Bench
Run: 2026-02-16-01KHK1FXD4FH8MHASNBW4XNR9D
Run Details
Model: Qwen 3.5 397B A17B
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 4m 33s
Cost: $0.039584
Provider: Alibaba
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the night-vision goggles, squinting at the bouncy castle inflated in the driveway. It looked like a purple monster ready to swallow them whole. Bill sat beside her, crunching loudly on a carrot stick. The van smelled of stale coffee and desperation. They were professional criminals, yet here they were, targeting a six-year-old's birthday bash. The intelligence suggested high-value loot inside the gift tent. LEGO sets and cash envelopes waited within the flimsy white canvas. Street lights flickered above the silent suburban neighborhood. No one was watching. Shadows stretched long across the manicured grass. A tricycle lay abandoned near the garage. The silence was heavy, broken only by the distant hum of a lawnmower. Bill swallowed hard. Are we really doing this? It feels wrong targeting kids. My cousin has a kid. What if it is him? Molly didn't look away. The target is vulnerable. Parents are distracted by punch. Security is non-existent unless you count a golden retriever named Sparky. We need the money, Bill. The rent is due on Tuesday. Besides, the clown is our biggest threat. Bill nodded slowly. It was absurd, but it was their best shot. He picked up the blueprint drawn on a stained napkin. The entry point was through the plastic ball pit. It offered cover and a soft landing. Molly checked her watch. The cake cutting was scheduled for seven PM. That was when the chaos peaked. Everyone would be singing. That was their window. The ink was smudged, but the path was clear. Lines indicated guard patrols, though none existed. Arrows pointed to the treasure zone. It was a simplistic plan for a complex job. They had no backup. Bill sighed. I feel dirty stealing from toddlers. What if they cry? I cannot handle crying children. It breaks my heart. Can we leave the stuffed animals? Molly snapped the goggles down. Her expression was dead serious. We are redistributing wealth, Bill. Those kids have too many toys. We are helping them learn about loss early. 
It is a valuable life lesson. Stop being soft. Grab the sacks. We take everything. Bill looked at the burlap sacks. He worried about the ethics, but the rent was due. He checked his utility belt. It held zip ties and gummy bears. It was the toolkit of a madman. Molly kicked open the van door. The humid night air hit them. They moved silently across the lawn, ducking behind garden gnomes. The house glowed. They reached the ball pit. Molly dove in first. Bill hesitated. He heard the dog bark. He jumped. The balls closed over his head. Plastic spheres pressed against his face. He was blind and trapped. Molly grabbed his ankle and pulled. They were in. The gift tent awaited. Bill whispered. Do you hear that? It sounds like running. Cold plastic dug into his skin. He struggled to breathe in the confined space. Darkness was absolute. He trusted Molly implicitly. She dragged him forward. The canvas flap lifted. Light spilled in. Molly froze. It was the sound of tiny feet running toward them. We are compromised. Run.
created_at: "2026-02-16T11:00:48.403917+00:00" (2/16/2026, 11:00:48 AM)
model: "qwen/qwen3.5-397b-a17b-20260216"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 839
moderation_latency: (null)
generation_time: 272528
tokens_prompt: 77
tokens_completion: 9893
native_tokens_prompt: 76
native_tokens_completion: 10983
native_tokens_completion_images: (null)
native_tokens_reasoning: 10316
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0395844
router: (null)
provider_responses
0
id: "chatcmpl-51635acf-b047-99ff-88d1-fb6d9c410cd0"
status: 200
is_byok: false
latency: 839
endpoint_id: "180484a2-e24b-495a-ba41-b34b680d5a9f"
provider_name: "Alibaba"
model_permaslug: "qwen/qwen3.5-397b-a17b-20260216"
api_type: "completions"
id: "gen-1771239372-o4ig04apvnxDzrwZxywt"
upstream_id: "chatcmpl-51635acf-b047-99ff-88d1-fb6d9c410cd0"
total_cost: 0.0395844
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
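The raw metadata above can be turned into a few more readable figures. The following is a minimal sketch, not part of the benchmark itself. It assumes that generation_time is reported in milliseconds and that native_tokens_reasoning is counted inside native_tokens_completion; neither assumption is confirmed by the report. The helper name summarize_run is hypothetical.

```python
def summarize_run(meta: dict) -> dict:
    """Derive readable figures from the run metadata (illustrative only)."""
    # Assumption: generation_time is in milliseconds.
    seconds = meta["generation_time"] / 1000
    minutes, rem = divmod(round(seconds), 60)

    completion = meta["native_tokens_completion"]
    # Assumption: reasoning tokens are a subset of completion tokens.
    reasoning = meta["native_tokens_reasoning"]

    return {
        "duration": f"{minutes}m {rem}s",
        "visible_tokens": completion - reasoning,
        "reasoning_share": reasoning / completion,
        "tokens_per_second": completion / seconds,
    }


# Values copied from the metadata listing for this run.
meta = {
    "generation_time": 272528,
    "native_tokens_completion": 10983,
    "native_tokens_reasoning": 10316,
}
summary = summarize_run(meta)
print(summary)
```

Under these assumptions the derived duration agrees with the 4m 33s shown in the run details, and most of the completion tokens are attributed to reasoning rather than to the visible 521-word scene.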
Evaluation details
Result | Evaluator | Details | Meta Data
14.3014% | Matches word count | n/a | neededClean: false; words: 521
0.0000% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 30.00% | neededClean: false; wordsTotal: 521; wordsDialogue: 0
7.1507%
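The dialogue-ratio figures above can be illustrated with a short sketch. This is not the evaluator's actual code, which the report does not show. It assumes that dialogue is detected as text enclosed in double quotation marks (straight or curly) and that the deviation is the absolute difference from the 30% target. The names dialogue_ratio and TARGET_RATIO are hypothetical. Under that assumption, a response that writes speech without quotation marks, as the scene in this run does, would yield zero dialogue words.

```python
import re

TARGET_RATIO = 0.30  # from the scenario: 30% of words should be dialogue


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (total words, dialogue words, ratio) for a piece of prose.

    Assumption: only text inside double quotation marks counts as dialogue.
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r'["“]([^"”]*)["”]', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = words_dialogue / words_total if words_total else 0.0
    return words_total, words_dialogue, ratio


# The same sentence with and without quotation marks around the speech.
unquoted = "Bill whispered. Do you hear that?"
quoted = 'Bill whispered. "Do you hear that?"'

total_u, dialogue_u, ratio_u = dialogue_ratio(unquoted)
total_q, dialogue_q, ratio_q = dialogue_ratio(quoted)
deviation_u = abs(ratio_u - TARGET_RATIO)

# The final percentage in the report equals the mean of the two evaluator results.
results = [14.3014, 0.0]
mean_result = sum(results) / len(results)

print(total_u, dialogue_u, ratio_u, deviation_u)
print(total_q, dialogue_q, ratio_q)
print(round(mean_result, 4))
```

A quote-based detector is only one possible design. An evaluator that also recognised unquoted speech would need some other signal, such as dialogue tags or a classifier, and would likely score this response differently.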