NC Bench
Run: 2026-02-16-01KHK18E14S63EWTGKY5P6BD2D
Run Details
Model:
Qwen 3.5 397B A17B
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
2m 0s
Cost:
$0.016739
Provider:
Alibaba
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly peered through the binoculars at the suburban house illuminated by colorful string lights. She lowered the glasses and tapped the dashboard. Is the perimeter clear? Bill asked. It is worse than clear. It is a trap. There are bounce houses everywhere. Bounce houses are good cover. You jump over them easily. You cannot jump over an inflatable castle, Bill. It is physics. Bill unfolded a napkin with a floor plan drawn in crayon. The security system is basic. Motion sensors on the windows only. What about the guardians? Molly asked. Two adults. One is wearing a hat that says Best Dad. He looks tired. Tired means sloppy. We go in at cake time. Everyone sings. Distraction is maximum. We are stealing from children. She wiped grease from her cheek. We are redistributing wealth. Those Lego sets are taxed heavily at the store. They are wrapped in paper with dinosaurs on it. You cannot fence dinosaur paper. We take the boxes only. We leave the wrapping. That doubles our exit time. What about the big items? Bill asked. No bikes. Too loud. We take the handheld consoles. High value, low volume. Exactly. Risk versus reward. Did you bring the tranquilizers? Molly opened the glove box and revealed a water pistol. Only for the parents. What about the kids? We tell them we are the entertainment. The Magician and his Assistant. I am not wearing a top hat. You wear the vest. I wear the hat. This is insane. We could hit a bank. Banks have lasers. Kids have juice boxes. Bill leaned forward. Think about the logistics. No alarms trigger until morning when the parents see the empty space under the tree. By then we are in Mexico. Exactly. We sell the consoles in Cancun. Molly sighed and checked her watch. What is the signal? When the clown blows the whistle. That is the signal to run. No, that is the signal to engage. I am engaging my legs in the opposite direction. She grabbed the door handle. Wait. I forgot the bag. What bag? The sack. Santa style. It is thematic. 
You want to break into a house dressed as Santa to rob a seven-year-old? It is psychological warfare. They expect gifts. They get emptiness. Molly stared at him. You are a monster. I am a professional. Hand me the sack. Molly tossed a burlap bag from the back seat. It smelled like potatoes. It smelled like potential. He opened the van door. Ready? Molly checked the water pistol again. If I see a pony, I am leaving. No ponies on the list. Good. Let us go steal some toys. Bill stepped out into the night. Do not shoot the clown. Only if he shoots first. He has a balloon animal sword. That is a weapon. Just get the loot. Molly stepped out beside him. I hate children. They love you. You have a warm face. My face is cold. Like my heart. Then why did you bring the water pistol? For hydration. Bill laughed. Let us move. They crept toward the lawn. Wait, Bill whispered. What now? I hear music. It is Baby Shark. God help us. Move faster. They reached the hedge. Ready to breach? Ready. Go.
created_at
"2026-02-16T10:54:07.111678+00:00"
2/16/2026, 10:54:07 AM
model
"qwen/qwen3.5-397b-a17b-20260216"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
802
moderation_latency
(null)
generation_time
119537
tokens_prompt
77
tokens_completion
5064
native_tokens_prompt
76
native_tokens_completion
4637
native_tokens_completion_images
(null)
native_tokens_reasoning
3877
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0167388
router
(null)
provider_responses
0
id
"chatcmpl-19ed9a6b-0357-9ead-923f-b4da3459fb3b"
status
200
is_byok
false
latency
802
endpoint_id
"180484a2-e24b-495a-ba41-b34b680d5a9f"
provider_name
"Alibaba"
model_permaslug
"qwen/qwen3.5-397b-a17b-20260216"
api_type
"completions"
id
"gen-1771239127-zP1r5BrTXtcd8xymeHAV"
upstream_id
"chatcmpl-19ed9a6b-0357-9ead-923f-b4da3459fb3b"
total_cost
0.0167388
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
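The metadata above supports two quick sanity checks, sketched here in Python. The helper names `format_duration` and `visible_completion_tokens` are hypothetical and not part of NC Bench. The sketch also assumes that `native_tokens_completion` includes the reasoning tokens, which this page does not confirm.

```python
def format_duration(ms: int) -> str:
    """Format milliseconds as 'Xm Ys', rounding to whole seconds first.

    Rounding before splitting into minutes and seconds avoids
    impossible values such as '1m 60s'.
    """
    total_seconds = round(ms / 1000)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


def visible_completion_tokens(native_completion: int, native_reasoning: int) -> int:
    """Completion tokens left after subtracting reasoning tokens.

    Assumption: the completion count includes the reasoning tokens.
    """
    return native_completion - native_reasoning


# Values copied from the run metadata above.
generation_time_ms = 119537
native_tokens_completion = 4637
native_tokens_reasoning = 3877

print(format_duration(generation_time_ms))  # 2m 0s
print(visible_completion_tokens(native_tokens_completion, native_tokens_reasoning))  # 760
```

Under that assumption, most of the completion budget went to reasoning and roughly 760 native tokens remained for the scene itself.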
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 542
0.0000% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 70.00% | neededClean: false; wordsTotal: 542; wordsDialogue: 0
0.0000%
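The dialogue ratio reported above can be illustrated with a minimal Python sketch. The page does not say how the evaluator detects dialogue. The sketch assumes it counts words enclosed in straight or curly double quotation marks, which would be consistent with `wordsDialogue` being 0 for a response that contains no quotation marks. The names `dialogue_ratio` and `deviation` are hypothetical.

```python
import re

# Assumption: dialogue is any text between straight or curly double quotes.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (total words, words inside quotes, dialogue/total ratio)."""
    words_total = len(text.split())
    words_dialogue = sum(
        len((m.group(1) or m.group(2) or "").split())
        for m in QUOTED.finditer(text)
    )
    ratio = words_dialogue / words_total if words_total else 0.0
    return words_total, words_dialogue, ratio


def deviation(ratio: float, target: float = 0.70) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# The same two sentences, without and with quotation marks.
unquoted = "Is the perimeter clear? Bill asked. It is worse than clear."
quoted = '"Is the perimeter clear?" Bill asked. "It is worse than clear."'

print(dialogue_ratio(unquoted))  # (11, 0, 0.0)
print(dialogue_ratio(quoted)[:2])  # (11, 9)
print(deviation(dialogue_ratio(unquoted)[2]))  # 0.7
```

With this counting rule, unquoted speech scores a ratio of 0 and a deviation of 0.70 from the 70% target, matching the figures shown in the evaluation details.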