NC Bench
Run: 2026-02-16-01KHK1SS0WK3YAPQ7R34D77GR4
Run Details
Model:
Qwen 3.5 397B A17B
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
2m 13s
Cost:
$0.016591
Provider:
Alibaba
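The Duration and Cost shown above appear to be rounded from the raw metadata fields recorded further down this page (generation_time, total_cost). The sketch below is a quick sanity check of that reading, and also splits the completion tokens into reasoning and visible output. It rests on two assumptions not confirmed by the page: that generation_time is in milliseconds, and that native_tokens_completion includes native_tokens_reasoning. The helper name format_duration is hypothetical.

```python
# Values copied from this run's metadata.
generation_time_ms = 132_793       # assumed to be milliseconds
native_tokens_completion = 4_596   # assumed to include reasoning tokens
native_tokens_reasoning = 3_939
total_cost_usd = 0.0165912


def format_duration(ms: int) -> str:
    """Render a millisecond count as 'Xm Ys', rounded to whole seconds."""
    minutes, seconds = divmod(round(ms / 1000), 60)
    return f"{minutes}m {seconds}s"


# Tokens left for the visible scene once reasoning is subtracted.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

print(format_duration(generation_time_ms))   # compare with "Duration"
print(f"${total_cost_usd:.6f}")              # compare with "Cost"
print(visible_tokens, f"{reasoning_share:.1%}")
```

Under these assumptions, most of the completion budget went to reasoning rather than to the roughly 440-word scene itself.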
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the blueprints spread across the dusty dashboard. Bill chewed on a fingernail, staring nervously at the colorful venue across the street. "So the primary target is definitely the Piñata Room," Molly said. "It's a birthday party, Mol. For a four-year-old named Timothy," Bill replied. "High value loot regardless of the host. Lego sets, cash envelopes, unused tablets. Do you know what a Switch costs on the black market?" "It costs conscience points. We're robbing toddlers who believe in magic." "They won't remember a thing. Trauma repression is real. Focus on the exit strategy." Bill sighed heavily and leaned back against the worn vinyl seat. He looked out the window at the inflatable bouncing castle. "The security is intense. There is a clown guarding the door." "Clowns are unpredictable. I will handle the clown. You grab the gift table." "What about the parents? They have eyes everywhere." "Distraction protocol. I release the helium balloons during the cake cutting. Chaos ensues." "You want to weaponize party favors?" "It is classic misdirection. Now, check the gear." Bill opened the duffel bag and pulled out a balaclava knitted with pastel yarn. "Why is this pink?" "Camouflage. We blend in with the decor." "I look like a Teletubby with commitment issues." "Stop complaining. Put on the mask. We move in ten." "Wait, what if there is a magician?" "Magicians are amateurs. Sleight of hand versus actual theft. We win." "I don't feel good about this. The ethics are skewed." "The ethics pay the rent. Remember the baptism last month?" "That was silverware. This is Play-Doh and dreams." "Dreams don't pay for your bail. Grab the sack." Bill lifted the large canvas bag. It had a duck printed on the side. "This says quack." "It's ironic. Less suspicious than a gunny sack." "If I go to jail, I'm telling the judge you made me do it." "Standard procedure. Now move. The entertainer is starting the bubble show." "Bubbles obscure vision. That helps us." "Exactly. 
You're getting the hang of it." "I still think we should rob a bank." "Banks have alarms. Kids have sugar highs. Which is more dangerous?" "Sugar highs cause hysteria. Alarms cause police." "Exactly. You're getting the hang of it." Bill paused. He watched a child scream over a dropped ice cream. "The kids." "Trust me. In and out. No crying babies." "That's the only rule?" "That's the only rule. Let's go make some noise." "Are we ready?" "No." "Too bad. Do we have the getaway car?" "It's a minivan." "Perfect." Bill stepped out of the van. He adjusted the pink mask. Molly checked her watch. "Showtime," she whispered.
created_at
"2026-02-16T11:03:48.57727+00:00"
2/16/2026, 11:03:48 AM
model
"qwen/qwen3.5-397b-a17b-20260216"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
779
moderation_latency
(null)
generation_time
132793
tokens_prompt
77
tokens_completion
4879
native_tokens_prompt
76
native_tokens_completion
4596
native_tokens_completion_images
(null)
native_tokens_reasoning
3939
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0165912
router
(null)
provider_responses
0
id
"chatcmpl-233547b0-2cf5-9a11-88db-89c7b5caae5f"
status
200
is_byok
false
latency
779
endpoint_id
"180484a2-e24b-495a-ba41-b34b680d5a9f"
provider_name
"Alibaba"
model_permaslug
"qwen/qwen3.5-397b-a17b-20260216"
api_type
"completions"
id
"gen-1771239695-Jmbpenv2FzfdWWsDn0Gf"
upstream_id
"chatcmpl-233547b0-2cf5-9a11-88db-89c7b5caae5f"
total_cost
0.0165912
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 441
14.7180% | Dialogue to Total Word Ratio | Ratio: 76.19%, Deviation: 6.19% | neededClean: false; wordsTotal: 441; wordsDialogue: 336
7.3590%
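The reported ratio and deviation follow directly from the word counts: 336 dialogue words out of 441 total, measured against the 70% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. The function name dialogue_stats and the counting rule (whitespace-separated words inside straight double quotes) are assumptions, not the benchmark's actual evaluator, and the sketch makes no attempt to derive the percentage scores in the Result column.

```python
import re


def dialogue_stats(text: str, target: float = 0.70) -> dict:
    """Count whitespace-separated words inside double quotes vs. all words."""
    words_total = len(text.split())
    # Assumption: dialogue is any span enclosed in straight double quotes.
    spoken = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in spoken)
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target),
    }


# Reproduce the reported figures from the counts in the Meta Data column.
reported_ratio = 336 / 441
reported_deviation = abs(reported_ratio - 0.70)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {reported_deviation:.2%}")

# Small usage example on a two-line exchange.
sample = 'Bill sighed. "This says quack." "It is ironic."'
print(dialogue_stats(sample))
```

A quote-based counter like this would miscount text with curly quotes or unbalanced quotation marks, so a real evaluator may normalize the text first.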