NC Bench
Run: 2026-02-16-01KHK2ER3FYVZ3K9DGRK6Y7E06
Run Details
Model: Qwen 3.5 397B A17B
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 3m 31s
Cost: $0.028979
Provider: Alibaba
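The duration listed in the run details can be cross-checked against the raw generation metadata recorded for this run (generation_time, native_tokens_completion, native_tokens_reasoning). The sketch below is only an illustration: it assumes generation_time is expressed in milliseconds and that the native completion count includes the reasoning tokens, neither of which is stated on this page. The helper name summarize_run is hypothetical.

```python
def summarize_run(meta: dict) -> dict:
    """Derive a few summary figures from a run's metadata fields."""
    # Assumption: generation_time is in milliseconds.
    gen_seconds = meta["generation_time"] / 1000.0
    completion = meta["native_tokens_completion"]
    reasoning = meta["native_tokens_reasoning"]

    minutes, seconds = divmod(round(gen_seconds), 60)
    return {
        "duration": f"{minutes}m {seconds}s",
        "tokens_per_second": completion / gen_seconds,
        # Assumption: reasoning tokens are counted inside the completion total.
        "reasoning_share": reasoning / completion,
        "visible_tokens": completion - reasoning,
    }


# Values copied from this run's metadata.
meta = {
    "generation_time": 210674,
    "native_tokens_completion": 8037,
    "native_tokens_reasoning": 7271,
}

summary = summarize_run(meta)
print(summary["duration"])                     # 3m 31s
print(f"{summary['tokens_per_second']:.2f}")   # 38.15
print(f"{summary['reasoning_share']:.1%}")     # 90.5%
print(summary["visible_tokens"])               # 766
```

Under these assumptions, roughly nine in ten completion tokens went to reasoning, which would account for a generation time of over three minutes for a scene of about 550 words.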
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly lowered the binoculars and turned to face Bill in the cramped surveillance van. The interior smelled distinctly of stale coffee and poor life choices made over the last decade. "The target is secured and completely vulnerable," Molly said. "It is literally a backyard with crepe paper streamers," Bill replied. "It is a vault of unwrapped potential and high retail value," Molly said. Bill sighed heavily and shifted his weight in the worn passenger seat. He adjusted his tie. "They are six-year-olds, Molly. They cry when you take their crayons." "They cry when you take their iPads too. That is exactly why we wear masks," Molly said. "We are wearing red plastic clown masks. That seems excessive for a suburban theft," Bill said. "It is camouflage. Nobody notices a clown at a party until it is far too late," Molly said. Bill looked at the grinning plastic face resting on the dirty dashboard. He touched the nose. "What about the parents? They have eyes everywhere and smartphones," Bill said. "Parents are distracted by cheap wine and pretending to care about magic shows," Molly said. "There is a pony out there in the grass," Bill said. "Excellent. Distraction acquired and verified by intel," Molly said. "The pony might kick us in the shins hard," Bill said. "Then we sedate the pony. The budget allows for tranquilizers," Molly said. Bill rubbed his temples slowly. He really needed a new partner for these jobs. Molly checked her watch. The timing had to be perfect. "What is the haul estimate looking like today?" Bill asked. "Legos, tablets, cash envelopes, and a disproportionate amount of novelty socks," Molly said. "Socks?" Bill asked. "High thread count. Black market value is surprisingly high," Molly said. "I feel physically sick about this whole plan," Bill said. "Focus on the prize. Think of the respiratory masks," Molly said. "Those are specifically for kids with asthma," Bill said. "Everyone needs protection in this economy," Molly said. 
Bill looked out the window at the colorful streamers flapping in the wind. "There is a magician over by the cake table," Bill said. "Threat level?" Molly asked. "He pulls rabbits out of hats. Could be concealed weapons," Bill said. "We move during the cake cutting. Chaos is our greatest ally," Molly said. "Singing makes people emotional. They will not notice the theft," Molly added. "What if a child sees us stealing the goods?" Bill asked. "Tell them we are the entertainment. The mime duo," Molly said. "I cannot mime for anything," Bill said. "Then scream silently. It sells the bit," Molly said. Bill picked up the clown mask and stared into the hollow painted eyes. Molly tightened her gloves. "We are bad people," Bill said. "We are professionals targeting a high-value demographic," Molly said. "They still believe in the tooth fairy," Bill said. "And we believe in liquidity," Molly said. Bill checked the lock on the van door. A balloon popped somewhere in the distance loudly. The engine hummed quietly as the song Happy Birthday drifted over the fence. "Is the car running?" Bill asked. "It is idling. We leave soon," Molly said. "What if cake is delayed?" Bill asked. "Then we eat the cake," Molly said. "Ready the bags," Molly said. "I want the Lego Batman set personally," Bill said. "Get in line," Molly said.
created_at: "2026-02-16T11:16:33.877803+00:00" (2/16/2026, 11:16:33 AM)
model: "qwen/qwen3.5-397b-a17b-20260216"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 826
moderation_latency: (null)
generation_time: 210674
tokens_prompt: 77
tokens_completion: 8029
native_tokens_prompt: 76
native_tokens_completion: 8037
native_tokens_completion_images: (null)
native_tokens_reasoning: 7271
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0289788
router: (null)
provider_responses: 0
id: "chatcmpl-23dbedbc-1bdc-9238-9db0-5f57b372d25f"
status: 200
is_byok: false
latency: 826
endpoint_id: "180484a2-e24b-495a-ba41-b34b680d5a9f"
provider_name: "Alibaba"
model_permaslug: "qwen/qwen3.5-397b-a17b-20260216"
api_type: "completions"
id: "gen-1771240382-8XaDtw60X3ynV75jnYku"
upstream_id: "chatcmpl-23dbedbc-1bdc-9238-9db0-5f57b372d25f"
total_cost: 0.0289788
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 556
0.4175% | Dialogue to Total Word Ratio | Ratio: 59.53%, Deviation: 10.47% | neededClean: false; wordsTotal: 556; wordsDialogue: 331
0.2088%
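The ratio and deviation reported for the dialogue evaluator follow directly from the word counts: 331 / 556 is about 59.53%, and the gap to the 70% target is 10.47 percentage points. The sketch below shows one way such counts could be produced, assuming dialogue means text enclosed in straight double quotes and words are whitespace-separated tokens. The benchmark's actual evaluator is not shown on this page and may count differently; the function names dialogue_ratio and deviation_from_target are hypothetical.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a passage.

    Assumption: dialogue is whatever sits between straight double quotes.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation_from_target(dialogue_words: int, total_words: int,
                          target: float = 0.70) -> float:
    """Absolute gap between the observed dialogue ratio and the target."""
    return abs(dialogue_words / total_words - target)


# A tiny illustrative passage (not the full model response).
sample = 'Bill sighed. "They are six-year-olds, Molly," he said.'
sample_dialogue, sample_total, sample_ratio = dialogue_ratio(sample)
print(sample_dialogue, sample_total, sample_ratio)  # 4 8 0.5

# Reproducing the reported figures from the counts in the table.
reported_ratio = 331 / 556
reported_deviation = deviation_from_target(331, 556)
print(f"{reported_ratio:.2%}")      # 59.53%
print(f"{reported_deviation:.2%}")  # 10.47%
```

Because every line of dialogue in the response carries a tag such as "Molly said" or "Bill asked", a sizeable share of the total word count falls outside the quotes, which is consistent with the ratio landing below the 70% target.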