NC Bench
Run: 2026-02-16-01KHK2HDYEK7T35B4CR07Z4SFR
Run Details
Model: Qwen 3.5 397B A17B
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 5m 25s
Cost: $0.048700
Provider: Alibaba
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Wait, they're six," Bill said, peering through binoculars. "Exactly. Zero security awareness," Molly replied, checking set. "They scream if you touch their balloons." "We wear clown suits. Camouflage. Perfect." "I'm not wearing a nose, Mol. It chafes." "Fine. Just the wig. And the shoes." "What about the parents? They watch everything." "Drunk on punch. Spiked apple juice. Trust me." "This feels wrong. Literally infants." "It's LEGO, Bill. Pure liquid asset. Untraceable bricks." "There's a bouncy castle. Big Hazard." "Distraction. You jump, I grab. Simple mechanics." "I hate bouncing. My knees are shot." "Think of the retirement fund. Need equity." "Stolen Play-Doh isn't a pension plan, Molly." "It's the limited edition set. Rare value." "Why are we like this?" "Because rent is due. Landlord is impatient." "Plan B? If the clowns fail." "Magic show. Saw a rabbit in a hat." "We're stealing a rabbit? That's animal trafficking." "No, we steal the gifts while they watch the rabbit." "That's clever. Disgusting, but clever." "I know. Grab the bag. Check straps." "Do we have to sing Happy Birthday? I really deaf." "Only if they ask. Stay silent." "Ready?" Bill asked, zipping up duffel. "Born ready," Molly said, pulling greasepaint. "Let's rob some toddlers."
created_at
"2026-02-16T11:19:55.886924+00:00"
model
"qwen/qwen3.5-397b-a17b-20260216"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
787
moderation_latency
(null)
generation_time
324832
tokens_prompt
77
tokens_completion
8644
native_tokens_prompt
76
native_tokens_completion
13515
native_tokens_completion_images
(null)
native_tokens_reasoning
13164
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0486996
router
(null)
provider_responses
0
id
"chatcmpl-905b3e2c-4765-9cfe-aff5-9d1fe6b2b5a7"
status
200
is_byok
false
latency
787
endpoint_id
"180484a2-e24b-495a-ba41-b34b680d5a9f"
provider_name
"Alibaba"
model_permaslug
"qwen/qwen3.5-397b-a17b-20260216"
api_type
"completions"
id
"gen-1771240470-Nm5uVsAEmwr0NdIcKfgR"
upstream_id
"chatcmpl-905b3e2c-4765-9cfe-aff5-9d1fe6b2b5a7"
total_cost
0.0486996
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
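The timing and token fields above can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming that generation_time is reported in milliseconds and that native_tokens_reasoning is a subset of native_tokens_completion; neither assumption is stated on this page, and the function names are hypothetical.

```python
# Values copied from the metadata above.
GENERATION_TIME_MS = 324832          # assumed to be milliseconds
NATIVE_TOKENS_COMPLETION = 13515
NATIVE_TOKENS_REASONING = 13164      # assumed to be part of the completion count


def format_duration(ms: int) -> str:
    """Render a millisecond count as 'Xm Ys', rounded to the nearest second."""
    minutes, seconds = divmod(round(ms / 1000), 60)
    return f"{minutes}m {seconds}s"


def tokens_per_second(tokens: int, ms: int) -> float:
    """Average completion throughput over the whole generation."""
    return tokens / (ms / 1000)


def reasoning_share(reasoning: int, completion: int) -> float:
    """Fraction of completion tokens spent on reasoning."""
    return reasoning / completion


print(format_duration(GENERATION_TIME_MS))
print(f"{tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS):.1f} tokens/s")
print(f"{reasoning_share(NATIVE_TOKENS_REASONING, NATIVE_TOKENS_COMPLETION):.1%} reasoning")
```

Under these assumptions the rounded generation time matches the Duration of 5m 25s shown in Run Details, and most of the completion tokens were reasoning rather than the roughly 200-word scene itself.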
Evaluation details
Result | Evaluator | Details | Meta Data
99.9990% | Matches word count | n/a | neededClean: false, words: 201
99.9881% | Dialogue to Total Word Ratio | Ratio: 91.04%, Deviation: 1.04% | neededClean: false, wordsTotal: 201, wordsDialogue: 183
99.9935%
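The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator follow directly from wordsDialogue and wordsTotal. The sketch below reproduces that arithmetic and shows one possible way to count dialogue words. It is not the evaluator's actual implementation: the counting rule (whitespace-separated words inside straight double quotes) and the function names are assumptions, and how the deviation is converted into the percentage score is not shown on this page, so it is left out.

```python
import re


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words) for a scene.

    Assumption: dialogue is whatever sits between straight double quotes,
    and a word is any whitespace-separated token.
    """
    total = len(text.split())
    spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in spans)
    return dialogue, total


def ratio_and_deviation(dialogue: int, total: int, target: float = 0.90) -> tuple[float, float]:
    """Dialogue share of all words, and its absolute distance from the target."""
    ratio = dialogue / total
    return ratio, abs(ratio - target)


# Counts taken from the evaluation metadata above.
ratio, deviation = ratio_and_deviation(183, 201)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 91.04%, Deviation: 1.04%"
```

The 90% target comes from the scenario description; a different counting rule (for example, treating hyphenated words or curly quotes differently) could yield slightly different counts than the evaluator's.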