NC Bench
Run: 2026-02-16-01KHK30ZGMBHSXNDQGNSZMD883
Run Details
Model: Qwen 3.5 397B A17B
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 5m 31s
Cost: $0.044038
Provider: Alibaba
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly crouched behind the inflatable bouncy castle, adjusting her night vision goggles while panting. The venue hummed with energy of six year olds sugar rushing on cheap cake. Bill checked his watch, wiping sweat from his forehead with a gloved hand. This was their biggest job, though the demographic felt morally questionable to professionals. He nodded toward the gift table, stacked high with wrapped boxes promising Lego sets and cash envelopes in paper. "Are we really stealing from toddlers?" Bill asked quietly. "It feels wrong." Molly squinted through the window. A painted menace juggled balloons near the loot. They needed a distraction immediately. Bill pulled out synthetic dog treats, planning to unleash chaos among parents' pets outside. Molly tightened her grip on the lockpick set. The security was lax, mostly terrified mothers watching toddlers run wild. She signaled the go ahead. Bill moved like a shadow, creeping toward the side entrance. The moral weight of stealing from children sat heavy, but rent was due. They slipped inside, ready to swap presents for bricks of soap. The mission was absurd, yet necessary for survival. "Just take the Lego," Molly said firmly to her partner. "Leave the cash for the parents alone."
created_at: "2026-02-16T11:28:31.222228+00:00" (2/16/2026, 11:28:31 AM)
model: "qwen/qwen3.5-397b-a17b-20260216"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 834
moderation_latency: (null)
generation_time: 331084
tokens_prompt: 77
tokens_completion: 9675
native_tokens_prompt: 76
native_tokens_completion: 12220
native_tokens_completion_images: (null)
native_tokens_reasoning: 11964
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0440376
router: (null)
provider_responses:
  0:
    id: "chatcmpl-e8852740-a46d-9097-a8f1-d51b383de684"
    status: 200
    is_byok: false
    latency: 834
    endpoint_id: "180484a2-e24b-495a-ba41-b34b680d5a9f"
    provider_name: "Alibaba"
    model_permaslug: "qwen/qwen3.5-397b-a17b-20260216"
api_type: "completions"
id: "gen-1771240979-omKeWBR3AdtgrVdXv62B"
upstream_id: "chatcmpl-e8852740-a46d-9097-a8f1-d51b383de684"
total_cost: 0.0440376
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
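The metadata values can be cross-checked against the Run Details summary with a little arithmetic. The sketch below is illustrative only and rests on two assumptions that the page does not state: that generation_time is reported in milliseconds, and that native_tokens_reasoning is counted inside native_tokens_completion. The helper name format_duration is hypothetical.

```python
# Values copied from the run metadata above.
generation_time_ms = 331084        # ASSUMPTION: unit is milliseconds
native_tokens_completion = 12220
native_tokens_reasoning = 11964    # ASSUMPTION: included in the completion count
usage = 0.0440376


def format_duration(ms: int) -> str:
    """Render a millisecond count as 'Xm Ys', dropping the fractional second."""
    minutes, seconds = divmod(ms // 1000, 60)
    return f"{minutes}m {seconds}s"


duration = format_duration(generation_time_ms)
# Tokens left over for the visible answer once reasoning is subtracted.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion
cost_label = f"${usage:.6f}"

print(duration)                  # 5m 31s
print(visible_tokens)            # 256
print(f"{reasoning_share:.1%}")  # 97.9%
print(cost_label)                # $0.044038
```

Under those assumptions, the duration and cost shown in Run Details agree with the raw fields, and roughly 98% of the native completion tokens went to reasoning rather than to the 200-word scene.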
Evaluation details
Result | Evaluator | Details | Meta Data
100.0000% | Matches word count | n/a | neededClean: false; words: 200
99.9494% | Dialogue to Total Word Ratio | Ratio: 11.50%, Deviation: 1.50% | neededClean: false; wordsTotal: 200; wordsDialogue: 23
99.9747%
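The ratio and deviation reported by the dialogue evaluator follow directly from the word counts in its metadata. The sketch below reproduces that arithmetic, taking the 10% target from the scenario text. It does not attempt to reproduce how the evaluator counts dialogue words or how it maps a deviation to the 99.9494% result, since the page documents neither. The last step only observes that the final unlabeled figure is consistent with a plain average of the two evaluator results, which is an inference rather than documented behavior.

```python
TARGET_RATIO = 0.10   # from the scenario: "Write 200 words with 10% dialogue"
words_total = 200     # wordsTotal reported by the evaluator
words_dialogue = 23   # wordsDialogue reported by the evaluator

# Share of dialogue words and its absolute distance from the target.
ratio = words_dialogue / words_total
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 11.50%, Deviation: 1.50%

# ASSUMPTION: the final figure is the unweighted mean of the evaluator results.
evaluator_results = [100.0000, 99.9494]
mean_result = sum(evaluator_results) / len(evaluator_results)
print(f"{mean_result:.4f}%")
# 99.9747%
```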