NC Bench
Run: 2026-02-16-01KHK30PKFMSVNM8BZZHQAMK4Y
Run Details
Model:
Qwen 3.5 Plus (2026-02-15)
Test:
Dialogue tags
Scenario:
Write 200 words with 90% dialogue
Duration:
12.5s
Cost:
$0.000842
Provider:
Alibaba
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Bill, are you sure this is the right house?" Molly whispered, peering over the fence. "It has a bounce castle." "It's the target, Mols. Look at that mountain of wrapping paper. Pure gold," Bill replied, adjusting his ski mask. "Gold? It's probably just socks and Lego sets. We'll be rich in plastic bricks." "Don't be cynical. Those limited edition action figures fetch a fortune on the dark web. Now, what's the entry plan?" "We wait for the clown to leave, then slip in through the doggy door." "A doggy door? Really? I'm a professional cat burglar, not a golden retriever." "Desperate times, Bill. Besides, the security system is laughable. Just a motion sensor near the piñata." "What if the kids wake up from their sugar crash?" "They won't. We move during the cake cutting. Maximum distraction. Who looks at the presents when there's frosting involved?" "Good point. But if I get stuck in that tiny door, I'm blaming you." "Just suck in your gut. Hey, do you think they have those mini quiches?" "Focus, Bill! We are here for the loot, not the appetizers." "Fine. But if I see a quiche, it's coming with us." "Deal. Now, on my signal. Three, two, one... let's steal some birthdays." "Wait, did you bring the sack?" "I brought the reusable shopping bag. It's eco-friendly crime." "Perfect. Let's go."
created_at
"2026-02-16T11:23:04.357503+00:00"
2/16/2026, 11:23:04 AM
model
"qwen/qwen3.5-plus-20260216"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
873
moderation_latency
(null)
generation_time
12497
tokens_prompt
77
tokens_completion
340
native_tokens_prompt
78
native_tokens_completion
338
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0008424
router
(null)
provider_responses
0
id
"chatcmpl-4e4099b1-4cb7-98c4-8886-c606e936cc4d"
status
200
is_byok
false
latency
873
endpoint_id
"71e0f6c3-7815-4734-92d3-8a04f3b0ca91"
provider_name
"Alibaba"
model_permaslug
"qwen/qwen3.5-plus-20260216"
api_type
"completions"
id
"gen-1771240970-sP2yfv8MROMtpLKSuTw7"
upstream_id
"chatcmpl-4e4099b1-4cb7-98c4-8886-c606e936cc4d"
total_cost
0.0008424
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
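The timing fields above can be related to the 12.5s duration shown in Run Details with a little arithmetic. The following is a minimal sketch, assuming `generation_time` and `latency` are reported in milliseconds (the page does not state units) and using `native_tokens_completion` as the token count; the helper name `tokens_per_second` is hypothetical.

```python
# Values copied from the Meta fields of this run.
generation_time_ms = 12497       # assumption: milliseconds
latency_ms = 873                 # assumption: milliseconds
native_tokens_completion = 338


def tokens_per_second(tokens: int, elapsed_ms: float) -> float:
    """Average completion throughput over the given elapsed time."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return tokens / (elapsed_ms / 1000.0)


throughput = tokens_per_second(native_tokens_completion, generation_time_ms)

print(f"duration: {generation_time_ms / 1000:.1f}s")          # duration: 12.5s
print(f"throughput: {throughput:.1f} completion tokens/s")    # throughput: 27.0 completion tokens/s
```

Whether `generation_time` already includes the initial `latency` is not stated on the page, so the throughput figure is only an approximation.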
Evaluation details
Result | Evaluator | Details | Meta Data
1.0360% | Matches word count | n/a | neededClean: false; words: 226
95.2759% | Dialogue to Total Word Ratio | Ratio: 94.69%, Deviation: 4.69% | neededClean: false; wordsTotal: 226; wordsDialogue: 214
48.1559%
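The reported ratio and deviation follow directly from the word counts: 214 dialogue words out of 226 total, measured against the 90% target from the scenario. The sketch below reproduces that arithmetic and shows one possible way to count dialogue words. It assumes dialogue is whatever sits between straight double quotes and that words are whitespace-separated; the benchmark's actual evaluator is not shown on this page and may count differently. The function names `dialogue_ratio` and `deviation_from_target` are hypothetical.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumption: dialogue is the text inside straight double quotes,
    and a word is any whitespace-separated token.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio: float, target: float = 0.90) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Figures reported in the evaluation above.
reported_ratio = 214 / 226
print(f"Ratio: {reported_ratio:.2%}")                             # Ratio: 94.69%
print(f"Deviation: {deviation_from_target(reported_ratio):.2%}")  # Deviation: 4.69%

# Small usage example on a made-up sentence.
print(dialogue_ratio('"Hi there," she said. "Go now."')[:2])      # (4, 6)
```

Counting by quoted spans treats attached punctuation as part of a word, which is one reason a different tokenizer could arrive at slightly different totals.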