NC Bench
Run: 2026-02-16-01KHK1TKQ70CNZ4FC2VE5PX48C
Run Details
Model: Qwen 3.5 Plus (2026-02-15)
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 14.2s
Cost: $0.000840
Provider: Alibaba
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Bill, stop shaking. We are hitting a six-year-old's birthday, not Fort Knox." "Molly, have you seen the security? It's a clown. A literal clown with a balloon sword." "That clown is ex-military. I checked his LinkedIn. Now, the plan: you distract the magician while I hit the gift table." "Distract how? Do I ask him to pull a rabbit out of my hat? I don't have a hat!" "Just trip over the bouncy castle. Make it look tragic." "And the loot? Are we taking the Lego sets or just the cash envelopes?" "Everything, Bill. Even the half-eaten cake pops. Black market value is insane right now." "What if the kids scream? I can't handle screaming kids, Molly. They smell like sugar and judgment." "Then wear the noise-canceling headphones we stole from the tech store last week." "But then I won't hear you signal the getaway!" "I'll flash the emergency strobe light on my watch. Twice for go, once for abort." "Why not just yell 'run'?" "Because subtlety is our brand, Bill! Now, put on the inflatable T-Rex costume. It's the perfect disguise." "I look ridiculous. I'm a professional thief in a dinosaur suit." "You're a professional thief about to be rich in a dinosaur suit. Move out, Rex. The piñata awaits." "Fine. But if I get stuck in the slide, you're coming back for me." "Deal. Now let's go steal some toy cars."
created_at
"2026-02-16T11:02:17.582331+00:00"
2/16/2026, 11:02:17 AM
model
"qwen/qwen3.5-plus-20260216"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
799
moderation_latency
(null)
generation_time
14138
tokens_prompt
77
tokens_completion
342
native_tokens_prompt
78
native_tokens_completion
337
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00084
router
(null)
provider_responses
0
id
"chatcmpl-ab3fc41e-5002-9f0b-b064-da0a7e86bdb7"
status
200
is_byok
false
latency
799
endpoint_id
"71e0f6c3-7815-4734-92d3-8a04f3b0ca91"
provider_name
"Alibaba"
model_permaslug
"qwen/qwen3.5-plus-20260216"
api_type
"completions"
id
"gen-1771239722-77A6PxAlj0ngTiEdCS0V"
upstream_id
"chatcmpl-ab3fc41e-5002-9f0b-b064-da0a7e86bdb7"
total_cost
0.00084
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
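The timing, token, and cost fields above can be combined into two derived figures: generation throughput and a blended cost per 1,000 tokens. The sketch below is illustrative only. It assumes that generation_time is reported in milliseconds (14138 is close to the 14.2s Duration shown in the run details) and that the native token counts are the ones billed; neither assumption is confirmed by the page, and the function names are hypothetical.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Completion tokens generated per second (assumes the time is in milliseconds)."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


def cost_per_1k_tokens(total_cost: float, prompt_tokens: int, completion_tokens: int) -> float:
    """Blended cost per 1,000 tokens across prompt and completion."""
    total_tokens = prompt_tokens + completion_tokens
    if total_tokens <= 0:
        raise ValueError("token counts must sum to a positive number")
    return total_cost / total_tokens * 1000.0


# Values copied from the metadata of this run.
throughput = tokens_per_second(completion_tokens=337, generation_time_ms=14138)
blended = cost_per_1k_tokens(total_cost=0.00084, prompt_tokens=78, completion_tokens=337)

print(f"throughput: {throughput:.1f} tokens/s")
print(f"blended cost: ${blended:.6f} per 1K tokens")
```

Under these assumptions the run generated roughly 24 completion tokens per second. The blended figure mixes prompt and completion pricing, so it should not be read as the provider's list price for either.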
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false, words: 237
36.7879% | Dialogue to Total Word Ratio | Ratio: 100.00%, Deviation: 10.00% | neededClean: false, wordsTotal: 237, wordsDialogue: 237
18.3940%
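The dialogue evaluator reports a ratio of dialogue words to total words and its deviation from the 90% target. A minimal sketch of how such a check might work follows. The page does not show the evaluator's implementation, so everything here is an assumption: dialogue is taken to be any text between straight double quotes, words are split on whitespace, and the names dialogue_stats and decay_score are hypothetical. The scoring curve exp(-deviation / 0.10) is only a guess; it happens to reproduce the reported 36.7879% for a 10% deviation, and the unweighted mean of the two evaluator scores happens to match the final 18.3940% figure.

```python
import math
import re


def dialogue_stats(text: str, target_ratio: float = 0.90) -> dict:
    """Count words inside straight double quotes versus all words in the text."""
    words_total = len(text.split())
    # Assumption: a dialogue span is anything between a pair of straight double quotes.
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


def decay_score(deviation: float, scale: float = 0.10) -> float:
    """Hypothetical scoring curve: 1.0 at zero deviation, decaying exponentially."""
    return math.exp(-deviation / scale)


# Small made-up sample: 8 of its 10 words sit inside quotes.
sample = '"Stop shaking, Bill." Molly sighed. "It is a birthday party."'
stats = dialogue_stats(sample)

# Figures reported for this run: ratio 100%, target 90%, word-count score 0.
run_deviation = abs(1.00 - 0.90)
dialogue_score = decay_score(run_deviation)
overall = (0.0 + dialogue_score) / 2  # assumed unweighted mean of both evaluators

print(stats)
print(f"dialogue score: {dialogue_score:.4%}")
print(f"overall: {overall:.4%}")
```

Because the response contains no narration at all, every one of its 237 words counts as dialogue, which overshoots the 90% target by 10 percentage points. The quote-matching regex does not handle curly quotes or unbalanced quotation marks, which a production evaluator would need to address.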