NC Bench
Run: 2026-02-16-01KHK1A5SNRCMYAMQXNHS4GM8S
Run Details
Model: Qwen 3.5 Plus (2026-02-15)
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 13.1s
Cost: $0.000701
Provider: Alibaba
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her neon clown wig, the synthetic fibers itching violently against her scalp as she peered through the binoculars at the suburban fortress. The target was a six-year-old's birthday bash, a chaotic nexus of sugar-rushed toddlers and exhausted parents guarding a mountain of wrapped treasures. Bill stood beside her, struggling to zip up his oversized inflatable T-Rex costume, the plastic tail dragging uselessly in the damp grass. They needed a distraction capable of clearing the room without alerting the neighbors or triggering any latent parental rage. The sheer volume of glittery wrapping paper represented a fortune in loose change and cheap plastic gadgets, a heist score that demanded precision despite its absurd nature. Molly signaled for silence as a group of children began singing off-key near the patio doors. The air smelled faintly of burnt hot dogs and desperation. Bill nodded solemnly, his dinosaur head bobbing with tragic incompetence. They moved toward the side gate, their movements exaggerated and clumsy within the restrictive costumes. The plan relied entirely on the assumption that no adult would question a crying dinosaur or a weeping clown during a cake emergency. It was a gamble born of sheer idiocy. Molly whispered the final instruction while checking her bag of empty pillowcases. Bill gave a thumbs-up that looked more like a claw swipe. They lunged forward into the fray, ready to pillage the toy hoard.
created_at: "2026-02-16T10:53:18.046296+00:00" (2/16/2026, 10:53:18 AM)
model: "qwen/qwen3.5-plus-20260216"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 828
moderation_latency: (null)
generation_time: 13031
tokens_prompt: 77
tokens_completion: 365
native_tokens_prompt: 78
native_tokens_completion: 279
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0007008
router: (null)
provider_responses:
  0:
    id: "chatcmpl-7242f1c8-cc93-9bac-9896-1c82821facc6"
    status: 200
    is_byok: false
    latency: 828
    endpoint_id: "71e0f6c3-7815-4734-92d3-8a04f3b0ca91"
    provider_name: "Alibaba"
    model_permaslug: "qwen/qwen3.5-plus-20260216"
api_type: "completions"
id: "gen-1771239184-Y1n3HfqJeoMoZN6u2jJF"
upstream_id: "chatcmpl-7242f1c8-cc93-9bac-9896-1c82821facc6"
total_cost: 0.0007008
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
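The timing and token fields in the metadata can be combined into an average generation rate. The sketch below is illustrative only: it assumes `generation_time` is in milliseconds (13031 is close to the 13.1s duration shown in the run details) and that the `native_` counts come from the provider's own tokenizer while the unprefixed counts are normalized. Neither assumption is stated on this page, and `tokens_per_second` is a hypothetical helper name.

```python
# Values copied from the run metadata on this page.
generation_time_ms = 13031
tokens_completion = 365
native_tokens_completion = 279


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Average generation rate, assuming elapsed_ms is in milliseconds."""
    return tokens / (elapsed_ms / 1000.0)


normalized_rate = tokens_per_second(tokens_completion, generation_time_ms)
native_rate = tokens_per_second(native_tokens_completion, generation_time_ms)

print(f"normalized: {normalized_rate:.1f} tok/s, native: {native_rate:.1f} tok/s")
```

Under these assumptions the run averages roughly 28 normalized tokens per second, or roughly 21 native tokens per second. The figure excludes the 828 reported as `latency`, which is treated here as time before generation started.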
Evaluation details

Evaluator: Matches word count
Result: 0.0000%
Details: n/a
Meta Data: neededClean: false; words: 236

Evaluator: Dialogue to Total Word Ratio
Result: 36.7879%
Details: Ratio: 0.00%, Deviation: 10.00%
Meta Data: neededClean: false; wordsTotal: 236; wordsDialogue: 0

18.3940%
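The dialogue evaluator reports a ratio of 0.00% and a deviation of 10.00% because the response contains no quoted speech. A minimal sketch of how such a ratio and deviation could be computed follows. It is not the benchmark's implementation: it assumes words are whitespace-separated and that dialogue means text inside straight or curly double quotes. `dialogue_stats` and `mean_score` are hypothetical names, and the dictionary keys merely mirror the field names in the report.

```python
import re

# Assumption: dialogue is any text enclosed in straight or curly double quotes.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text: str) -> int:
    """Count whitespace-separated words (an assumed definition of 'word')."""
    return len(text.split())


def dialogue_stats(text: str, target_ratio: float = 0.10) -> dict:
    """Return total words, quoted words, their ratio, and distance from target."""
    total = count_words(text)
    # findall yields one tuple per match; exactly one group is non-empty.
    dialogue = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    ratio = dialogue / total if total else 0.0
    return {
        "wordsTotal": total,
        "wordsDialogue": dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


def mean_score(results: list[float]) -> float:
    """Unweighted mean of evaluator results (an assumed aggregation rule)."""
    return sum(results) / len(results)


with_dialogue = dialogue_stats('Molly peered at the gate. "Ready?" she asked. Bill nodded.')
narration_only = dialogue_stats("Bill nodded solemnly, his dinosaur head bobbing.")
overall = mean_score([0.0, 36.7879])
```

With these definitions, narration-only text yields zero dialogue words and a deviation equal to the full 10% target, matching the pattern in the report. The final 18.3940% is consistent with an unweighted mean of the two evaluator results (0.0000% and 36.7879%), though the page does not state how the results are combined.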