NC Bench
Run: 2026-02-16-01KHK2MA3TD1RW487DCTBHYJCN
Run Details
Model: Qwen 3.5 Plus (2026-02-15)
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 11.6s
Cost: $0.000658
Provider: Alibaba
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her neon pink wig, the synthetic fibers itching fiercely against her scalp as she peered through the hedge. The backyard buzzed with the chaotic energy of six-year-olds screaming over musical chairs. Bill crouched beside her, struggling to fit his broad shoulders into a oversized clown suit that smelled faintly of old balloons and despair. He held a burlap sack decorated with happy faces, which looked entirely too suspicious for anyone over the age of ten. "We go now," Molly whispered sharply. Bill nodded, his red foam nose wobbling with the motion. They scrambled low across the manicured lawn, dodging a stray soccer ball and a parent texting on a bench. The table groaned under the weight of wrapped boxes, shimmering in the afternoon sun like a dragon's hoard made of cardboard and tape. Molly began shoving Lego sets and plush dinosaurs into the sack with frantic efficiency. Bill grabbed a stack of board games, his movements clumsy in the floppy shoes. A sudden silence fell over the party as the music stopped. Both thieves froze, hearts hammering against their ribs like trapped birds. The clown suit felt suddenly suffocating. Molly tightened her grip on a bag of marbles, eyes darting toward the kitchen door where shadows lengthened ominously.
created_at: "2026-02-16T11:16:16.961179+00:00" (2/16/2026, 11:16:16 AM)
model: "qwen/qwen3.5-plus-20260216"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 816
moderation_latency: (null)
generation_time: 11550
tokens_prompt: 77
tokens_completion: 323
native_tokens_prompt: 78
native_tokens_completion: 261
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0006576
router: (null)
provider_responses: 0
id: "chatcmpl-7a7f790b-7a57-91ef-ba87-858da6ebb526"
status: 200
is_byok: false
latency: 816
endpoint_id: "71e0f6c3-7815-4734-92d3-8a04f3b0ca91"
provider_name: "Alibaba"
model_permaslug: "qwen/qwen3.5-plus-20260216"
api_type: "completions"
id: "gen-1771240564-54p8QR9mWwf4p7W3aXWN"
upstream_id: "chatcmpl-7a7f790b-7a57-91ef-ba87-858da6ebb526"
total_cost: 0.0006576
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
Evaluation details
Result | Evaluator | Details | Meta Data
81.2727% | Matches word count | n/a | neededClean: false, words: 212
58.0899% | Dialogue to Total Word Ratio | Ratio: 1.42%, Deviation: 8.58% | neededClean: false, wordsTotal: 212, wordsDialogue: 3
69.6813%
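The ratio and deviation reported by the Dialogue to Total Word Ratio evaluator can be reproduced from the counts in its Meta Data. The sketch below is only an illustration, not the benchmark's own evaluator. The function names are hypothetical. It assumes that a word is a whitespace-separated token, that dialogue is any text inside straight double quotes, and that the deviation is the absolute distance from the 10% target. The page does not say how either percentage score is derived from these values, so that step is not modelled.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of a word)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue marker)."""
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in quoted_spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_pct: float = 10.0) -> tuple[float, float]:
    """Return (dialogue ratio, absolute deviation from target), both in percent."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    ratio = 100.0 * words_dialogue / words_total
    deviation = abs(target_pct - ratio)
    return round(ratio, 2), round(deviation, 2)


# Counts taken from the Meta Data column of this run.
ratio, deviation = ratio_and_deviation(words_dialogue=3, words_total=212)
print(f"Ratio: {ratio}%, Deviation: {deviation}%")  # Ratio: 1.42%, Deviation: 8.58%

# Counting on a short excerpt that contains the response's only quoted line.
sample = 'Bill nodded. "We go now," Molly whispered sharply.'
sample_total = count_words(sample)              # 8
sample_dialogue = count_dialogue_words(sample)  # 3

# The last figure on the page equals the mean of the two evaluator results.
# Reading it as an overall score is an inference; the page does not label it.
mean_score = round((81.2727 + 58.0899) / 2, 4)  # 69.6813
```

With only three quoted words ("We go now,") out of 212, the response falls well short of the requested 10% dialogue, which is what the 8.58% deviation reflects.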