NC Bench
Run: 2026-02-16-01KHK1Z01W63Y7498A2A0Z771Q
Run Details
Model: Qwen 3.5 Plus (2026-02-15)
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 12.6s
Cost: $0.000785
Provider: Alibaba
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her tiny beret, peering through the hedge at the bouncing castle. "This is it, Bill. The ultimate score." Bill squinted at the inflatable dinosaur. "We are robbing a six-year-old's birthday? Really?" "Think of the loot," she whispered fiercely. "LEGO sets, remote control cars, enough sugar to power a small city." "It feels wrong, Mol. They have trust issues already." "Stop being soft. We go in when the clown distracts them with balloons." "The clown looks depressed. He might cry on us." "Then we tackle him. Do you want the glittery unicorn or not?" "I suppose the unicorn does have resale value." "Exactly. We slip in during 'Happy Birthday,' grab the pile, and vanish before cake cutting." "What if a parent sees us?" "Tell them we are the new magicians. Everyone loves a surprise act." "You make felony grand larceny sound like entertainment." "That is the genius of it. Now, check your pocket for the giant socks we will wear as gloves." "These are ridiculous. I look like a foot." "Perfect camouflage. Nobody suspects a giant foot. Let's move before the piñata drops." "I still think we should just buy a gift instead." "Quiet, Bill. The mission starts now. Remember, no taking the cake. Only the treasures." "Fine, but if I get sticky, I am blaming you."
created_at: "2026-02-16T11:04:39.554305+00:00" (2/16/2026, 11:04:39 AM)
model: "qwen/qwen3.5-plus-20260216"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 878
moderation_latency: (null)
generation_time: 12541
tokens_prompt: 77
tokens_completion: 351
native_tokens_prompt: 78
native_tokens_completion: 314
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0007848
router: (null)
provider_responses:
  0:
    id: "chatcmpl-4eef1e9e-d26f-98a6-a7ed-4ff7854d9535"
    status: 200
    is_byok: false
    latency: 878
    endpoint_id: "71e0f6c3-7815-4734-92d3-8a04f3b0ca91"
    provider_name: "Alibaba"
    model_permaslug: "qwen/qwen3.5-plus-20260216"
api_type: "completions"
id: "gen-1771239866-hKykHuBaJxIPmkKyzZcz"
upstream_id: "chatcmpl-4eef1e9e-d26f-98a6-a7ed-4ff7854d9535"
total_cost: 0.0007848
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
Evaluation details
Result | Evaluator | Details | Meta Data
20.1897% | Matches word count | n/a | neededClean: false, words: 220
0.0000% | Dialogue to Total Word Ratio | Ratio: 90.00%, Deviation: 40.00% | neededClean: false, wordsTotal: 220, wordsDialogue: 198
10.0948%
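The dialogue figures reported above can be reproduced from the word counts. The sketch below is not NC Bench's evaluator. It assumes the ratio is wordsDialogue divided by wordsTotal, the deviation is the absolute distance from the 50% target in the scenario, and dialogue means text inside straight double quotes. The function names `dialogue_stats` and `ratio_and_deviation` are hypothetical. How the evaluator maps a deviation or a word count to its percentage result is not shown on this page, so that step is left out. The final 10.0948% is consistent with the mean of the two evaluator results, which the last lines check.

```python
import re


def dialogue_stats(text: str, target_ratio: float = 0.5) -> dict:
    """Count whitespace-separated words inside straight double quotes
    versus all words (an assumed definition of 'dialogue')."""
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return {
        "wordsTotal": total_words,
        "wordsDialogue": dialogue_words,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_ratio: float = 0.5) -> tuple:
    """Ratio of dialogue words to total words, and distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target_ratio)


# Counts taken from the evaluation metadata on this page.
ratio, deviation = ratio_and_deviation(198, 220)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Mean of the two evaluator results listed above (20.1897% and 0.0000%).
mean_result = (20.1897 + 0.0) / 2
print(f"Mean of results: {mean_result:.5f}%")
```

Running `dialogue_stats` on the model response may not give exactly 220 and 198, since the evaluator's tokenization and quote handling are not documented here.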