NC Bench
Run: 2026-02-25-01KJBGFBSX9YYG4WB3EZRK7WJS
Run Details
Model:
Qwen 3.5 27B
Test:
Dialogue tags
Scenario:
Write 500 words with 50% dialogue
Duration:
59.7s
Cost:
$0.014620
Provider:
Alibaba
Evaluation summary
Score     Evaluator                      Details Link
0.00%     Matches word count             (details)
0.00%     Dialogue to Total Word Ratio   (details)
0.0000%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The van smelled of stale coffee and desperation. Molly adjusted the binoculars, focusing on the suburban house where balloons swayed in the breeze. Bill sat in the passenger seat, picking at his fingernails with a pocket knife. Look at that security system, Bill said, chewing a gum wrapper. It is a plastic fence. Molly lowered the glasses, her eyes narrowing at the flimsy barrier surrounding the backyard. It is a toddler fence, Bill. It is designed to keep a three year old in, not a professional thief out. Bill grinned, revealing a missing tooth he had never bothered to replace. So we climb over. Easy. But what about the loot? We cannot carry a thousand LEGO sets. Molly tapped a clipboard with a red pen, marking off contingencies. We do not take the LEGO. We take the electronics. Tablets, consoles, those smartwatches that track kids. What about the cake? Bill asked, his eyes widening. No cake. We are here for the merchandise, not the sugar rush. But the frosting is edible. Focus, Bill. The parents are distracted by the clown. Is it a real clown? Or a hired actor? Is the clown wearing a wig? Molly checked her watch. The timing was critical. Does it matter? If he is juggling, we move. It is a wig. I can see the hairline. I still think we should take the piñata. She slammed the clipboard shut. The piñata is empty, Bill. It is just a cardboard donkey. Full of candy though. Molly grabbed the crowbar from the floorboard, her knuckles white. If you touch the candy, I leave you here. Do we have a bag? The duffel is in the trunk. Fair enough. What is the exit strategy? She pointed toward the alleyway behind the house where the trash cans were lined up. Trash cans. We dump the loot in the bins and drive off. The garbage truck comes at six. Molly froze. The math was suddenly against them. Then we sell it before six. We hit the pawn shop on Elm. Pawn shops do not buy unopened Transformers. Bill opened the van door, the hinges squeaking in protest. We break them open first. 
Then it is scrap metal. Just get in the car. We start with the backyard. I am bringing the crowbar. They stepped out into the humid afternoon air, ready to rob a birthday.
created_at
"2026-02-25T22:59:49.186Z"
2/25/2026, 10:59:49 PM
model
"qwen/qwen3.5-27b-20260224"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
387
moderation_latency
(null)
generation_time
59589
tokens_prompt
77
tokens_completion
6202
native_tokens_prompt
76
native_tokens_completion
6082
native_tokens_completion_images
(null)
native_tokens_reasoning
5547
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0146196
router
(null)
provider_responses
0
endpoint_id
"f3f88554-0c22-4d57-973b-0dec948808e8"
id
"chatcmpl-da224621-00f6-9030-b543-582b3f4f8216"
is_byok
false
latency
387
model_permaslug
"qwen/qwen3.5-27b-20260224"
provider_name
"Alibaba"
status
200
api_type
"completions"
id
"gen-1772060389-RRxejSMMwO2yP2esfibi"
upstream_id
"chatcmpl-da224621-00f6-9030-b543-582b3f4f8216"
total_cost
0.0146196
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
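The metadata above helps explain why a 395-word response consumed several thousand completion tokens: most of them were reasoning tokens. The sketch below works through that arithmetic using the values reported for this run. It assumes that `native_tokens_reasoning` is counted inside `native_tokens_completion` and that `generation_time` is in milliseconds; neither is stated on this page. The `summarize` helper is hypothetical and not part of NC Bench.

```python
# Values copied from the run metadata above.
meta = {
    "generation_time": 59589,          # assumed to be milliseconds
    "native_tokens_completion": 6082,
    "native_tokens_reasoning": 5547,   # assumed to be a subset of completion tokens
}


def summarize(meta: dict) -> dict:
    """Derive a few summary figures from the raw token and timing counters."""
    completion = meta["native_tokens_completion"]
    reasoning = meta["native_tokens_reasoning"]
    seconds = meta["generation_time"] / 1000
    return {
        # Tokens left over for the visible answer once reasoning is subtracted.
        "visible_tokens": completion - reasoning,
        # Fraction of completion tokens spent on reasoning.
        "reasoning_share": reasoning / completion,
        # Overall generation throughput, reasoning included.
        "tokens_per_second": completion / seconds,
    }


summary = summarize(meta)
print(summary["visible_tokens"])               # 535
print(round(summary["reasoning_share"], 3))
print(round(summary["tokens_per_second"], 1))
```

Under those assumptions, roughly nine in ten completion tokens went to reasoning rather than to the scene itself.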
Evaluation details
Result    Evaluator                      Details                           Meta Data
0.00%     Matches word count             n/a                               neededClean: false, words: 395
0.00%     Dialogue to Total Word Ratio   Ratio: 0.00%, Deviation: 50.00%   neededClean: false, wordsTotal: 395, wordsDialogue: 0
0.0000%   Overall
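The zero for wordsDialogue is consistent with the response containing no quotation marks, so an evaluator that identifies dialogue by quotes would find none. The sketch below illustrates one way such a ratio and its deviation from the 50% target could be computed. It is an assumption about the approach, not NC Bench's actual evaluator: the quote-based detection, the whitespace word split, and the `dialogue_stats` helper are all hypothetical.

```python
import re

# Matches text inside straight or curly double quotes (assumed dialogue marker).
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_stats(text: str, target_ratio: float = 0.5) -> dict:
    """Return total words, quoted words, their ratio, and deviation from the target."""
    total = count_words(text)
    # findall yields (straight, curly) tuples; exactly one side is non-empty per match.
    dialogue = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    ratio = dialogue / total if total else 0.0
    return {
        "wordsTotal": total,
        "wordsDialogue": dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


unquoted = "Look at that fence, Bill said. It is a plastic fence."
quoted = '"Look at that fence," Bill said. "It is a plastic fence."'

print(dialogue_stats(unquoted))  # no quotes, so wordsDialogue is 0 and deviation is 0.5
print(dialogue_stats(quoted))    # 9 of 11 words fall inside quotes
```

With the unquoted sample, the ratio is 0 and the deviation is 0.5, mirroring the "Ratio: 0.00%, Deviation: 50.00%" reported for this run.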