NC Bench
Run: 2026-04-27-01KQ7003RM12VY1DQX1HG5F8BC
Run Details
Model:
Qwen 3.5 Plus (2026-04-20)
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
1m 31s
Cost:
$0.014346
Provider:
Alibaba
Evaluation summary
Score
Evaluator
0.00%
Matches word count
0.00%
Dialogue to Total Word Ratio
0.0000%
Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the folded birthday party schedule across the rusted folding table. The ceiling fan wobbled overhead, casting long shadows across the cracked tiles. She tapped a manicured nail on the printed timetable. Bill leaned against the doorframe, spinning a stolen party popper between his fingers. He cleared his throat. So tell me we are not robbing a seven year old. Correct. Molly did not look up. She smoothed the schedule flat and checked her watch. The security is laughable. Eight parents, a rotating piñata watch, and a strictly enforced no running policy. We walk in, we swap the good gifts for foam replicas, we walk out. Bill set the popper down. And the clown. You definitely mentioned the clown. The clown is a distraction. He will be juggling during the gift opening phase. That gives us a four minute window. Four minutes. Bill paced the scuffed linoleum. To replace twelve wrapped boxes without triggering parental guilt. Do you know how heavy those things are. Molly, I grew up with a stepfather who used a meat thermometer as a lie detector. I do not do four minute windows. You will adjust. Molly slid a cardboard cutout across the table. Here is the blueprint. We enter through the side gate. The homeowner uses a digital lock but the code is the child birth year repeated twice. That is insulting. Bill snatched the cutout. We are professionals. We stole a painting from a museum guarded by lasers and a very large German shepherd. And yet here you are. Molly tapped her pen against her clipboard. Complaining about a suburban fence. Are we doing this or are you going back to stealing garden gnomes. Bill sighed. The gnomes were a steady income stream. Fine. What about the cake. The cake is irrelevant. We are after the presents. Specifically the video game console and the high end building set. What about the goodie bags. Leave the goodie bags. They are full of sticky fruit snacks and cheap plastic toys. We are thieves, Bill, not monsters. Fair point. 
Bill adjusted his fake party hat. When do we strike. Saturday. Three PM. The parents will be distracted by small talk and overpriced beverages. We move fast. We move quiet. We move before the pizza arrives. You handle the gate. I will handle the swap. And after. We vanish into the neighborhood. Blend in as overly enthusiastic aunt. You will wear the sweater. Bill groaned. I hate the sweater. It itches. You will wear the sweater. Molly finally looked up. Unless you want to explain to the crew why we are eating ramen for the next three months. I will wear the sweater. Bill picked up the party popper again. This is going to work. I can feel it. Unless the child has a guard dog. The child has a hamster named Sir Whiskers. He is trained to run on a wheel. He will do absolutely nothing. Excellent. Bill snapped a salute. I will start packing the foam boxes. And Molly. What. I am keeping one party popper. For tactical use. Molly closed her folder. Just keep your hands off the cake. No promises. Bill turned toward the door. But I will aim for the presents. Molly nodded once. Saturday. Be ready. Bill disappeared into the hallway. I will be ready. I will be a whirlwind. Molly watched the empty doorway. Whirlwind. She muttered to the buzzing light. Just do not trip over the piñata.
created_at
"2026-04-27T08:13:23.611Z"
4/27/2026, 8:13:23 AM
model
"qwen/qwen3.5-plus-20260420"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
773
moderation_latency
(null)
generation_time
90530
tokens_prompt
77
tokens_completion
6067
native_tokens_prompt
76
native_tokens_completion
5965
native_tokens_completion_images
(null)
native_tokens_reasoning
5203
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0143464
router
(null)
provider_responses
0
endpoint_id
"8b4a0fbb-6705-4f8d-a7f1-ad18ecc0edbc"
id
"chatcmpl-03cf9a30-3ea8-9eb5-afb7-3d4eb4b3bf11"
is_byok
false
latency
773
model_permaslug
"qwen/qwen3.5-plus-20260420"
provider_name
"Alibaba"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777277603-q4DoAiNVA3snfB0LK2l9"
session_id
(null)
api_type
"completions"
id
"gen-1777277603-gGkZ4NT2n1zj2tQh7eqT"
upstream_id
"chatcmpl-03cf9a30-3ea8-9eb5-afb7-3d4eb4b3bf11"
total_cost
0.0143464
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
response_cache_source_id
(null)
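The token and timing fields above can be combined into a few derived figures. The sketch below is illustrative only and rests on two assumptions that the page does not state: that native_tokens_reasoning is counted inside native_tokens_completion, and that generation_time is reported in milliseconds (which would be consistent with the 1m 31s duration shown in Run Details). The function name and the returned keys are hypothetical.

```python
def summarize_usage(completion_tokens: int, reasoning_tokens: int,
                    generation_time_ms: int) -> dict:
    """Derive simple figures from the run metadata.

    Assumption: reasoning tokens are a subset of completion tokens.
    Assumption: generation_time_ms is in milliseconds.
    """
    # Tokens left over for the visible answer once reasoning is subtracted
    visible_tokens = completion_tokens - reasoning_tokens
    # Fraction of the completion spent on reasoning
    reasoning_share = reasoning_tokens / completion_tokens
    # Overall generation throughput in tokens per second
    tokens_per_second = completion_tokens / (generation_time_ms / 1000)
    return {
        "visible_tokens": visible_tokens,
        "reasoning_share": reasoning_share,
        "tokens_per_second": tokens_per_second,
    }


# Values copied from the metadata fields of this run
summary = summarize_usage(
    completion_tokens=5965,    # native_tokens_completion
    reasoning_tokens=5203,     # native_tokens_reasoning
    generation_time_ms=90530,  # generation_time
)
print(summary["visible_tokens"])  # 762
```

Under those assumptions, roughly 87% of the completion tokens went to reasoning, and only 762 tokens were left for the 578-word scene.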
Evaluation details
Result
Evaluator
Details
Meta Data
0.00%
Matches word count
n/a
neededClean
false
words
578
0.00%
Dialogue to Total Word Ratio
Ratio: 0.00%, Deviation: 70.00%
neededClean
false
wordsTotal
578
wordsDialogue
0
0.0000%
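The wordsDialogue value of 0 is what one would expect if the evaluator counts only words inside quotation marks, since the model response contains none. The sketch below is not NC Bench's actual evaluator; it is a minimal illustration of that assumed approach. The function name, the quote-matching rule, and the use of an absolute difference from the 70% target as the deviation are all assumptions.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a passage.

    Assumption: a word is any whitespace-separated token, and dialogue
    is whatever sits between a pair of double quotation marks.
    """
    # Treat curly double quotes the same as straight ones
    normalised = text.replace("\u201c", '"').replace("\u201d", '"')
    total_words = len(normalised.split())
    # Collect the text between each pair of double quotes
    quoted_spans = re.findall(r'"([^"]*)"', normalised)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# A fragment in the style of the response: speech without quotation marks
unquoted = "Bill sighed. Fine. What about the cake."
# The same fragment with conventional dialogue punctuation
quoted = '"Fine," Bill sighed. "What about the cake?"'

print(dialogue_ratio(unquoted))  # (0, 7, 0.0)
print(dialogue_ratio(quoted)[:2])  # (5, 7)

# Assumed deviation: absolute distance from the 70% target
target = 0.70
deviation = abs(target - dialogue_ratio(unquoted)[2])
```

With a ratio of 0, the assumed deviation comes to 70 percentage points, which matches the "Deviation: 70.00%" reported above.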