NC Bench
Run: 2026-02-25-01KJBKM5JNE0DSCJNVXY0151EQ
Run Details
Model: Qwen 3.5 Flash
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 57.7s
Cost: $0.003176
Provider: Alibaba
Evaluation summary
Score      Evaluator
60.28%     Matches word count
0.00%      Dialogue to Total Word Ratio
30.1376%   Overall
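The Overall figure in the summary above is consistent with an unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so the averaging below is an assumption, shown only as a minimal sketch; the small gap to 30.1376% is within the rounding of the displayed 60.28%.

```python
# Scores copied from the evaluation summary (displayed, i.e. rounded, values).
scores = {
    "Matches word count": 60.28,
    "Dialogue to Total Word Ratio": 0.00,
}

# ASSUMPTION: Overall is the unweighted mean of the evaluator scores.
overall = sum(scores.values()) / len(scores)

reported_overall = 30.1376
difference = abs(overall - reported_overall)

print(f"mean of displayed scores: {overall:.4f}%")
print(f"reported Overall:         {reported_overall:.4f}%")
print(f"difference:               {difference:.4f} percentage points")
```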
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The relentless rain slicked the grimy pavement below, reflecting the faint neon glow of the neighborhood watch yard light against the wet asphalt outside the suburban mansion where the children's party was currently in full swing with laughter. Molly adjusted her synthetic brown wig, realizing it felt more like static cling wrapped securely around her scalp than actual human hair as she watched the door. She peered through the narrow vertical gap in the heavy velvet curtains, squinting hard against the festive string lights wrapping the porcelain railing like decorative electric vines that illuminated the dark backyard beyond. Inside the warm house, the sound of raucous adult laughter and the clatter of plates being stacked on the kitchen counter drifted faintly through the open windows near the back door. Bill leaned heavily against the chain link fence in the pouring rain, shivering violently in a vintage tuxedo that was clearly three sizes too big for his frame. His satin bowtie kept sliding down his chest continuously as he tried to maintain a cool criminal demeanor while checking his digital watch repeatedly in the gloom to ensure they stayed on schedule for the break in. The perimeter is soft, Molly whispered. But the guard dog is the real physical threat here. Bill asked which specific one. Is it the poodle? It is a Basset Hound named Sir Barksalot who sleeps on the leather couch in the center of the living room, she answered confidently. Bill nodded nervously, feeling the cold dampness of the rain soaking into his shoulders despite his formal attire. Molly stepped away from the window to check her wristwatch. Do not underestimate the security of a gift bag. Bill rubbed his temples, looking at the house like it was a dangerous bomb waiting to explode. What if the kids wake up during the cake cutting ceremony? Molly grabbed his lapels with a look of grave seriousness on her face. 
If they wake up, we tell them it is a magic party where the presents came from benevolent ghosts. Bill sighed deeply at the absurdity of their situation. Do you know how many hours I spent practicing walking on tip-toes? He stared at the dark windows intently. Focus, Bill. The clock strikes nine. That is when the parents start the cake. The sugar crash is the perfect distraction for us to grab the loot quickly. Bill asked if there was a smoke bomb if the cake explodes? Then they would flee with the loot immediately. We meet at the old clock tower in twenty minutes. No exceptions. Bill looked at the front door then at his sister's minivan parked down the street. What is the exit vehicle again? My sister's minivan. She thinks it is a catering truck for some random birthday gig. Right. I am ready to go. You better be. Molly pushed the unlocked front door ajar quietly. If we get caught, we tell the police it was for charity. Charity for who? For me. The world needs more toys. You are insane. I am efficient. Let us begin the heist now.
created_at: "2026-02-25T23:55:03.406Z" (2/25/2026, 11:55:03 PM)
model: "qwen/qwen3.5-flash-20260224"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 449
moderation_latency: (null)
generation_time: 46546
tokens_prompt: 77
tokens_completion: 8494
native_tokens_prompt: 76
native_tokens_completion: 7920
native_tokens_completion_images: (null)
native_tokens_reasoning: 7308
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0031756
router: (null)
provider_responses:
  0:
    endpoint_id: "5f6b71b5-8672-4b27-8b2d-f674429f9e63"
    id: "chatcmpl-680b4b0d-2bcb-969b-a8a0-674e9815b88c"
    is_byok: false
    latency: 449
    model_permaslug: "qwen/qwen3.5-flash-20260224"
    provider_name: "Alibaba"
    status: 200
api_type: "completions"
id: "gen-1772063703-ZiAhRF5tE7D2iTRvYjVj"
upstream_id: "chatcmpl-680b4b0d-2bcb-969b-a8a0-674e9815b88c"
total_cost: 0.0031756
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
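The token, timing, and cost fields above can be related with a little arithmetic. The sketch below is illustrative only and rests on two assumptions that this page does not state: that generation_time is in milliseconds, and that the price is $0.10 per million prompt tokens and $0.40 per million completion tokens. Those rates were chosen because they reproduce the reported total_cost from the native token counts; they are not taken from any price list here.

```python
# Values copied from the metadata above.
native_tokens_prompt = 76
native_tokens_completion = 7920
native_tokens_reasoning = 7308
generation_time = 46546          # ASSUMPTION: milliseconds
total_cost = 0.0031756           # USD

# ASSUMPTION: USD per million tokens; not stated on this page.
ASSUMED_PROMPT_RATE = 0.10
ASSUMED_COMPLETION_RATE = 0.40


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost in USD under the assumed per-million-token rates."""
    micro_dollars = (prompt_tokens * ASSUMED_PROMPT_RATE
                     + completion_tokens * ASSUMED_COMPLETION_RATE)
    return micro_dollars / 1_000_000


estimated = estimate_cost(native_tokens_prompt, native_tokens_completion)

# Completion tokens that were not spent on reasoning.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Generation throughput under the milliseconds assumption.
tokens_per_second = native_tokens_completion / (generation_time / 1000)

print(f"estimated cost:    ${estimated:.7f} (reported ${total_cost:.7f})")
print(f"visible tokens:    {visible_tokens}")
print(f"reasoning share:   {reasoning_share:.1%}")
print(f"tokens per second: {tokens_per_second:.1f}")
```

Under these assumptions, most of the completion budget went to reasoning tokens rather than to the roughly 500-word scene itself.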
Evaluation details
Result     Evaluator                      Details                           Meta Data
60.28%     Matches word count             n/a                               neededClean: false; words: 515
0.00%      Dialogue to Total Word Ratio   Ratio: 0.00%, Deviation: 30.00%   neededClean: false; wordsTotal: 516; wordsDialogue: 0
30.1376%   Overall
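The dialogue evaluator reports wordsDialogue as 0 even though the response contains spoken lines; the response uses no quotation marks. One plausible reading, and only an assumption since the evaluator's implementation is not shown on this page, is that dialogue is detected by quotation marks. The hypothetical helpers below (dialogue_ratio, deviation) sketch that idea, with deviation taken as the absolute difference from the 30% target, which matches the reported "Ratio: 0.00%, Deviation: 30.00%".

```python
import re

TARGET_RATIO = 0.30  # from the scenario: 30% dialogue

# ASSUMPTION: dialogue is any text enclosed in straight or curly double quotes.
QUOTED_SPAN = re.compile(r'["“]([^"”]*)["”]')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue words, total words, dialogue/total ratio)."""
    dialogue_words = sum(count_words(span) for span in QUOTED_SPAN.findall(text))
    total_words = count_words(text)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation(ratio: float, target: float = TARGET_RATIO) -> float:
    """Absolute distance between the measured ratio and the target."""
    return abs(ratio - target)


# A line as it appears in the response (no quotation marks) ...
unquoted = "The perimeter is soft, Molly whispered."
# ... and the same line with the speech quoted.
quoted = '"The perimeter is soft," Molly whispered.'

for sample in (unquoted, quoted):
    words_dialogue, words_total, ratio = dialogue_ratio(sample)
    print(words_dialogue, words_total, f"{ratio:.2%}", f"{deviation(ratio):.2%}")
```

With this kind of detector, unquoted speech contributes nothing to the dialogue count, so a response written entirely without quotation marks would score a 0% ratio regardless of how much speech it contains.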