NC Bench
Run: 2026-02-25-01KJBE96AXR9NWJCMXXEZPD8AT
Run Details
Model: Qwen 3.5 Flash
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 1m 13s
Cost: $0.003455
Provider: Alibaba
Evaluation summary
Score      Evaluator                      Details Link
35.00%     Matches word count             (details)
0.00%      Dialogue to Total Word Ratio   (details)
17.5011%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The hedges of the Sterling residence rustled violently under the weight of Bill’s nervous elbow, digging deeply into the thick mossy leaves that covered the secure perimeter of the wealthy estate where everyone inside was sleeping soundly and peacefully in their own beds. He adjusted his tuxedo tights, a garment chosen strictly for the stealth factor, despite the suffocating July heat pressing against the suburban lawn where the air smelled heavily of freshly cut grass mixed with expensive vanilla bean ice cream that drifted constantly from the open kitchen window above. Molly crouched beside him in the shadows, her high-powered optical lenses trained on the open garage door where a mountain of wrapped boxes sat like sleeping dragons ready to bite with sharp paper bows that could truly slice their skin dangerously. The aroma drifted from the open windows above the living room signaling that the main course was ready for consumption in the formal dining area with the family. They were in position for the Great Toy Heist, a plan so audacious that only a pair of career criminals with absolutely no other life options would attempt it during a standard birthday party for a very young child in the neighborhood. Bill leaned in toward the other thief, his voice cracking slightly under the immense tension of the operation while the moon phase hovered silently above them in the dark night sky without any noise whatsoever to disturb the plan. This was the critical moment where everything would go terribly wrong or perfectly right depending on the specific timing of the delivery of the stolen goods from the safe location. The wind blew gently across the grass, moving the leaves and the flags on the pole outside the house. Molly said, We go when the mom leaves the house to get the candles. Bill asked, Why candles for us? Molly said, To distract the dog that waits by the gate outside. Bill looked at the yard where the dog was tied to a tree, whining loudly. Is that a bulldog? 
It looks like a gremlin in a wig. Molly said, Focus. We need the big box with the drone and night vision. Bill muttered, Laser drones are against the rules. Molly said, Then why are we here? Bill asked, Just toys? Robbing a kid? Molly said, High end. Rich kids. Not candy. Bill said, Parents? They have guns. Molly said, The dad has a cat. The cat is the guard. Bill sighed. I don’t like this. Molly said, Yes. Silent. Fast. Bill said, Move. No time. Molly said, Okay. Ready. Bill said, Grab bag. Molly said, Got it. Heavy. Bill said, Watch cat. Molly said, Watching cat. Bill said, Jump. Molly said, Jump. They ran. The cat screamed. Bill fell. The night sky watched silently. Nothing happened at all. The grass was green. The box was yellow. The plan was finished.
created_at
"2026-02-25T22:21:33.206Z"
2/25/2026, 10:21:33 PM
model
"qwen/qwen3.5-flash-20260224"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
647
moderation_latency
(null)
generation_time
69336
tokens_prompt
77
tokens_completion
8724
native_tokens_prompt
76
native_tokens_completion
8619
native_tokens_completion_images
(null)
native_tokens_reasoning
7993
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0034552
router
(null)
provider_responses
0
endpoint_id
"5f6b71b5-8672-4b27-8b2d-f674429f9e63"
id
"chatcmpl-383934ab-51e4-9809-818f-24c213a5cc22"
is_byok
false
latency
647
model_permaslug
"qwen/qwen3.5-flash-20260224"
provider_name
"Alibaba"
status
200
api_type
"completions"
id
"gen-1772058093-bnpCsDHXQ8NqYoilcYEO"
upstream_id
"chatcmpl-383934ab-51e4-9809-818f-24c213a5cc22"
total_cost
0.0034552
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
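The token counts above imply that most of the completion was spent on reasoning rather than on the visible scene. The following is a minimal sketch of that arithmetic, using only the values recorded in this run. It assumes that `native_tokens_reasoning` is included in `native_tokens_completion` and that `generation_time` is reported in milliseconds; neither is stated on this page.

```python
# Values copied from the run metadata above.
native_tokens_completion = 8619
native_tokens_reasoning = 7993
generation_time_ms = 69336  # assumption: milliseconds

# Assumption: reasoning tokens are a subset of completion tokens.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Throughput over the whole completion, reasoning included.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible tokens:    {visible_tokens}")
print(f"reasoning share:   {reasoning_share:.1%}")
print(f"tokens per second: {tokens_per_second:.1f}")
```

Under these assumptions, 626 tokens remain for the visible response, which is in line with a scene of roughly 480 words.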
Evaluation details
Result     Evaluator                      Details                            Meta Data
35.00%     Matches word count             n/a                                neededClean: false; words: 482
0.00%      Dialogue to Total Word Ratio   Ratio: 0.00%, Deviation: 30.00%    neededClean: false; wordsTotal: 483; wordsDialogue: 0
17.5011%   Overall
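The `wordsDialogue` value of 0 is consistent with an evaluator that counts only words inside quotation marks, since the model response contains no quotation marks around its spoken lines. The evaluator's actual implementation is not shown on this page, so the following is only a sketch of one plausible approach. The function names, the quote-matching rule, and the absolute-difference deviation are all assumptions.

```python
import re

# Assumption: dialogue is any text enclosed in straight or curly double quotes.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')

TARGET_RATIO = 0.30  # the scenario asks for 30% dialogue


def count_words(text):
    """Count whitespace-separated tokens."""
    return len(text.split())


def dialogue_stats(text):
    """Return (dialogue_words, total_words, ratio, deviation) for a passage."""
    total = count_words(text)
    # findall yields one tuple per match; exactly one group is non-empty.
    dialogue = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    ratio = dialogue / total if total else 0.0
    # Assumption: deviation is the absolute distance from the target ratio.
    deviation = abs(ratio - TARGET_RATIO)
    return dialogue, total, ratio, deviation


unquoted = "Molly said, Focus. We need the big box."
quoted = 'Molly said, "Focus. We need the big box."'

print(dialogue_stats(unquoted))
print(dialogue_stats(quoted))
```

With this rule, the unquoted line contributes no dialogue words even though it reads as speech, while the same line with quotation marks does. That would produce a ratio of 0.00% and a deviation of 30.00%, matching the details reported above.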