NC Bench
Run: 2026-02-25-01KJBJQGYVTQSTE2AHYB16MSFN
Run Details
Model: Qwen 3.5 27B
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 2m 6s
Cost: $0.019976
Provider: Alibaba
Evaluation summary
| Score | Evaluator | Details Link |
| --- | --- | --- |
| 0.00% | Matches word count | (details) |
| 15.68% | Dialogue to Total Word Ratio | (details) |
| 7.8418% | Overall | |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the strap of her duffel bag, staring at the suburban house through binoculars. Balloons bobbed against the twilight sky, casting colorful shadows on the manicured lawn. The party lights flickered on. "So, the objective is clear?" Bill asked, chewing on a toothpick. "Clear as mud," Molly said. "We go in, we grab the wrapped boxes, we vanish before the cake arrives." "Why not just steal the cake?" "Because cake smudges, Bill. Wrapping paper preserves the value." Bill shook his head, looking at the plastic fence surrounding the yard. "You're obsessed with the resale market for plastic dinosaurs." "It's a niche market with high demand. Parents pay top dollar for unopened toys." "What about the security?" "Security is a mother named Linda who thinks a locked front door is a vault." "And the perimeter?" "Plastic fence. Easy climb." "What if the kids see us?" "They won't. We wear clown costumes." "Clowns?" "It's a party. Clowns are expected. It's camouflage." "I look like a killer clown." "You look like a professional." "I have a unicycle." "Then you ride it." "I don't know how to ride a unicycle." "Practice makes perfect. We have twenty minutes." "Twenty minutes until the cake?" "Twenty minutes until the parents stop watching the kids and start watching the food." "Smart." "I know." "What about the dog?" "The dog is a golden retriever. He loves treats." "You brought treats?" "I brought a bag of hot dogs." "Hot dogs?" "They work better than kibble." "Okay, I'm in. But I want the Lego set." "The Lego set is mine." "It's a Star Wars set." "It's a Star Wars set for a five-year-old. It's missing pieces." "How do you know?" "I checked the catalog online." "You checked the catalog?" "Preparation is key, Bill." "You're weird." "I'm prepared." "Let's go." "Wait." "What now?" "The music." "It's 'Baby Shark'." "It's annoying." "It's a distraction." "I hate it." "Then you sing it louder." "I can't sing." "You can hum." "Fine." Molly holstered her lockpick. 
Bill put on the red nose. The wind rustled the leaves around them. The air smelled of cut grass and sugar. "Ready?" "Born ready." They walked toward the gate. The grass crunched under their boots. A squirrel watched them from a tree branch. "What about the getaway van?" "It's a minivan." "A minivan?" "It blends in." "With the parents?" "Exactly." "You're a genius." "I try." "Do we have space for the trampoline?" "We don't have space for the trampoline." "It's a bounce house." "It's inflatable." "We deflate it." "We steal the air?" "We steal the plastic." "That's wasteful." "It's efficient." "I prefer value." "I prefer speed." "Speed is value." "Okay." "Okay." "Move out." "Moving." They slipped through the gate into the yard. The party was loud inside.
created_at: "2026-02-25T23:39:13.759Z" (2/25/2026, 11:39:13 PM)
model: "qwen/qwen3.5-27b-20260224"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2900
moderation_latency: (null)
generation_time: 125764
tokens_prompt: 77
tokens_completion: 8102
native_tokens_prompt: 76
native_tokens_completion: 8314
native_tokens_completion_images: (null)
native_tokens_reasoning: 7568
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0199764
router: (null)
provider_responses:
  0:
    endpoint_id: "f3f88554-0c22-4d57-973b-0dec948808e8"
    id: "chatcmpl-b37f70f9-2d38-9cea-8fe9-9e65665a7fa6"
    is_byok: false
    latency: 2900
    model_permaslug: "qwen/qwen3.5-27b-20260224"
    provider_name: "Alibaba"
    status: 200
api_type: "completions"
id: "gen-1772062753-glMRrecnNp0ErzA4HBZD"
upstream_id: "chatcmpl-b37f70f9-2d38-9cea-8fe9-9e65665a7fa6"
total_cost: 0.0199764
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
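A few derived figures help when reading the metadata above. The sketch below is illustrative only and rests on two assumptions that the page does not state: that generation_time is in milliseconds and is the basis for the reported Duration, and that native_tokens_completion includes the reasoning tokens. The helper name format_duration is hypothetical.

```python
# Values copied from the run metadata.
generation_time_ms = 125764        # assumed to be milliseconds
native_tokens_completion = 8314    # assumed to include reasoning tokens
native_tokens_reasoning = 7568


def format_duration(ms: int) -> str:
    """Render a millisecond count as 'Xm Ys', rounded to whole seconds."""
    total_seconds = round(ms / 1000)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


# Tokens left over for the visible answer once reasoning is subtracted.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

print(format_duration(generation_time_ms))  # 2m 6s
print(visible_tokens)                       # 746
print(f"{reasoning_share:.1%}")             # 91.0%
```

Under these assumptions, roughly nine in ten completion tokens went to reasoning rather than to the scene itself.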
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 0.00% | Matches word count | n/a | neededClean: false; words: 458 |
| 15.68% | Dialogue to Total Word Ratio | Ratio: 76.09%, Deviation: 6.09% | neededClean: false; wordsTotal: 460; wordsDialogue: 350 |
| 7.8418% | Overall | | |
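The ratio and deviation reported above can be reproduced from the word counts. The sketch below is not the benchmark's evaluator: the 0.70 target comes from the prompt, the quote-based counter count_dialogue_words is a hypothetical helper that may tokenize differently from NC Bench (the two evaluators themselves report 458 and 460 total words), and treating the overall score as an unweighted mean of the two evaluator scores is an assumption. The mapping from a 6.09% deviation to the 15.68% score is not documented here and is not modeled.

```python
import re

TARGET_RATIO = 0.70  # from the prompt: 70% of words should be dialogue


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words), treating text inside
    straight double quotes as dialogue. Hypothetical helper."""
    quoted = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(segment.split()) for segment in quoted)
    return dialogue, len(text.split())


# Counts reported in the evaluation details.
words_dialogue, words_total = 350, 460
ratio = words_dialogue / words_total
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 76.09%, Deviation: 6.09%

# Assumed: overall score is the plain mean of the two evaluator scores.
overall = (0.00 + 15.68) / 2
print(f"{overall:.2f}%")  # 7.84%
```

The mean of the rounded scores gives 7.84%, which is consistent with the reported 7.8418% if the ratio score is slightly above 15.68% before rounding.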