NC Bench
Run: 2026-02-25-01KJBK4SFCK65ZJS69VY3K1PEX
Run Details
Model: Qwen 3.5 27B
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 1m 35s
Cost: $0.018704
Provider: Alibaba
Evaluation summary
Score | Evaluator
20.19% | Matches word count
0.00% | Dialogue to Total Word Ratio
10.0948% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the zoom on her binoculars. The suburban house across the street pulsed with neon pink light. Inside, a cacophony of shrieking children signaled the peak of the celebration. She lowered the glass and turned to Bill, who was currently trying to hotwire a helium balloon with a pocket knife. "Do you see the clock?" "It's blinking red. We're out of time." "We have time." "It's a birthday party, Bill. Not a bank." "Semantics, Molly. The security is just as tight. Three parents, one dog, and a cake with a candle." Bill tossed the balloon onto the passenger seat. It deflated with a sad squeak. "The prize pool is substantial. Limited edition action figures. Rare LEGO sets. The resale market on a mint-condition Transformers toy is better than gold bullion." Molly sighed, pulling a crumpled map from her glovebox. "You want to rob a seven-year-old? The loot is going to be covered in glitter. We'll be tracking confetti for weeks." "Glitter is a hazard, but it's not a deterrent." Bill leaned forward, his eyes gleaming with the thrill of the job. "Think about the logistics. The parents are distracted by the piñata. The dog is asleep on the rug. We have a window of exactly twelve minutes." "Twelve minutes to bypass the laser grid?" "There is no laser grid. There is a plastic playpen." Bill grinned. "It's child-proof, not thief-proof. I can climb over it in my sleep. The real challenge is the exit strategy. We can't drive off in the van with a trunk full of My Little Pony dolls. We'll look suspicious." "We'll load them into the diaper bag." "Brilliant. Who are we pretending to be?" "Aunts. Very generous aunts who hate the kids." Bill nodded slowly. "I'll wear the wig. You take the onesies. We need to blend in." "I am not wearing a onesie, Bill. I have a reputation." "It's for the job, Molly. Besides, the security camera is facing the front door. We need to come in the back through the play tent." "The play tent?" "It's the weak point. No locks. Just Velcro." 
Bill tapped the blueprint. "We go in, grab the presents, and exit before the cake is cut. If the cake is cut, the parents get serious. That's when they check the windows." "What about the noise?" "We wear noise-canceling headphones. We'll be deaf to the screaming." "That's not how noise-canceling works." "It will in my head." Bill started the van. The engine rumbled like a sleeping beast. Bill wiped grease from his cheek. "Ready?" Molly looked at the house. The laughter drifted through the open windows. It sounded innocent. It sounded naive. It sounded like an easy score. She adjusted her turtleneck. Molly gripped the steering wheel. The van idled quietly. "Let's do it." "For the toys." "For the toys." "The kid is named Kevin. He's notoriously picky." "Then we take everything he doesn't want." "Perfect. I hate Kevin." "Do we have a getaway plan if the dog wakes up?" "The dog is a pug. He won't wake up." "Pugs are surprisingly agile." "Not this one." Molly checked her watch.
created_at: "2026-02-25T23:46:28.472Z" (2/25/2026, 11:46:28 PM)
model: "qwen/qwen3.5-27b-20260224"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 523
moderation_latency: (null)
generation_time: 94519
tokens_prompt: 77
tokens_completion: 8007
native_tokens_prompt: 76
native_tokens_completion: 7784
native_tokens_completion_images: (null)
native_tokens_reasoning: 7038
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0187044
router: (null)
provider_responses:
  0:
    endpoint_id: "f3f88554-0c22-4d57-973b-0dec948808e8"
    id: "chatcmpl-7ce4c667-33d6-984e-8b8d-0f4a1194a50f"
    is_byok: false
    latency: 523
    model_permaslug: "qwen/qwen3.5-27b-20260224"
    provider_name: "Alibaba"
    status: 200
api_type: "completions"
id: "gen-1772063188-CHEjZlsoUPv8Ravyjeql"
upstream_id: "chatcmpl-7ce4c667-33d6-984e-8b8d-0f4a1194a50f"
total_cost: 0.0187044
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
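The token and timing figures in the Meta block can be cross-checked with a little arithmetic. The sketch below rests on two assumptions that the page itself does not state: that native_tokens_completion includes the reasoning tokens, and that generation_time is in milliseconds (which would agree with the reported duration of 1m 35s). The variable names are illustrative only.

```python
# Figures copied from the Meta block of this run.
native_tokens_completion = 7784
native_tokens_reasoning = 7038
generation_time_ms = 94519  # assumed to be milliseconds

# Assumption: the completion count includes the reasoning tokens,
# so the remainder is what ended up in the visible response.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Share of the completion spent on reasoning rather than prose.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Overall generation throughput, reasoning tokens included.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(visible_tokens)
print(f"{reasoning_share:.1%}")
print(f"{tokens_per_second:.1f}")
```

Under those assumptions, roughly nine in ten completion tokens went to reasoning, and the visible scene accounts for only a few hundred tokens.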
Evaluation details
Result | Evaluator | Details | Meta Data
20.19% | Matches word count | n/a | neededClean: false, words: 520
0.00% | Dialogue to Total Word Ratio | Ratio: 70.97%, Deviation: 20.97% | neededClean: false, wordsTotal: 527, wordsDialogue: 374
10.0948% | Overall
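The ratio and deviation reported above follow directly from the word counts: 374 dialogue words out of 527 total, measured against the 50% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words, by treating anything between straight double quotes as dialogue. The function names and the quote-matching rule are assumptions for illustration; the evaluator's actual tokenization and its mapping from deviation to a score are not shown on this page, so neither is reproduced here.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_word_count(text: str) -> int:
    """Count words inside straight double quotes (an assumed rule)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = 0.5) -> tuple[float, float]:
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts taken from the evaluation details of this run.
ratio, deviation = ratio_and_deviation(374, 527)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# The overall score is consistent with a plain mean of the two evaluator
# scores; the displayed 20.19% is rounded, hence the small difference.
overall = (20.19 + 0.00) / 2
print(f"Overall (approx.): {overall:.3f}%")

# Small demonstration of the counting helper on a made-up sentence.
sample = 'She said. "Hello there, Bill." He nodded. "Go."'
print(dialogue_word_count(sample), count_words(sample))
```

Note that the two evaluators report different totals (520 words versus 527), so they evidently count words in slightly different ways.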