NC Bench
Run: 2026-02-25-01KJBDKQJ883G183C5VQQPJAB1
Run Details
Model: Qwen 3.5 122B
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 1m 1s
Cost: $0.032366
Provider: Alibaba
Evaluation summary
Score    Evaluator
1.04%    Matches word count
0.00%    Dialogue to Total Word Ratio
0.5182%  Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly checked the blueprints on her tablet. The backyard was a minefield of plastic balls and parental supervision. The heat was oppressive. Sweat dripped down her nose. She wiped it away. The van idled nearby. “We go in at the cake cutting,” she said. “Distraction point.” Bill adjusted his mask. It was a balaclava with cat ears sewn on. He tugged the fabric. It itched. “Are we sure about this? They’re six-year-olds.” “Six-year-olds have the best toys,” Molly replied. “Legos, drones, tablets. It’s a goldmine.” “It’s a moral minefield,” Bill countered. “What if they cry?” “Then we give them a high-five and vanish.” Bill sighed. “I didn’t sign up for emotional damage.” “You signed up for the payout,” Molly snapped. “Look, the target is the red box under the tree. That’s the PS5.” “I thought we were stealing the piñata,” Bill said. “The candy is liquid assets.” “Candy rots. Electronics hold value. We need the PS5.” Bill looked at the house. Balloons bobbed in the wind like colorful warnings. The music was loud. He felt nervous. The lights were bright. “What about the parents? They’re armed with lawn chairs.” “We’re invisible,” Molly said. “Like ninjas. But with better shoes.” “I’m wearing Crocs,” Bill noted. “Perfect. Silent entry.” Bill rubbed his chin. “Okay, but if I get caught, I’m telling them I’m a magician.” “A magician who steals consoles?” “It’s a trick. I make the toys disappear.” Molly laughed. “Just don’t make yourself disappear. We need a driver.” “I can drive,” Bill said. “I have a license.” “You have a learner’s permit, Bill.” “It counts in the state of emergency.” Molly stood up, brushing crumbs off her tactical vest. Her knees cracked. She stood tall. “Alright. Phase one: infiltrate the bounce house.” “I’m not jumping in a bounce house,” Bill said. “I’m forty-two.” “You’re a professional.” “I’m a middle-aged accountant with a mask.” “You’re a thief with a mask. Same thing.” Bill stood up slowly. His back hurt. He winced. 
“If the mom asks for a selfie, I’m running.” “No selfies. We’re ghosts.” “Ghosts don’t wear neon windbreakers.” “It’s camouflage,” Molly lied. “Against the grass.” Bill looked at the green grass. It was wet. He stepped on a bug. “It’s neon orange.” “It blends with the sunset,” Molly said. “The sun set an hour ago.” “Focus, Bill. The cake is burning.” “The cake is already gone. We’re late.” “Then we go in hot. Grab the loot, get out.” “Leave it. It’s a trap.” “For the parents. They know we’re coming.” Bill froze. “How do you know?” Molly smiled. “Because I sent the invitation.” Bill blinked. “You invited us to our own heist?” “It’s called research.” “That’s insane.” “It’s genius.” “I hate this job.” “You love the money.” “I love the money.” “Then move.” They crept toward the house. The grass was wet. They moved silently.
created_at: "2026-02-25T22:09:46.574Z" (2/25/2026, 10:09:46 PM)
model: "qwen/qwen3.5-122b-a10b-20260224"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 445
moderation_latency: (null)
generation_time: 60698
tokens_prompt: 77
tokens_completion: 10756
native_tokens_prompt: 76
native_tokens_completion: 10105
native_tokens_completion_images: (null)
native_tokens_reasoning: 9348
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0323664
router: (null)
provider_responses:
  0:
    endpoint_id: "33d466d1-d2f7-45ed-8ff3-fda7d2e32351"
    id: "chatcmpl-d9f6621f-500c-97b6-9c63-92ab0753d271"
    is_byok: false
    latency: 445
    model_permaslug: "qwen/qwen3.5-122b-a10b-20260224"
    provider_name: "Alibaba"
    status: 200
api_type: "completions"
id: "gen-1772057386-AwKYMBFysoU7becFMhnX"
upstream_id: "chatcmpl-d9f6621f-500c-97b6-9c63-92ab0753d271"
total_cost: 0.0323664
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
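The token and timing fields above allow a few derived figures. The following is a minimal sketch, not part of the benchmark itself. It assumes that native_tokens_reasoning is counted inside native_tokens_completion and that generation_time is in milliseconds (which is consistent with the reported duration of 1m 1s); the page states neither explicitly. The function name token_breakdown is hypothetical.

```python
# Values copied from the run metadata on this page.
meta = {
    "native_tokens_completion": 10105,
    "native_tokens_reasoning": 9348,
    "generation_time": 60698,  # assumed to be milliseconds
}


def token_breakdown(meta):
    """Derive visible-token count, reasoning share, and throughput.

    Assumes reasoning tokens are a subset of completion tokens.
    """
    completion = meta["native_tokens_completion"]
    reasoning = meta["native_tokens_reasoning"]
    seconds = meta["generation_time"] / 1000
    return {
        # Tokens left for the visible answer once reasoning is subtracted
        "visible_tokens": completion - reasoning,
        # Fraction of completion tokens spent on reasoning
        "reasoning_share": reasoning / completion,
        # Overall generation throughput, reasoning included
        "tokens_per_second": completion / seconds,
    }


breakdown = token_breakdown(meta)
print(breakdown)
```

Under these assumptions, most of the completion budget went to reasoning rather than to the roughly 500-word scene.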
Evaluation details
Result   Evaluator                     Details                            Meta Data
1.04%    Matches word count            n/a                                neededClean: false; words: 474
0.00%    Dialogue to Total Word Ratio  Ratio: 65.76%, Deviation: 15.76%   neededClean: false; wordsTotal: 479; wordsDialogue: 315
0.5182%  Overall
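The ratio and deviation reported for the dialogue evaluator can be reproduced from its metadata. This is a minimal sketch assuming the ratio is wordsDialogue divided by wordsTotal and the deviation is the absolute distance from the 50% target in the scenario; the function names are hypothetical. The Overall figure is consistent with an unweighted mean of the two evaluator scores, but the page does not state the aggregation rule, so that part is an assumption as well.

```python
def ratio_and_deviation(words_dialogue, words_total, target=0.5):
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    deviation = abs(ratio - target)
    return ratio, deviation


def unweighted_mean(scores):
    """Assumed aggregation rule for the Overall score (not confirmed by the page)."""
    return sum(scores) / len(scores)


# wordsDialogue and wordsTotal as reported in the evaluation details
ratio, deviation = ratio_and_deviation(315, 479)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Rounded evaluator scores in percent, as displayed on the page
overall = unweighted_mean([1.04, 0.00])
print(f"Overall (from rounded scores): {overall:.2f}%")
```

The recomputed ratio and deviation match the displayed 65.76% and 15.76%. How the evaluator maps a 15.76% deviation to a score of 0.00% is not shown on the page and is left out of the sketch.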