NC Bench
Run: 2026-02-25-01KJBDHY87WQR7F06WYCWZH1JK
Run Details
Model:
Qwen 3.5 122B
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
57.0s
Cost:
$0.030475
Provider:
Alibaba
Evaluation summary
Score      Evaluator
68.10%     Matches word count
0.00%      Dialogue to Total Word Ratio
34.0511%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
This is absolutely insane, Bill said, adjusting his tie for the third time. We are stealing from five-year-olds. They have the most concentrated value, Molly replied, checking the blueprint on her tablet. Lego sets. Limited edition dolls. The resale market is booming right now. It is morally wrong. It is logistics. You in or out? I am in. But I need a plan. Okay. Phase one: Distraction. I thought we were doing the silent entry. Parents are too vigilant. We need chaos. A fire extinguisher? Too cliché. The police love that. A piñata explosion? Better. But we need the presents before the piñata breaks. Right. Timing is everything. What about the security system? It is a baby monitor. A what? They monitor the kids via audio. We just need to mimic a crying baby. I can't do a crying baby. You're a grown man. Grow up. This is beneath me. It is beneath the law. Which is why we're doing it. What about the cake? Leave the cake. It is chocolate. Leave the cake, Bill. We are not animals. I am not saying I want the cake. I am saying the cake might be a trap. It is not a trap. It is frosting. Frosting sticks. Just grab the boxes. What if the kids wake up? They won't. They are exhausted from blowing out candles. That is a good point. I always have good points. You usually have bad ideas. This is a good idea. Look at the van. The van is full of balloons. For the cover. We look like party clowns. We are party clowns. Just with guns. We don't have guns. We have water pistols. That is better. Less prison time. Exactly. What if someone sees us? We are invisible. We are wearing neon vests. They are reflective vests. Safety first. Safety from what? From the police. We are stealing from children. We are redistributing wealth. To us. To us. Fine. Let us go. Wait. One more thing. What? Do we leave a thank you note? Absolutely not. I thought we were professionals. We are. Professionals don't leave notes for toddlers. Okay. Moving out. Wait, Bill. What now? I forgot the bag. 
You forgot the bag? It is in the trunk. Open the trunk. It is locked. I have a key. You have a key? For the trunk. Then why are we talking? I lost the key. Molly. I am kidding. It is in my pocket. You are a nightmare. I am a genius. Let us go steal some toys. Let us go. Wait, the piñata. Yes? I ate it. You ate the piñata? It was hanging there. That was the distraction. I was hungry. We are doomed. We are adaptable. Just open the trunk. Okay. What about the getaway car? Bill asked. It is a minivan. A minivan? It blends in. With balloons? Exactly. Okay. I am ready. Good. Let us move.
created_at
"2026-02-25T22:08:47.903Z"
2/25/2026, 10:08:47 PM
model
"qwen/qwen3.5-122b-a10b-20260224"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
483
moderation_latency
(null)
generation_time
56873
tokens_prompt
77
tokens_completion
8933
native_tokens_prompt
76
native_tokens_completion
9514
native_tokens_completion_images
(null)
native_tokens_reasoning
8862
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0304752
router
(null)
provider_responses
0
endpoint_id
"33d466d1-d2f7-45ed-8ff3-fda7d2e32351"
id
"chatcmpl-db34748c-bb49-9418-80f3-a3d7f126cb0f"
is_byok
false
latency
482
model_permaslug
"qwen/qwen3.5-122b-a10b-20260224"
provider_name
"Alibaba"
status
200
api_type
"completions"
id
"gen-1772057327-D9pkLAwsHucAaaelLfkq"
upstream_id
"chatcmpl-db34748c-bb49-9418-80f3-a3d7f126cb0f"
total_cost
0.0304752
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
Evaluation details
Result     Evaluator                      Details                           Meta Data
68.10%     Matches word count             n/a                               neededClean: false, words: 486
0.00%      Dialogue to Total Word Ratio   Ratio: 0.00%, Deviation: 70.00%   neededClean: false, wordsTotal: 488, wordsDialogue: 0
34.0511%   Overall
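
The 0.00% dialogue ratio is easier to read alongside a sketch of how such a ratio might be computed. The response contains no quotation marks, so an evaluator that counts only quoted words would report wordsDialogue as 0 and a 70.00% deviation from the 70% target. The Python below is a minimal sketch under that assumption; it is not NC Bench's actual evaluator. The function name `dialogue_ratio`, the quote-matching rule, and the sample strings are hypothetical. The last lines check that the overall score is consistent with an unweighted mean of the two evaluator scores, which is also an assumption inferred from the reported numbers.

```python
import re
from statistics import mean

TARGET_RATIO = 0.70  # the scenario asks for 70% dialogue


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumption: dialogue is whatever sits between double quotes
    (straight or curly). Unquoted speech counts as narration.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'["“]([^"”]*)["”]', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# Hypothetical samples: the same sentence without and with quotation marks.
unquoted = "This is insane, Bill said. We need a plan."
quoted = '"This is insane," Bill said. "We need a plan."'

unquoted_result = dialogue_ratio(unquoted)
quoted_result = dialogue_ratio(quoted)

# Deviation from the target, as reported in the Details column.
unquoted_deviation = abs(TARGET_RATIO - unquoted_result[2])

# Assumption: the overall score is the unweighted mean of the evaluator scores.
overall = mean([68.10, 0.00])

print(unquoted_result, quoted_result, unquoted_deviation, overall)
```

Under this reading, the low score reflects the missing quotation marks rather than a lack of spoken lines: the same text with quoted speech would register most of its words as dialogue.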