NC Bench
Run: 2026-02-25-01KJBJ278CSXTVFMY0W4B2FH3Z
Run Details
Model: Qwen 3.5 122B
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 40.5s
Cost: $0.019074
Provider: Alibaba
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
0.0000% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The van smelled of stale coffee and desperation. Molly spread the blueprints across the dashboard, illuminated by a single flickering overhead light. She tapped a red circle drawn around the backyard. The engine idled with a rhythmic clatter in the background. Rain began to tap against the windshield. This is the extraction point, she said, pointing with a grease-stained finger. Bill squinted at the paper, adjusting his beanie. It's a trampoline, Mol. You want us to jump out of a trampoline? No, we jump on it. It launches us over the fence. Simple physics. And the guards? Parents. Parents? Bill leaned back, cracking his knuckles. I thought we were hitting a jewelry store. The payout is better here. High volume, low security. Everyone is distracted by cake. Bill nodded slowly, absorbing the logic. Okay. What about the loot? Everything wrapped in shiny paper goes in the duffel bags. Even the LEGO sets? Especially the LEGOs. Small, high value, easy to fence. Molly rolled her eyes. We aren't fencing LEGOs. We're taking the electronics and the gift cards. What if a kid wakes up? Then we bribe them with candy. Bill admitted he lacked candy. You should have packed candy, Bill. This is a heist, not a robbery. Bill sighed, looking out the window at the suburban street. It feels wrong, stealing from children. It's not stealing. It's redistributing wealth from the squeaky clean to the desperate. That's not how economics works. It is tonight. Molly grabbed her gear bag. Check your silencer. I don't have a silencer. You don't need one. We're not killing anyone. We're just taking the PS5s. Bill agreed about the PS5s. He patted his pocket. I have a bag of gummy bears. Perfect. Put them in the tactical vest. They climbed out of the van, moving silently across the lawn. The house was lit up with balloons. Music drifted through the open window. Molly checked her watch. The tires hissed on the wet pavement. Ready? Molly whispered. As I'll ever be. 
Remember, if the dog barks, you throw a stick. Got it. They crept toward the back door. Molly picked the lock in three seconds. They slipped inside, dodging a pile of wrapping paper. Look at that, Bill whispered, pointing at a mountain of boxes. It's a goldmine. Grab the top layer. Leave the bottom. We don't want to make it obvious. This is too easy. Don't jinx it. A small voice called out from the hallway. Who's there? Molly froze. Bill dropped a box. Run, she hissed. They bolted for the window, scrambling onto the trampoline. They bounced once, twice, and landed in the neighbor's hydrangeas. The hydrangeas were thick and thorny. We got the loot, Bill panted. We left the LEGOs, Molly groaned. Priorities, Mol. Priorities.
created_at: "2026-02-25T23:27:35.696Z" (2/25/2026, 11:27:35 PM)
model: "qwen/qwen3.5-122b-a10b-20260224"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 487
moderation_latency: (null)
generation_time: 40369
tokens_prompt: 77
tokens_completion: 5962
native_tokens_prompt: 76
native_tokens_completion: 5951
native_tokens_completion_images: (null)
native_tokens_reasoning: 5264
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0190736
router: (null)
provider_responses
0
endpoint_id: "33d466d1-d2f7-45ed-8ff3-fda7d2e32351"
id: "chatcmpl-016befee-0406-908b-8e07-cf5d8cae2987"
is_byok: false
latency: 487
model_permaslug: "qwen/qwen3.5-122b-a10b-20260224"
provider_name: "Alibaba"
status: 200
api_type: "completions"
id: "gen-1772062055-jXUcRrlY3f4iojJo7vi1"
upstream_id: "chatcmpl-016befee-0406-908b-8e07-cf5d8cae2987"
total_cost: 0.0190736
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
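The token counts in the Meta listing can be put in proportion with a little arithmetic. The sketch below is illustrative only and rests on two assumptions that the page does not state: that native_tokens_reasoning is included in native_tokens_completion, and that generation_time is reported in milliseconds (which would be consistent with the 40.5s duration). The function name summarize_usage is hypothetical.

```python
# Values copied from the Meta listing of this run.
meta = {
    "native_tokens_completion": 5951,
    "native_tokens_reasoning": 5264,
    "generation_time": 40369,  # assumed to be milliseconds
}


def summarize_usage(meta: dict) -> dict:
    """Derive a few ratios from the raw counters (hypothetical helper)."""
    completion = meta["native_tokens_completion"]
    reasoning = meta["native_tokens_reasoning"]
    seconds = meta["generation_time"] / 1000
    return {
        # Assumes reasoning tokens are a subset of completion tokens.
        "visible_tokens": completion - reasoning,
        "reasoning_share": reasoning / completion,
        "tokens_per_second": completion / seconds,
    }


summary = summarize_usage(meta)
print(summary)
```

Under those assumptions, roughly 88% of the completion tokens went to reasoning, leaving 687 tokens for the visible scene.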
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 462
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 30.00% | neededClean: false, wordsTotal: 463, wordsDialogue: 0
0.0000% | Overall | |
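A wordsDialogue value of 0 is what a quote-based counter would report for this response, since the scene contains no quotation marks. The following is a minimal sketch of such a counter, not the benchmark's actual evaluator: it assumes dialogue means text between double quotation marks (straight or curly), counts words by whitespace splitting, and treats deviation as the absolute difference from the 30% target in percentage points. The names dialogue_ratio and TARGET_RATIO are hypothetical.

```python
import re

TARGET_RATIO = 0.30  # 30% dialogue, taken from the scenario description


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a prose passage.

    Assumption: only text inside double quotation marks counts as dialogue.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'["“]([^"”]*)["”]', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# The same sentence from the response, without and with quotation marks.
unquoted = "This is the extraction point, she said, pointing with a grease-stained finger."
quoted = '"This is the extraction point," she said, pointing with a grease-stained finger.'

for sample in (unquoted, quoted):
    dialogue, total, ratio = dialogue_ratio(sample)
    deviation = abs(ratio - TARGET_RATIO)
    print(f"{dialogue}/{total} words of dialogue, deviation {deviation:.2%}")
```

With this rule, the unquoted sentence yields zero dialogue words and a deviation of 30.00%, matching the figures reported for the run, while the quoted variant of the same sentence is counted as dialogue.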