NC Bench
Run: 2026-02-25-01KJBE24FPKW605BMGJPR78AAB
Run Details
Model: Qwen 3.5 122B
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 1m 56s
Cost: $0.053726
Provider: Alibaba
Evaluation summary
Score | Evaluator
6.09% | Matches word count
95.04% | Dialogue to Total Word Ratio
50.5666% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her black balaclava carefully. “The security is tight, Bill.” Bill checked his wristwatch nervously. “It’s a six-year-old’s birthday party.” “Exactly. He eats everything.” She pointed at the crumpled invitation. “We go in during cake.” Bill sighed heavily and looked down. “I am not stealing candy from children.” “Not candy. The Lego set. The PS5.” “It is for a kid named Timmy.” “Timmy does not need a console. He needs a nap.” “What if he cries loudly?” “Then we bribe him with gum.” Bill rubbed his chin thoughtfully. “Is gum enough for a felony?” “It is gum.” He nodded very slowly. “Okay. But if the dog bites me, I pay the vet.” “Deal. Grab the sack.” They moved quickly toward the open window. The house was dark except for the glowing Christmas tree. “Ready?” Molly whispered quietly. “Born ready.” They slipped inside silently. The smell of frosting hit them immediately. “Smells like victory,” Bill said. “And sugar,” Molly corrected. They began stacking boxes rapidly. “One more,” Bill urged. “Done.” They vanished into the very cold night.
created_at: "2026-02-25T22:17:38.554Z" (2/25/2026, 10:17:38 PM)
model: "qwen/qwen3.5-122b-a10b-20260224"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 403
moderation_latency: (null)
generation_time: 115414
tokens_prompt: 77
tokens_completion: 10771
native_tokens_prompt: 76
native_tokens_completion: 16780
native_tokens_completion_images: (null)
native_tokens_reasoning: 16516
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0537264
router: (null)
provider_responses: 0
endpoint_id: "33d466d1-d2f7-45ed-8ff3-fda7d2e32351"
id: "chatcmpl-4164f948-5800-9a8f-b473-32023267a4ef"
is_byok: false
latency: 403
model_permaslug: "qwen/qwen3.5-122b-a10b-20260224"
provider_name: "Alibaba"
status: 200
api_type: "completions"
id: "gen-1772057858-dYkELvcx0JFuW537kzoO"
upstream_id: "chatcmpl-4164f948-5800-9a8f-b473-32023267a4ef"
total_cost: 0.0537264
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
Evaluation details
Result | Evaluator | Details | Meta Data
6.09% | Matches word count | n/a | neededClean: false, words: 177
95.04% | Dialogue to Total Word Ratio | Ratio: 54.75%, Deviation: 4.75% | neededClean: false, wordsTotal: 179, wordsDialogue: 98
50.5666% | Overall
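The derived figures on this page can be re-checked from the raw values. The following is a minimal sketch, not the benchmark's own evaluator code. It assumes that the deviation is the absolute difference from the 50% target in the scenario, that `generation_time` and `latency` are in milliseconds, and that `native_tokens_reasoning` is counted inside `native_tokens_completion`. All variable names are illustrative.

```python
# Raw values copied from the evaluation details and run metadata on this page.
words_dialogue = 98
words_total = 179
target_ratio = 0.50              # from the scenario: 50% dialogue

native_tokens_completion = 16780
native_tokens_reasoning = 16516
generation_time_ms = 115414      # assumption: unit is milliseconds
latency_ms = 403                 # assumption: unit is milliseconds

# Dialogue ratio and its absolute deviation from the target (assumed definition).
ratio = words_dialogue / words_total
deviation = abs(ratio - target_ratio)

# Assumption: reasoning tokens are included in the native completion count,
# so the remainder approximates the tokens of the visible answer.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Rough wall-clock time and generation throughput under the unit assumptions.
total_seconds = (latency_ms + generation_time_ms) / 1000
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"dialogue ratio:  {ratio:.2%}")       # 54.75%
print(f"deviation:       {deviation:.2%}")   # 4.75%
print(f"visible tokens:  {visible_tokens}")  # 264
print(f"reasoning share: {reasoning_share:.1%}")
print(f"total time:      {total_seconds:.1f} s")
print(f"throughput:      {tokens_per_second:.1f} tokens/s")
```

Under these assumptions the ratio and deviation match the reported 54.75% and 4.75%, and the estimated total time of about 116 seconds agrees with the listed duration of 1m 56s. The Overall figure of 50.5666% is close to the plain mean of the two displayed scores (50.565%), which suggests, but does not confirm, an unweighted average of unrounded scores. How the 6.09% and 95.04% scores are derived from the word counts is not stated on the page, so it is not reproduced here.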