NC Bench
Run: 2026-02-25-01KJBGX416HWHGN8FWRG6SXP49
Run Details
Model: Qwen 3.5 122B
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 2m 14s
Cost: $0.058981
Provider: Alibaba
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 85.50%
Overall: 42.7493%
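The page does not say how the overall score is combined from the two evaluator scores. As a quick consistency check, the sketch below assumes an unweighted mean (an assumption, not something the report confirms) and compares it with the reported overall value. The displayed evaluator scores are rounded to two decimals, so a small difference is expected.

```python
# Evaluator scores as displayed in the summary (percent, rounded on the page).
scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 85.50,
}
reported_overall = 42.7493

# Assumption: overall = unweighted mean of the evaluator scores.
mean_score = sum(scores.values()) / len(scores)
difference = abs(mean_score - reported_overall)

print(f"mean={mean_score:.4f} reported={reported_overall} diff={difference:.4f}")
```

The mean of the displayed scores is 42.75, which agrees with the reported 42.7493% to within the rounding of the displayed 85.50%.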
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her heavy black mask while Bill checked his lockpicks under the flickering streetlight. Bill grinned wickedly. Molly sighed heavily. She felt like absolute monsters today. They were professionals. The kids got toys, they got cash. There were no cash registers at a birthday party. He pointed at the tree. Under the tree. Wrapped gifts. High value. Molly laughed softly. Bill moved toward the window. Molly followed. They slipped into the yard. Molly kicked the ball. They crept toward the porch. “That’s sugar, not a vault,” she whispered nervously. “Exactly. Sugar rush means distraction.” “I feel like monsters.” “We’re professionals.” “No cash registers.” “What if they catch us?” “Then we tell them we’re the clowns.” “You look terrible in red.” “It’s a disguise.” “I’m in.” “Deal.” “Ready?” “Born ready.” “Wait, is that a dog?” “Ignore it.” “It’s barking.” “Shh.” “Okay, moving.” “This feels wrong.” “Money feels good.” “True. Let’s go.”
created_at: "2026-02-25T23:07:19.979Z" (2/25/2026, 11:07:19 PM)
model: "qwen/qwen3.5-122b-a10b-20260224"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 458
moderation_latency: (null)
generation_time: 134008
tokens_prompt: 77
tokens_completion: 17030
native_tokens_prompt: 76
native_tokens_completion: 18422
native_tokens_completion_images: (null)
native_tokens_reasoning: 18182
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0589808
router: (null)
provider_responses:
  0:
    endpoint_id: "33d466d1-d2f7-45ed-8ff3-fda7d2e32351"
    id: "chatcmpl-29528ec1-2e04-9745-a681-b4bdf1da9625"
    is_byok: false
    latency: 458
    model_permaslug: "qwen/qwen3.5-122b-a10b-20260224"
    provider_name: "Alibaba"
    status: 200
api_type: "completions"
id: "gen-1772060839-5wbR8ai1WNKTeTMovDHJ"
upstream_id: "chatcmpl-29528ec1-2e04-9745-a681-b4bdf1da9625"
total_cost: 0.0589808
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
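The token and timing fields above allow a few derived figures. The sketch below assumes that generation_time is in milliseconds (134008 ms is about 134 s, which matches the reported duration of 2m 14s) and that native_tokens_reasoning is included in native_tokens_completion. Neither assumption is stated on the page, and the variable names are illustrative.

```python
# Values copied from the run metadata.
generation_time_ms = 134008        # assumed to be milliseconds
native_tokens_completion = 18422
native_tokens_reasoning = 18182    # assumed to be a subset of completion tokens

generation_seconds = generation_time_ms / 1000
tokens_per_second = native_tokens_completion / generation_seconds
reasoning_share = native_tokens_reasoning / native_tokens_completion
visible_tokens = native_tokens_completion - native_tokens_reasoning

print(f"generation time: {generation_seconds:.1f} s")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"non-reasoning tokens: {visible_tokens}")
```

Under these assumptions, nearly all completion tokens were spent on reasoning, leaving 240 tokens for the visible scene.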
Evaluation details
Matches word count: 0.00%
  Details: n/a
  neededClean: false
  words: 151
Dialogue to Total Word Ratio: 85.50%
  Details: Ratio: 43.71%, Deviation: 6.29%
  neededClean: false
  wordsTotal: 151
  wordsDialogue: 66
Overall: 42.7493%
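The ratio and deviation in the details above follow directly from the reported word counts. The sketch below reproduces that arithmetic and adds a hypothetical counter, dialogue_stats, that treats text inside curly quotes as dialogue and splits words on whitespace. The evaluator's actual counting rules, and the way it maps a 6.29% deviation to an 85.50% score, are not shown on the page, so the counter is an illustration only.

```python
import re

def dialogue_stats(text: str) -> tuple[int, int]:
    """Return (total_words, dialogue_words) for a passage.

    Hypothetical rule: dialogue is any text between curly quotes,
    and words are whitespace-separated tokens.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r"“([^”]*)”", text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    return total_words, dialogue_words

# Short excerpt from the model response, used only to exercise the counter.
excerpt = "Molly followed. “I’m in.” “Deal.”"
excerpt_counts = dialogue_stats(excerpt)

# Counts reported by the evaluator for the full response.
words_total = 151
words_dialogue = 66
target_words = 200
target_ratio = 0.50

ratio = words_dialogue / words_total
deviation = abs(target_ratio - ratio)
shortfall = target_words - words_total

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
print(f"Word count shortfall: {shortfall} of {target_words}")
```

With the reported counts this gives a ratio of 43.71% and a deviation of 6.29%, matching the details line. The response is also 49 words short of the 200-word target, which fits the 0.00% word-count score, although the tolerance that evaluator applies is not stated.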