NC Bench
Run: 2026-02-25-01KJBKJJBGYZFWWAY9JNXTTES4
Run Details
Model: Qwen 3.5 122B
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 1m 56s
Cost: $0.034670
Provider: Alibaba
Evaluation summary
Score      Evaluator
100.00%    Matches word count
87.39%     Dialogue to Total Word Ratio
93.6948%   Overall
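The overall figure is consistent with an unweighted mean of the two evaluator scores. That is an inference from the numbers shown on this page, not a documented NC Bench formula, and the helper name `overall_score` is hypothetical. A minimal sketch under that assumption:

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores (assumed aggregation, not confirmed)."""
    return sum(scores) / len(scores)


# Scores as displayed in the evaluation summary (percent).
word_count_score = 100.00
dialogue_ratio_score = 87.39  # displayed value, presumably rounded

overall = overall_score([word_count_score, dialogue_ratio_score])
print(f"overall = {overall:.4f}%")
```

The result lands within rounding distance of the reported 93.6948%, which suggests the page averages unrounded evaluator scores before display.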
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly checked her watch. "We have exactly ten minutes before the piñata drops." Bill adjusted his mask. "Is it safe?" "Safe? It's a six-year-old's birthday." "The security system?" "Motion sensors in the playroom." "Can we bypass them?" "I brought a bag of marbles." "Marbles?" "For the sensors. They think it's a cat." "What about the parents?" "They're drunk on punch." "And the kids?" "They're too busy fighting over the cake." "So we just walk in?" "Bag everything. Leave the wrapping paper." "What about the toys?" "We take the expensive ones." "Like the Lego set?" "And the hoverboard." "Is that ethical?" "It's a heist, Bill. Ethics are for tax auditors." "What if they cry?" "Then we leave them a note." "What does it say?" "Sorry, Santa was broke." "That's dark." "It's efficient." "Let's move." "Wait, do I have the crowbar?" "No, we're using a butter knife." "Why?" "It's a party. Be polite." "And the getaway car?" "It's a tricycle." "You're kidding." "Speed is relative." "Fine. I'll drive." "Don't knock over the cake." "I promise." "Good." "Go." "Wait, what about the dog?" "It's a Chihuahua." "Can it bark?" "Only when it's hungry." "We feed it first." "That's a good plan." "Now, let's go."
created_at
"2026-02-25T23:53:59.927Z"
2/25/2026, 11:53:59 PM
model
"qwen/qwen3.5-122b-a10b-20260224"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
388
moderation_latency
(null)
generation_time
115847
tokens_prompt
77
tokens_completion
7940
native_tokens_prompt
76
native_tokens_completion
10825
native_tokens_completion_images
(null)
native_tokens_reasoning
10456
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0346704
router
(null)
provider_responses
0
endpoint_id
"33d466d1-d2f7-45ed-8ff3-fda7d2e32351"
id
"chatcmpl-578021b6-304c-9375-bced-08cae30b9b62"
is_byok
false
latency
388
model_permaslug
"qwen/qwen3.5-122b-a10b-20260224"
provider_name
"Alibaba"
status
200
api_type
"completions"
id
"gen-1772063639-LEIY3lpSwZdWXx72qn2z"
upstream_id
"chatcmpl-578021b6-304c-9375-bced-08cae30b9b62"
total_cost
0.0346704
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
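A few of the metadata fields above can be cross-checked with simple arithmetic. The sketch below assumes `generation_time` is in milliseconds (the page does not state units) and that `native_tokens_reasoning` is a subset of `native_tokens_completion`; both are assumptions, and the variable names are illustrative only.

```python
# Values copied from the metadata above.
generation_time_ms = 115847       # assumed to be milliseconds
native_tokens_completion = 10825
native_tokens_reasoning = 10456

# Convert generation time to minutes and seconds, rounding to the nearest second.
total_seconds = round(generation_time_ms / 1000)
minutes, seconds = divmod(total_seconds, 60)

# Tokens left for the visible answer once reasoning tokens are subtracted
# (assumes reasoning tokens are counted inside the completion total).
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

print(f"duration: {minutes}m {seconds}s")
print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
```

Under these assumptions the duration matches the 1m 56s shown in the run details, and the large majority of completion tokens were spent on reasoning rather than on the roughly 200-word scene.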
Evaluation details
Result     Evaluator                      Details                            Meta Data
100.00%    Matches word count             n/a                                neededClean: false, words: 201
87.39%     Dialogue to Total Word Ratio   Ratio: 96.06%, Deviation: 6.06%    neededClean: false, wordsTotal: 203, wordsDialogue: 195
93.6948%   Overall
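The reported ratio and deviation follow directly from the word counts in the metadata: 195 dialogue words out of 203 total, measured against the 90% target from the prompt. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words, by treating everything inside straight double quotes as dialogue. The function `dialogue_ratio` and its tokenization are assumptions, not the evaluator's actual implementation, and the sketch does not attempt to reproduce how the 87.39% score is derived from the deviation.

```python
import re


def dialogue_ratio(text):
    """Return (dialogue_words, total_words, ratio) for a passage.

    Assumption: dialogue is any text between straight double quotes,
    and words are whitespace-separated tokens.
    """
    total = len(text.split())
    quoted = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(q.split()) for q in quoted)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


# Arithmetic check against the counts reported in the metadata.
target = 0.90
words_dialogue, words_total = 195, 203
ratio_pct = round(words_dialogue / words_total * 100, 2)
deviation_pct = round(abs(words_dialogue / words_total - target) * 100, 2)
print(f"Ratio: {ratio_pct}%, Deviation: {deviation_pct}%")

# Small usage example on a made-up sentence.
sample = 'Molly checked her watch. "We have exactly ten minutes." Bill nodded.'
print(dialogue_ratio(sample))
```

Note that the two evaluators report different totals (201 versus 203 words), so they likely tokenize differently; a whitespace split like the one here may not match either count exactly.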