NC Bench
Run: 2026-02-25-01KJBKWZ75YT0X2RGFB2J04PCV
Run Details
Model: Qwen 3.5 122B
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 1m 56s
Cost: $0.036635
Provider: Alibaba
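The Duration above appears to match the generation_time value reported in the response metadata (116109), assuming that field is in milliseconds; the page does not state the unit. A minimal Python sketch of that conversion follows, together with a rough throughput figure derived from native_tokens_completion (11439). The helper name format_duration is hypothetical and not part of NC Bench.

```python
def format_duration(ms: int) -> str:
    """Render a millisecond count as 'Xm Ys', truncating fractional seconds."""
    total_seconds = ms // 1000
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


# Values copied from the response metadata; the millisecond unit is an assumption.
GENERATION_TIME_MS = 116109
NATIVE_TOKENS_COMPLETION = 11439

duration = format_duration(GENERATION_TIME_MS)
tokens_per_second = NATIVE_TOKENS_COMPLETION / (GENERATION_TIME_MS / 1000)

print(duration)                     # 1m 56s
print(round(tokens_per_second, 1))  # 98.5
```

Most of the completion tokens are reasoning tokens (native_tokens_reasoning is 11105), so the throughput figure describes total generation rather than the visible 197-word scene.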
Evaluation summary
Score     Evaluator
99.92%    Matches word count
99.99%    Dialogue to Total Word Ratio
99.9567%  Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"The piñata is the distraction," Bill whispered, adjusting his tie nervously. "And the presents?" Molly asked, checking her expensive watch. "Bag them while they cry," Bill grinned, tapping the folded map. "Is that ethical for us?" "It's a heist, not a seminar." "What about the chocolate cake?" "Leave the cake. Sugar crashes are dangerous." "I need a Lego set." "You're thirty years old." "I know. The box says eight." "Just grab the Nintendo Switch." "What if parents see us?" "We wear clown masks." "Too obvious for this neighborhood." "We wear party hats." "Better. Do we have a getaway car?" "The bumper car." "That doesn't move fast." "It has momentum." "Fine. But I want the doll." "Take the doll. Just don't wake the baby." "The baby is asleep?" "He's a toddler. They nap like monks." "Okay. Meet at the bounce house." "Bring the sacks." "I brought the sacks." "Good. Let's ruin childhood." "Wait, what about the balloons?" "Pop them. Make noise." "Perfect. See you there." "Don't forget the napkins." "For what?" "To wipe the tears." "Very dark." "Are you sure the clown masks work?" "Yes, they have excellent visibility." "Okay. I'm ready." "Me too." "Let's get toys." "Tonight."
created_at: "2026-02-25T23:59:40.778Z" (2/25/2026, 11:59:40 PM)
model: "qwen/qwen3.5-122b-a10b-20260224"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 12701
moderation_latency: (null)
generation_time: 116109
tokens_prompt: 77
tokens_completion: 8998
native_tokens_prompt: 76
native_tokens_completion: 11439
native_tokens_completion_images: (null)
native_tokens_reasoning: 11105
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0366352
router: (null)
provider_responses: 0
endpoint_id: "33d466d1-d2f7-45ed-8ff3-fda7d2e32351"
id: "chatcmpl-f0b3bda6-0de6-9985-80af-f450ba2054b4"
is_byok: false
latency: 12701
model_permaslug: "qwen/qwen3.5-122b-a10b-20260224"
provider_name: "Alibaba"
status: 200
api_type: "completions"
id: "gen-1772063980-glKAygh0NTMOCmnYCZsi"
upstream_id: "chatcmpl-f0b3bda6-0de6-9985-80af-f450ba2054b4"
total_cost: 0.0366352
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
Evaluation details
Result    Evaluator                     Details                          Meta Data
99.92%    Matches word count            n/a                              neededClean: false, words: 197
99.99%    Dialogue to Total Word Ratio  Ratio: 90.86%, Deviation: 0.86%  neededClean: false, wordsTotal: 197, wordsDialogue: 179
99.9567%  Overall
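The reported ratio and deviation follow from the word counts in the meta data: 179 dialogue words out of 197 total, compared against the 90% target in the scenario. The Python sketch below reproduces that arithmetic and shows one plausible way to count dialogue words, by treating everything inside straight double quotes as dialogue. The function dialogue_ratio and its tokenization (whitespace splitting) are assumptions for illustration; the page does not describe how the evaluator counts words or how a deviation maps to the 99.99% score.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for straight-quoted dialogue."""
    total_words = len(text.split())
    # Assumption: dialogue is whatever sits between pairs of straight double quotes.
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# Arithmetic on the counts reported above.
TARGET = 0.90
reported_ratio = 179 / 197
print(f"{reported_ratio:.2%}")                # 90.86%
print(f"{abs(reported_ratio - TARGET):.2%}")  # 0.86%

# Hypothetical usage on a short excerpt in the style of the response.
sample = '"Bring the sacks," Bill said. "I brought the sacks."'
print(dialogue_ratio(sample))
```

A whitespace-based count is sensitive to punctuation and quote style, so it may not reproduce the evaluator's 179 and 197 exactly on the full response.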