NC Bench
Run: 2026-02-25-01KJBF1HHG9GGG954WNA7EC68Z
Run Details
Model: Qwen 3.5 122B
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 58.6s
Cost: $0.032338
Provider: Alibaba
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.00%
Overall: 0.0000%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprints across the hood of the rusted van. "So, the piñata is the distraction?" she asked, tracing a red circle on the paper with a gloved finger. Bill adjusted his gloves, checking the tension. "Exactly. We burst it, candy flies, chaos ensues. Parents will be too busy shooing toddlers to notice the theft." "And the presents?" "Siphon them into the duffel bags while the parents scream." "It feels wrong, Bill." "It's a heist, Molly. Feelings are for the mark." "They're seven-year-olds, Bill. They're vulnerable." "They'll have new ones by Christmas." "It's July." "Semantics." "What about the security?" "A mom with a iPad and a dad holding a beer. That's it. No alarms." "Are we sure about the laser grid?" "It's a bouncy castle. No lasers. Just plastic walls." "Right. The bouncy castle." "We need to time the cake cutting." "Why?" "Sugar rush. They'll be running around like maniacs. Perfect cover." "I brought the crowbar." "For what? The gift wrap?" "The door lock." "It's a sliding glass door." "Right. The crowbar is useless." "I like the crowbar." "Put it away." "What about the masks?" "We can't wear masks at a kids' party. They'll panic." "We're wearing clown noses." "That's better." "Does this mean I'm a clown?" "You're a clown thief." "I hate clowns." "Then wear a mask." "The kids will cry." "If we don't steal the presents, they'll cry anyway." "That's a terrible argument." "It's logic." "Okay, the plan is solid." "Except the part about the piñata." "Why?" "What if the parents hit us?" "With a stick?" "With a beer bottle." "We wear helmets." "Under the clown noses?" "Exactly." "This is insane." "It's a Tuesday." "Right. Tuesday." "Let's go." "Wait, do we have the duffel bags?" "In the trunk." "Good." "Bill?" "Yeah?" "If we get caught, we say we're the entertainment." "The entertainment steals the loot?" "Magicians steal things." "We're not magicians." "We're clowns." "Okay. Clowns it is." "Wait, what about the balloons?" "Pop them?" 
"No, use them." "How?" "Carry the loot." "Brilliant." "Let's move."
created_at: "2026-02-25T22:34:47.734Z" (2/25/2026, 10:34:47 PM)
model: "qwen/qwen3.5-122b-a10b-20260224"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 430
moderation_latency: (null)
generation_time: 58462
tokens_prompt: 77
tokens_completion: 9927
native_tokens_prompt: 76
native_tokens_completion: 10096
native_tokens_completion_images: (null)
native_tokens_reasoning: 9522
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0323376
router: (null)
provider_responses:
  0:
    endpoint_id: "33d466d1-d2f7-45ed-8ff3-fda7d2e32351"
    id: "chatcmpl-177b00df-4df5-9a6b-a6a7-fc4a3515e366"
    is_byok: false
    latency: 430
    model_permaslug: "qwen/qwen3.5-122b-a10b-20260224"
    provider_name: "Alibaba"
    status: 200
api_type: "completions"
id: "gen-1772058887-mstUVe9RSHGk5loPq7Lg"
upstream_id: "chatcmpl-177b00df-4df5-9a6b-a6a7-fc4a3515e366"
total_cost: 0.0323376
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean = false, words = 337

Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 90.86%, Deviation: 20.86%
Meta Data: neededClean = false, wordsTotal = 339, wordsDialogue = 308

Overall: 0.0000%
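The reported ratio and deviation follow directly from the word counts in the evaluation details: 308 dialogue words out of 339 total against a 70% target. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. It is an illustration only, not the benchmark's evaluator: the function names `dialogue_ratio` and `deviation_from_target`, the quote-based definition of dialogue, and the whitespace tokenization are all assumptions, and the page does not show how a deviation is mapped to the 0.00% score.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a passage.

    Assumption: dialogue is any text between straight or curly double
    quotes, and words are whitespace-separated tokens. The benchmark's
    real rules may differ.
    """
    quoted_spans = re.findall(r'["“]([^"”]*)["”]', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    total_words = len(text.split())
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio: float, target: float = 0.70) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Counts taken from the evaluation details for this run.
reported_ratio = 308 / 339
print(f"Ratio: {reported_ratio:.2%}")                             # Ratio: 90.86%
print(f"Deviation: {deviation_from_target(reported_ratio):.2%}")  # Deviation: 20.86%
```

The two evaluators report slightly different totals for the same response (337 and 339 words), which suggests they tokenize differently, so a sketch like this will not necessarily reproduce either count exactly when run on the model response.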