NC Bench
Run: 2026-04-27-01KQ6YZXMS24JTT0QFN7YZAJM3
Run Details
Model: Qwen 3.5 Plus (2026-04-20)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 3m 29s
Cost: $0.029046
Provider: Alibaba
Evaluation summary
Score | Evaluator | Details Link
100.00% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
50.0000% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The warehouse smelled of damp cardboard and expired ambition. Bill traced the blueprints with a grease-stained finger while Molly adjusted her fake mustache in a cracked mirror. They were professionals, technically. Just not professionals who usually targeted seven-year-old birthday bashes. The plan was already sketched in red marker across the peeling wall. It involved glitter bomb diversions, a disguised cake, and exactly one escape bicycle. Bill checked his watch and sighed. The timeline was brutal. The warehouse floor was cold and gritty. They packed their gear in silence. You really think the parents will buy the clown act, Bill asked. The balloon arch is going to collapse within minutes. We need something more convincing. I spent three weeks practicing balloon animals with actual latex, Molly replied. Children have zero discernment. They will follow any adult holding a squeaky hammer straight into the gift zone. We slip through the side gate during the cake speech, Molly explained. I will create a distraction near the jump house while you haul the loot toward the utility truck. Move fast and stop admiring the Lego sets. What if the nanny catches us, Bill pressed. She looked like a retired marathon runner in those promotional photos. I do not want another wrestling match in a floral dress. We hit the tree right after the games. I will handle the heavy boxes while you watch the windows. Move quietly and never drop the ribbon rolls. Bill shook his head and rubbed his temples. The logistics were already giving him a migraine. He had mapped out the driveway, the kitchen window, and the exact trajectory of a runaway piñata. Every variable was accounted for except the inevitable sugar rush. The children would become a swarm of sticky velociraptors within twenty minutes. Molly knew this too. She had brought rubber gloves and a trash bag thick enough to contain a small elephant. They needed precise timing.
Just remember the rule about the edible centerpieces, Molly added. We only take the wrapped items. No cake knives, no favor bags, and absolutely zero sprinkles. Understood, Bill muttered, slipping on his oversized clown shoes. I will stick to the paper and cardboard. Just hand me the master key and pray the dog stays outside. The neon sign above the door flickered with a dying buzz. Bill zipped his canvas duffel and tested the zipper twice. Molly smoothed her polyester clown collar and checked her reflection in the stainless steel sink. They reviewed the exit routes one final time. The back alley was clear. Everything depended on perfect synchronization. They grabbed their gear and marched toward the exit. Dawn would bring chaos, plastic wrappers, and a completely empty gift room. The clock ticked loudly. No turning back now. Remember the timing window, Molly commanded. We have exactly four minutes after the clown leaves. I will cut the perimeter wire while you load the sled. Do not hesitate and never look back at the cake. The truck will wait near the fence. Now.
created_at
"2026-04-27T07:56:15.943Z"
4/27/2026, 7:56:15 AM
model
"qwen/qwen3.5-plus-20260420"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
717
moderation_latency
(null)
generation_time
182043
tokens_prompt
77
tokens_completion
9700
native_tokens_prompt
76
native_tokens_completion
12090
native_tokens_completion_images
(null)
native_tokens_reasoning
11478
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0290464
router
(null)
provider_responses
0
endpoint_id
"8b4a0fbb-6705-4f8d-a7f1-ad18ecc0edbc"
id
"chatcmpl-3f95bcdc-b139-982a-aab6-69eabd53bda3"
is_byok
false
latency
717
model_permaslug
"qwen/qwen3.5-plus-20260420"
provider_name
"Alibaba"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777276575-MVu5WgOPyaxww8joc42j"
session_id
(null)
api_type
"completions"
id
"gen-1777276575-NKjlCBbxHPSajJmL97nr"
upstream_id
"chatcmpl-3f95bcdc-b139-982a-aab6-69eabd53bda3"
total_cost
0.0290464
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
response_cache_source_id
(null)
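The token fields above can be turned into a rough breakdown of where the completion budget went. The following is a minimal sketch, not part of the benchmark itself. It assumes that `native_tokens_completion` includes the reasoning tokens and that `generation_time` is reported in milliseconds; neither is stated on this page.

```python
# Values copied from the Meta fields above.
native_tokens_completion = 12090
native_tokens_reasoning = 11478
generation_time_ms = 182043  # assumption: the unit is milliseconds

# Assumption: the completion count includes reasoning tokens, so the
# remainder is what was spent on the visible scene text.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Share of the completion spent on reasoning rather than visible output.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Rough throughput over the whole generation.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under those assumptions, most of the completion budget for this run went to reasoning rather than to the scene itself.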
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false, words: 500
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 50.00% | neededClean: false, wordsTotal: 503, wordsDialogue: 0
50.0000% | Overall | |
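The 0.00% dialogue ratio is easier to interpret with a concrete counting rule in mind. The sketch below is a hypothetical reconstruction, not the benchmark's actual evaluator: it assumes dialogue is detected as text enclosed in double quotation marks and that words are whitespace-separated tokens. The function name `dialogue_ratio` and the 50% target constant are illustrative.

```python
import re

TARGET_RATIO = 0.5  # from the scenario: 50% of the words should be dialogue


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a passage.

    Assumption: only text inside straight or curly double quotes counts
    as dialogue. The real evaluator may use a different rule.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'["“]([^"”]*)["”]', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


# A sentence from the response as written, without quotation marks.
unquoted = "You really think the parents will buy the clown act, Bill asked."
# The same sentence with the spoken part quoted.
quoted = '"You really think the parents will buy the clown act," Bill asked.'

for sample in (unquoted, quoted):
    words_dialogue, words_total, ratio = dialogue_ratio(sample)
    deviation = abs(TARGET_RATIO - ratio)
    print(words_dialogue, words_total, f"{ratio:.2%}", f"{deviation:.2%}")
```

Under this assumed rule, the response scores zero dialogue words because its speech carries tags such as "Bill asked" but no quotation marks. That would be consistent with the reported `wordsDialogue` of 0 and a deviation of 50.00%.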