NC Bench
Run: 2026-04-27-01KQ6ZXT104T5XCJ8Z0B5Y93FE
Run Details
Model: Qwen 3.5 Plus (2026-04-20)
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 2m 42s
Cost: $0.025761
Provider: Alibaba
Evaluation summary
Score | Evaluator
100.00% | Matches word count
36.79% | Dialogue to Total Word Ratio
68.3935% | Overall
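The Overall figure appears consistent with an unweighted mean of the two evaluator scores. The following is a minimal sketch under that assumption; the benchmark's actual aggregation is not shown on this page, and `overall_score` is a hypothetical helper name.

```python
def overall_score(scores):
    """Return the unweighted mean of a list of percentage scores.

    Assumption: every evaluator carries equal weight. This page does not
    document how NC Bench actually aggregates evaluator results.
    """
    if not scores:
        raise ValueError("at least one score is required")
    return sum(scores) / len(scores)


# Scores as displayed in the evaluation summary (rounded to two decimals).
overall = overall_score([100.00, 36.79])
print(round(overall, 3))  # approximately 68.395
```

The displayed 68.3935% is slightly below 68.395%, which would be expected if the 36.79% shown for the ratio evaluator is itself a rounded value.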
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The blueprints of the suburban backyard sprawled across the wobbly folding table, weighed down by stale donuts and a stolen walkie talkie. Molly traced the perimeter fence with a paint-stained finger, noting every sprinkler head and camera blind spot. Bill adjusted his oversized party hat, trying to look professional while failing badly. The target was simple in theory and absurd in practice. A birthday celebration featuring three dozen screaming children and forty wrapped boxes of plastic treasure. They needed a distraction that would clear the patio without triggering every anxious parent inside. Molly drafted a smoke canister disguised as a clown nose. Bill suggested singing showtunes off key. Neither plan survived contact with reality. The sun dipped below the oak trees, casting long shadows over the inflatable bouncy castle. Every minute closer to dusk meant tighter security and hungry balloons. They checked their gloves, their heavy harnesses, and their questionable moral compass. The clock on the dashboard ticked toward midnight. Cut the power at the fountain. Molly whispered. You grab the sled while I crawl through the gift pile right now. Bill agreed. The engine coughed once before dying completely. Their heist had finally begun tonight. Leaving no trace behind them.
created_at: "2026-04-27T08:12:08.104Z" (4/27/2026, 8:12:08 AM)
model: "qwen/qwen3.5-plus-20260420"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 700
moderation_latency: (null)
generation_time: 161510
tokens_prompt: 77
tokens_completion: 8954
native_tokens_prompt: 76
native_tokens_completion: 10721
native_tokens_completion_images: (null)
native_tokens_reasoning: 10471
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0257608
router: (null)
provider_responses: 0
endpoint_id: "8b4a0fbb-6705-4f8d-a7f1-ad18ecc0edbc"
id: "chatcmpl-cfddd6d4-6802-945e-b7ac-c1c36407d36a"
is_byok: false
latency: 700
model_permaslug: "qwen/qwen3.5-plus-20260420"
provider_name: "Alibaba"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777277528-dQltTCre7IoaTbNWR4SQ"
session_id: (null)
api_type: "completions"
id: "gen-1777277528-YAUMMW4D88vRneytE20E"
upstream_id: "chatcmpl-cfddd6d4-6802-945e-b7ac-c1c36407d36a"
total_cost: 0.0257608
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
response_cache_source_id: (null)
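A few derived figures help put the metadata in context. The sketch below computes them from the values recorded above. It assumes that generation_time is in milliseconds and that native_tokens_reasoning is a subset of native_tokens_completion; neither is stated on this page. `derived_metrics` is a hypothetical helper name.

```python
def derived_metrics(generation_time_ms, completion_tokens, reasoning_tokens):
    """Derive throughput and reasoning share from run metadata.

    Assumptions (not confirmed by the source page):
    - generation_time_ms is expressed in milliseconds;
    - reasoning_tokens are already included in completion_tokens.
    """
    seconds = generation_time_ms / 1000
    return {
        "generation_seconds": seconds,
        "tokens_per_second": completion_tokens / seconds,
        "reasoning_share": reasoning_tokens / completion_tokens,
        "visible_tokens": completion_tokens - reasoning_tokens,
    }


# Values copied from the metadata of this run.
metrics = derived_metrics(
    generation_time_ms=161510,
    completion_tokens=10721,   # native_tokens_completion
    reasoning_tokens=10471,    # native_tokens_reasoning
)
for name, value in metrics.items():
    print(f"{name}: {value:.4g}")
```

Under these assumptions, roughly 98% of the completion tokens were spent on reasoning, leaving about 250 tokens for the visible scene of about 200 words.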
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false, words: 201
36.79% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 10.00% | neededClean: false, wordsTotal: 202, wordsDialogue: 0
68.3935% | Overall
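The model response contains no quotation marks, which is consistent with a wordsDialogue count of 0 if the evaluator only counts words inside quoted spans. The sketch below illustrates that idea. It is an assumption about how such a check could work, not the benchmark's actual evaluator; `dialogue_stats` and its tokenization (whitespace splitting) are hypothetical, and the real evaluator's word counts may differ.

```python
import re

# Matches text between straight or curly double quotes.
QUOTED_SPAN = re.compile(r'["\u201c]([^"\u201d]*)["\u201d]')


def dialogue_stats(text, target_ratio=0.10):
    """Measure how much of a text sits inside double-quoted dialogue.

    Assumption: dialogue is whatever appears between double quotes, and
    words are whitespace-separated tokens.
    """
    words_total = len(text.split())
    words_dialogue = sum(len(span.split()) for span in QUOTED_SPAN.findall(text))
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "words_total": words_total,
        "words_dialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


# Unquoted speech, as in the model response, counts as zero dialogue words.
print(dialogue_stats("Cut the power at the fountain. Molly whispered."))

# The same line with quotation marks is detected as dialogue.
print(dialogue_stats('"Cut the power at the fountain," Molly whispered.'))
```

With a 10% target, a ratio of 0.00% gives a deviation of 10.00%, matching the Details column above. How that deviation maps to the 36.79% score is not documented on this page.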