NC Bench
Run: 2026-04-27-01KQ6ZWHHXP5ND1XYPKNG3N1BT
Run Details
Model: Qwen 3.5 Plus (2026-04-20)
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 2m 5s
Cost: $0.019799
Provider: Alibaba
Evaluation summary
| Score | Evaluator |
| 99.92% | Matches word count |
| 100.00% | Dialogue to Total Word Ratio |
| 99.9595% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Bill adjusted his glittering tiara over a balaclava and smoothed the crumpled blueprint drawn on a grape juice box with grease-stained fingers. He crouched behind the hedge, knees trembling not from fear but from low blood sugar. Molly sipped her juice box through a reinforced straw, eyes scanning the backyard like a missile guidance system. She checked the tension on her slingshot, loaded with foam rubber chickens. "The target is the Smith residence. Loot in the living room fortress. Hostiles include a piñata and a clown named Giggles." Bill snorted, juice misting his mustache. "Giggles is a wildcard. Does he possess night vision?" "He's distracted by balloon animals. The real danger is the cake. Calorie-dense traps everywhere." Molly cracked her knuckles like popping bubble wrap. "I'll neutralize the clown while you grab the loot. How much are we talking?" Bill grinned, revealing a gold teddy bear tooth. "Three hundred dollars in gift cards, a drone, and the legendary LEGO Death Star prototype." Molly nodded slowly. "Risky. But for that LEGO set, I'd steal from my own grandma. When do we strike?" Bill checked his fused digimon watch. "When the juice boxes run dry. Chaos reigns. We strike."
created_at: "2026-04-27T08:11:26.659Z" (4/27/2026, 8:11:26 AM)
model: "qwen/qwen3.5-plus-20260420"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1164
moderation_latency: (null)
generation_time: 124693
tokens_prompt: 77
tokens_completion: 7228
native_tokens_prompt: 76
native_tokens_completion: 8237
native_tokens_completion_images: (null)
native_tokens_reasoning: 7958
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0197992
router: (null)
provider_responses: 0
endpoint_id: "8b4a0fbb-6705-4f8d-a7f1-ad18ecc0edbc"
id: "chatcmpl-eeb09d1b-5287-9542-9b5d-4e89a1d505d3"
is_byok: false
latency: 1164
model_permaslug: "qwen/qwen3.5-plus-20260420"
provider_name: "Alibaba"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777277486-Gr035bWD2rmLR9JPsNu5"
session_id: (null)
api_type: "completions"
id: "gen-1777277486-Rri6knec8q6NW8YNjjLZ"
upstream_id: "chatcmpl-eeb09d1b-5287-9542-9b5d-4e89a1d505d3"
total_cost: 0.0197992
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
response_cache_source_id: (null)
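The metadata above carries enough to work out a few derived figures by hand. The sketch below is only illustrative: the helper name `summarize_run` is hypothetical, the page does not state the units of `generation_time` (milliseconds is assumed here because it matches the reported 2m 5s duration), and the native token counts are used instead of the normalized ones.

```python
def summarize_run(meta: dict) -> dict:
    """Derive simple figures from the run metadata (hypothetical helper)."""
    # Assumption: generation_time is in milliseconds.
    seconds_total = meta["generation_time"] / 1000
    minutes, seconds = divmod(round(seconds_total), 60)

    completion = meta["native_tokens_completion"]
    reasoning = meta["native_tokens_reasoning"]

    return {
        # Completion tokens that were not spent on reasoning.
        "visible_tokens": completion - reasoning,
        # Fraction of completion tokens spent on reasoning.
        "reasoning_share": reasoning / completion,
        # Duration rendered in the same style as the run details.
        "duration": f"{minutes}m {seconds}s",
        # Overall generation throughput, reasoning included.
        "tokens_per_second": completion / seconds_total,
    }


# Values copied from the metadata listing.
summary = summarize_run(
    {
        "generation_time": 124693,
        "native_tokens_completion": 8237,
        "native_tokens_reasoning": 7958,
    }
)
print(summary)
```

Under these assumptions, 279 of the 8237 native completion tokens are non-reasoning output, so roughly 97% of the completion budget went to reasoning for a 200-word scene.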
Evaluation details
| Result | Evaluator | Details | Meta Data |
| 99.92% | Matches word count | n/a | neededClean: false; words: 197 |
| 100.00% | Dialogue to Total Word Ratio | Ratio: 50.25%, Deviation: 0.25% | neededClean: false; wordsTotal: 199; wordsDialogue: 100 |
| 99.9595% | Overall | | |
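The reported ratio and deviation follow directly from the word counts: 100 dialogue words out of 199 total is about 50.25%, which is 0.25 points from the 50% target. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. The counting rules are assumptions, not the benchmark's actual evaluator: words are split on whitespace and dialogue is anything between straight double quotes. The two evaluators themselves report different totals for the same response (197 and 199), so their tokenization evidently differs and this sketch may not match either exactly. The function names are hypothetical.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed tokenization)."""
    return len(text.split())


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue words, total words, ratio).

    Assumption: dialogue is the text enclosed in straight double quotes.
    """
    quoted = re.findall(r'"([^"]*)"', text)
    dialogue = sum(count_words(q) for q in quoted)
    total = count_words(text)
    return dialogue, total, dialogue / total


# Small illustrative sample, not taken from the run.
sample = 'Bill nodded. "We go at dawn." Molly sighed. "Fine."'
print(dialogue_ratio(sample))

# Arithmetic on the counts reported in the evaluation details.
reported_ratio_pct = round(100 / 199 * 100, 2)
reported_deviation_pct = round(abs(100 / 199 * 100 - 50), 2)
print(reported_ratio_pct, reported_deviation_pct)
```

The overall 99.9595% is close to the unweighted mean of the two displayed scores (99.96%), which suggests, but does not confirm, that the overall figure averages the unrounded evaluator scores.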