NC Bench
Run: 2026-02-25-01KJBDEMJGB8RRB9DN5YP222NV
Run Details
Model: Qwen 3.5 Flash
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 2m 29s
Cost: $0.007612
Provider: Alibaba
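The Duration above can be cross-checked against the `generation_time` and token fields listed in the response metadata further down this page. The sketch below assumes `generation_time` is in milliseconds and that `native_tokens_reasoning` is counted inside `native_tokens_completion`; both readings are assumptions inferred from the numbers, and the helper names are hypothetical.

```python
def format_duration(ms):
    """Format a millisecond count as 'Xm Ys', truncating fractional seconds."""
    minutes, seconds = divmod(ms // 1000, 60)
    return f"{minutes}m {seconds}s"


def reasoning_share(completion_tokens, reasoning_tokens):
    """Fraction of completion tokens spent on reasoning (assumed to be a subset)."""
    return reasoning_tokens / completion_tokens


# Values copied from the response metadata on this page.
generation_time_ms = 149118
native_completion = 19010
native_reasoning = 18760

print(format_duration(generation_time_ms))   # 2m 29s
print(native_completion - native_reasoning)  # 250 tokens left for the visible scene
print(f"{reasoning_share(native_completion, native_reasoning):.1%}")  # 98.7%
```

Under these assumptions, nearly all of the completion budget went to reasoning rather than to the 185-word scene itself.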
Evaluation summary
Score      Evaluator
60.28%     Matches word count
0.00%      Dialogue to Total Word Ratio
30.1376%   Overall
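The Overall figure is consistent with an unweighted mean of the two evaluator scores: (60.28% + 0.00%) / 2 is about 30.14%. The sketch below checks that arithmetic. The averaging rule is an assumption inferred from the numbers shown here, not something the page states, and `overall_score` is a hypothetical helper name.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores (assumed aggregation rule)."""
    return sum(scores) / len(scores)


# Scores as displayed in the summary (percentages, rounded by the page).
displayed = [60.28, 0.00]
print(round(overall_score(displayed), 2))  # 30.14
```

If this reading is right, the reported 30.1376% implies an unrounded word-count score of 60.2752%, which the page displays as 60.28%.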
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her bright neon wig behind the hydrangeas, breathing hard. She scanned the windows for the stream glow. Bill flipped his silver clipboard, hair glittering. Are you really sure about this? It feels like my belt, she whispered quietly. He pointed toward the lawn where a lawn gnome stood. The security system is a motion sensor that smells like baby wipes. said Bill. Crayons are goal. Molly frowned deeply, a frown of despair. But the parents will be there. asked nervously. Parents are predictable. They are drinking punch and ignoring the noise. We need gifts under tree before wake up. The big ones go in backpack. explained calmly. What if the dog barks? I fed it sedative biscuits. Thinks we really are delivery man. Molly squinted through the leaves at the colorful lump near the door. Is that a piñata? asked again. Distraction. Smash, chaos, we slip in. And after? We will vanish like ninjas, covered in glitter, added. Bill cracked his knuckles, echoing softly in the quiet yard. Ready? said. Together they crept forward toward the front door. The moon was very bright.
created_at
"2026-02-25T22:06:59.669Z"
2/25/2026, 10:06:59 PM
model
"qwen/qwen3.5-flash-20260224"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
478
moderation_latency
(null)
generation_time
149118
tokens_prompt
77
tokens_completion
14623
native_tokens_prompt
76
native_tokens_completion
19010
native_tokens_completion_images
(null)
native_tokens_reasoning
18760
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0076116
router
(null)
provider_responses
0
endpoint_id
"5f6b71b5-8672-4b27-8b2d-f674429f9e63"
id
"chatcmpl-8a45fa17-7466-9e96-a27b-3fe271b893aa"
is_byok
false
latency
478
model_permaslug
"qwen/qwen3.5-flash-20260224"
provider_name
"Alibaba"
status
200
api_type
"completions"
id
"gen-1772057219-RDZGcRfr02SI9Gmd5IYv"
upstream_id
"chatcmpl-8a45fa17-7466-9e96-a27b-3fe271b893aa"
total_cost
0.0076116
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
Evaluation details
Result     Evaluator                      Details                            Meta Data
60.28%     Matches word count             n/a                                neededClean: false, words: 185
0.00%      Dialogue to Total Word Ratio   Ratio: 0.00%, Deviation: 50.00%    neededClean: false, wordsTotal: 185, wordsDialogue: 0
30.1376%   Overall
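One way to arrive at a 0.00% dialogue ratio is to count only words that sit inside quotation marks: the model response on this page contains no quote characters, so nothing would be classified as dialogue. The sketch below illustrates that approach. The quote-matching rule, the whitespace word split, and the names `dialogue_ratio` and `deviation` are assumptions for illustration, not NC Bench's actual evaluator.

```python
import re

# Matches text between straight or curly double quotes (assumed dialogue marker).
QUOTED = re.compile(r'["“]([^"”]*)["”]')


def dialogue_ratio(text):
    """Return (dialogue_words, total_words, ratio) using whitespace-split words."""
    total = len(text.split())
    dialogue = sum(len(span.split()) for span in QUOTED.findall(text))
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation(ratio, target=0.5):
    """Absolute distance from the target dialogue share (50% in this scenario)."""
    return abs(ratio - target)


quoted = 'Molly frowned. "Are you sure?" she asked.'
unquoted = "Are you really sure about this? It feels like my belt, she whispered quietly."

print(dialogue_ratio(quoted))       # 3 of 7 words fall inside quotes
print(dialogue_ratio(unquoted))     # no quotes, so 0 dialogue words
print(deviation(dialogue_ratio(unquoted)[2]))  # 0.5, i.e. a 50% deviation
```

With a counter like this, unquoted speech scores the same as pure narration, which matches the reported Ratio of 0.00% and Deviation of 50.00%.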