NC Bench
Run: 2026-02-25-01KJBDH277G11VYR4M15GV5CKS
Run Details
Model: Qwen 3.5 27B
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 1m 38s
Cost: $0.026190
Provider: Alibaba
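The Duration shown here can be cross-checked against the generation_time field (98267 ms) in the response metadata further down, and the token fields there indicate how much of the completion was reasoning. The sketch below does that arithmetic. It assumes, without confirmation from the page, that native_tokens_reasoning is counted inside native_tokens_completion; format_duration is a hypothetical helper, not part of NC Bench.

```python
def format_duration(ms: int) -> str:
    """Format milliseconds as 'Xm Ys', truncating fractional seconds."""
    minutes, seconds = divmod(ms // 1000, 60)
    return f"{minutes}m {seconds}s"


# Values copied from the response metadata of this run.
generation_time_ms = 98267
native_tokens_completion = 10903
native_tokens_reasoning = 10580

# Assumption: reasoning tokens are a subset of the native completion tokens.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

print(format_duration(generation_time_ms))  # 1m 38s
print(visible_tokens)                       # 323
print(f"{reasoning_share:.1%}")             # 97.0%
```

Under that assumption, roughly 97% of the billed completion tokens went to reasoning and only 323 to the 200-word scene itself.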
Evaluation summary
Score      Evaluator
100.00%    Matches word count
0.00%      Dialogue to Total Word Ratio
50.0000%   Overall
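The Overall figure is consistent with an unweighted mean of the two evaluator scores (100.00% and 0.00%). The page does not state how the aggregate is computed, so the following is only a sketch under that assumption; overall_score is a hypothetical name.

```python
def overall_score(evaluator_scores: list[float]) -> float:
    """Unweighted mean of evaluator scores, in percent (assumed aggregation)."""
    if not evaluator_scores:
        raise ValueError("at least one evaluator score is required")
    return sum(evaluator_scores) / len(evaluator_scores)


# Scores from this run: word count matched, dialogue ratio failed.
print(f"{overall_score([100.0, 0.0]):.4f}%")  # 50.0000%
```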
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The security is a balloon arch, Molly whispered, adjusting her wig. It is child's play. It's a clown, Molly. He's got a megaphone, a unicycle, and a lot of confetti. We wait for the cake cutting. What about the parents? They're distracted by the games. They're drinking punch. They're absolutely completely useless. I don't want to steal a tricycle. It is just too bulky. It's about the principle, Bill. Plastic dinosaurs, robots, and action figures. I'm thinking of the getaway. Where do we hide it? The clown car. Outside the garage. It smells like old socks and popcorn. It fits. Stack them high. What if the birthday boy wakes up? He's asleep. He had too much sugar. We need gloves. I brought napkins. For fingerprints. And cake. Napkins? Eating while robbing? You're insane. I'm efficient. Hunger kills focus. The clown is waving. He sees us. Now. Grab the toys. What about the piñata? Loot? Leave it. Too much noise. Messy. Okay. After the cake. On the whistle. After the cake. Don't trip on the streamers. Wait, is that glitter on my jacket? It's everywhere. I hate glitter. It ruins the resale value. Welcome to the heist. Sticky. Let's go. Go.
created_at: "2026-02-25T22:08:19.183Z" (2/25/2026, 10:08:19 PM)
model: "qwen/qwen3.5-27b-20260224"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 446
moderation_latency: (null)
generation_time: 98267
tokens_prompt: 77
tokens_completion: 6934
native_tokens_prompt: 76
native_tokens_completion: 10903
native_tokens_completion_images: (null)
native_tokens_reasoning: 10580
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.02619
router: (null)
provider_responses:
  0:
    endpoint_id: "f3f88554-0c22-4d57-973b-0dec948808e8"
    id: "chatcmpl-1a58d20a-ae6d-9dcf-a2cc-b503f58b8920"
    is_byok: false
    latency: 446
    model_permaslug: "qwen/qwen3.5-27b-20260224"
    provider_name: "Alibaba"
    status: 200
api_type: "completions"
id: "gen-1772057299-J287BMAoAMuvX3O8BE8r"
upstream_id: "chatcmpl-1a58d20a-ae6d-9dcf-a2cc-b503f58b8920"
total_cost: 0.02619
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
Evaluation details

Matches word count
Result: 100.00%
Details: n/a
Meta Data: neededClean = false, words = 200

Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 0.00%, Deviation: 90.00%
Meta Data: neededClean = false, wordsTotal = 200, wordsDialogue = 0

Overall: 50.0000%
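The dialogue evaluator reported wordsDialogue = 0 out of wordsTotal = 200, which fits a response that contains no quotation marks at all. The evaluator's actual implementation is not shown on this page; the sketch below assumes it counts only words enclosed in straight or curly double quotes and measures deviation from the 90% target. The function name and the sample strings are illustrative.

```python
import re

TARGET_RATIO = 0.90  # from the scenario: 90% of words should be dialogue

# Assumption: "dialogue" means text between double quotes (straight or curly).
QUOTED_SPAN = re.compile(r'["\u201c]([^"\u201c\u201d]*)["\u201d]')


def dialogue_stats(text: str) -> dict:
    """Count total words, quoted words, their ratio, and deviation from target."""
    words_total = len(text.split())
    words_dialogue = sum(len(span.split()) for span in QUOTED_SPAN.findall(text))
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(TARGET_RATIO - ratio),
    }


# Same sentences with and without quotation marks.
unquoted = "It is child's play. It's a clown, Molly."
quoted = '"It is child\'s play," Molly whispered. "It\'s a clown, Molly."'

print(dialogue_stats(unquoted))
print(dialogue_stats(quoted))
```

Under this reading, unquoted speech scores a ratio of 0 and a deviation of 0.90, matching the "Ratio: 0.00%, Deviation: 90.00%" reported for this run even though nearly every sentence in the response is spoken.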