NC Bench
Run: 2026-04-27-01KQ6ZMM8ZRHHR1J860352GV6C
Run Details
Model: Qwen 3.5 Plus (2026-04-20)
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 48.5s
Cost: $0.007665
Provider: Alibaba
Evaluation summary
Matches word count: 99.92%
Dialogue to Total Word Ratio: 0.00%
Overall: 49.9595%
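The Overall figure looks like the unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so treat the following as an inference from the displayed numbers rather than the benchmark's documented method. A minimal sketch:

```python
from statistics import mean

# Evaluator scores as displayed in the summary (percent).
scores = {
    "Matches word count": 99.92,
    "Dialogue to Total Word Ratio": 0.00,
}

# Assumption: overall = unweighted mean of the evaluator scores.
overall = mean(scores.values())
print(f"{overall:.2f}%")
```

With the rounded inputs this gives roughly 49.96%. The displayed 49.9595% would be consistent with an unrounded word-count score of about 99.919%.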
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
So, the target is the Johnson estate? Molly whispered, checking her wig. Affirmative, Bill said, adjusting his fake mustache. Primary objective: all under-ten items. Secondary objective: the chocolate fountain. Bill, focus. We are here for the loot, not the lactose nightmare. What's the entry point? The backdoor is guarded by a golden retriever in a referee vest. Tactical canine. I have a hotdog squeaky toy. One-time distraction. Bold. Once past the dog, we hit the gift table. That PlayStation 5 is under a stack of board games. Please tell me it's Monopoly. Worse. Math puzzles. We burn those for the fire alarm? No, ignite the piñata. Candy and structural weakness. While toddlers scramble for sugar, we vacuum the presents. What if the piñata doesn't burn? It's paper-mache, Bill. It burns like a witch. Right. And the clown nose? Disguise adhesive. It stays until acetone. What about parents? You're the aunt from Wisconsin with fake tremors. You create chaos. And you? I'm the entertainer. Hired myself as face-painter. Clown suit under the tux. A clown suit? You look like a marshmallow with a hernia. Breathable fabric! Check your pockets. Gift-bag vacuums ready? Reverse suction? Yes. Let's steal childhoods.
created_at: "2026-04-27T08:07:07.302Z" (4/27/2026, 8:07:07 AM)
model: "qwen/qwen3.5-plus-20260420"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 733
moderation_latency: (null)
generation_time: 48306
tokens_prompt: 77
tokens_completion: 2918
native_tokens_prompt: 76
native_tokens_completion: 3181
native_tokens_completion_images: (null)
native_tokens_reasoning: 2859
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0076648
router: (null)
provider_responses:
  0:
    endpoint_id: "8b4a0fbb-6705-4f8d-a7f1-ad18ecc0edbc"
    id: "chatcmpl-51eda199-8627-95e5-bb28-bcfa4be883ed"
    is_byok: false
    latency: 733
    model_permaslug: "qwen/qwen3.5-plus-20260420"
    provider_name: "Alibaba"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777277227-MpTyx0RsqOFRbrPqGUtw"
session_id: (null)
api_type: "completions"
id: "gen-1777277227-V4rg7XRPUOEiG1HJLOmN"
upstream_id: "chatcmpl-51eda199-8627-95e5-bb28-bcfa4be883ed"
total_cost: 0.0076648
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
response_cache_source_id: (null)
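The token counts in the metadata above allow a quick breakdown of how much of the completion was reasoning versus visible output. The sketch below assumes that native_tokens_reasoning is included in native_tokens_completion and that generation_time is in milliseconds; neither is stated on the page.

```python
# Values copied from the run metadata.
native_tokens_completion = 3181
native_tokens_reasoning = 2859
generation_time_ms = 48306

# Assumption: reasoning tokens are counted inside the completion total.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Assumption: generation_time is reported in milliseconds.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(visible_tokens)                # 322
print(f"{reasoning_share:.1%}")      # 89.9%
print(f"{tokens_per_second:.1f}")    # 65.9
```

Under these assumptions, about nine in ten completion tokens went to reasoning, leaving roughly 322 tokens for the 200-word scene itself.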
Evaluation details
Matches word count: 99.92%
  Details: n/a
  neededClean: false
  words: 197
Dialogue to Total Word Ratio: 0.00%
  Details: Ratio: 0.00%, Deviation: 90.00%
  neededClean: false
  wordsTotal: 202
  wordsDialogue: 0
Overall: 49.9595%
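The model response, as recorded here, contains no quotation marks. If the evaluator counts only words inside quotes as dialogue, that would explain wordsDialogue of 0 and a deviation of 90.00% from the 90% target. The sketch below illustrates that idea. The function `dialogue_ratio`, its quote-matching rule, and the quoted sample sentence are hypothetical and are not the benchmark's actual evaluator.

```python
import re

def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a passage.

    Assumption: dialogue is whatever sits between double quotation
    marks (straight or curly); the real evaluator may differ.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'["“]([^"”]*)["”]', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio

TARGET = 0.90  # the scenario asks for 90% dialogue

# First sentence of the response as recorded (no quotation marks).
unquoted = "So, the target is the Johnson estate? Molly whispered, checking her wig."
# Hypothetical variant with quotation marks added, for comparison only.
quoted = '"So, the target is the Johnson estate?" Molly whispered, checking her wig.'

for sample in (unquoted, quoted):
    words_dialogue, words_total, ratio = dialogue_ratio(sample)
    deviation = abs(ratio - TARGET)
    print(words_dialogue, words_total, f"{ratio:.2%}", f"{deviation:.2%}")
```

The unquoted sentence yields zero dialogue words, so its deviation equals the full 90% target, matching the pattern in the evaluator details. The two evaluators also report different word totals (197 and 202), which suggests they tokenize differently; this sketch does not try to reproduce either count.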