NC Bench
Run: 2026-02-16-01KHK2YX5PAH80PD37S26HTA2C
Run Details
Model: Qwen 3.5 397B A17B
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 3m 27s
Cost: $0.028133
Provider: Alibaba
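The duration and cost above can be related to the token counts recorded in the response metadata for this run (generation_time, native_tokens_completion, native_tokens_reasoning). The following is a minimal sketch of that arithmetic, not part of the benchmark itself. It assumes generation_time is in milliseconds, which is consistent with the 3m 27s duration but is not stated on the page, and that the reasoning tokens are counted inside the native completion total. The function name throughput_summary is hypothetical.

```python
def throughput_summary(completion_tokens: int, reasoning_tokens: int, generation_ms: int) -> dict:
    """Derive simple rates from the run metadata.

    Assumptions (not confirmed by the source): generation_ms is in
    milliseconds, and reasoning_tokens is a subset of completion_tokens.
    """
    seconds = generation_ms / 1000.0
    return {
        # Completion tokens produced per second of generation time.
        "tokens_per_second": completion_tokens / seconds,
        # Fraction of completion tokens spent on reasoning.
        "reasoning_share": reasoning_tokens / completion_tokens,
        # Tokens left over for the visible answer.
        "visible_tokens": completion_tokens - reasoning_tokens,
    }


# Values copied from this run's metadata fields:
# native_tokens_completion, native_tokens_reasoning, generation_time.
summary = throughput_summary(7802, 7540, 206732)
print(summary)
```

Under these assumptions, roughly 97% of the completion tokens were reasoning tokens, leaving 262 tokens for the 200-word scene.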
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the ninja mask, though it smelled faintly of cheese. Bill checked the lockpicks, sighing heavily as a clown balloon drifted past the van window. This was rock bottom, even for them. The target lay beyond the hedge, a fortress of wrapping paper and sugar highs. They needed precision. One wrong move and the parents would descend like hungry sharks. Molly pointed at the birthday boy blowing out candles through the window. Bill nodded, gripping the sack tight. It felt wrong, yet the rent was due. They slipped over the fence, avoiding the inflatable castle. The grass was damp. Inside, laughter echoed. They crept toward the tree. Bill whispered. "Is that a gift card in the box right now?" Molly shook her head silently in the dark. She said. "Take the Lego sets only, ignore the cards completely now." They stuffed the bags. No alarms triggered. Nothing barked. They vanished into the night leaving only crumbs. The job was done. Shame washed over them. But money talked louder than conscience tonight. They drove. The radio played happy birthday. Bill turned it off. Neither spoke for the rest of the drive home. It was. Silence suited criminals best. Stealing toddlers.
created_at
"2026-02-16T11:25:19.024318+00:00"
2/16/2026, 11:25:19 AM
model
"qwen/qwen3.5-397b-a17b-20260216"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
795
moderation_latency
(null)
generation_time
206732
tokens_prompt
77
tokens_completion
5587
native_tokens_prompt
76
native_tokens_completion
7802
native_tokens_completion_images
(null)
native_tokens_reasoning
7540
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0281328
router
(null)
provider_responses
0
id
"chatcmpl-eacbbdaf-2cc2-9e26-bc1b-e9c730e25fee"
status
200
is_byok
false
latency
795
endpoint_id
"180484a2-e24b-495a-ba41-b34b680d5a9f"
provider_name
"Alibaba"
model_permaslug
"qwen/qwen3.5-397b-a17b-20260216"
api_type
"completions"
id
"gen-1771240912-ok5emjQ4x5b50GJn2TTI"
upstream_id
"chatcmpl-eacbbdaf-2cc2-9e26-bc1b-e9c730e25fee"
total_cost
0.0281328
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
Evaluation details
Result | Evaluator | Details | Meta Data
100.0000% | Matches word count | n/a | neededClean: false, words: 200
100.0000% | Dialogue to Total Word Ratio | Ratio: 10.00%, Deviation: 0.00% | neededClean: false, wordsTotal: 200, wordsDialogue: 20
100.0000%
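The page does not show how the Dialogue to Total Word Ratio evaluator counts words. The sketch below is one plausible way to reproduce the reported figures (wordsDialogue 20, wordsTotal 200, ratio 10.00%), not the benchmark's actual implementation. It assumes that dialogue is any text between double quotes and that words are whitespace-separated tokens. The names dialogue_ratio and deviation_from_target are hypothetical.

```python
import re
from typing import Tuple


def dialogue_ratio(text: str) -> Tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a prose passage.

    Assumption: dialogue is whatever sits between straight or curly
    double quotes; words are split on whitespace.
    """
    spans = re.findall(r'["“]([^"”]*)["”]', text)
    dialogue_words = sum(len(span.split()) for span in spans)
    total_words = len(text.split())
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio: float, target: float = 0.10) -> float:
    """Absolute distance between the measured ratio and the requested one."""
    return abs(ratio - target)


# Short illustrative sample, not the full model response.
sample = 'Bill whispered. "Is that a gift card?" Molly shook her head.'
words_dialogue, words_total, ratio = dialogue_ratio(sample)
print(words_dialogue, words_total, round(ratio, 4))
```

Applied to the model response above, the two quoted lines contain 10 words each, which matches the reported 20 dialogue words out of 200.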