NC Bench
Run: 2026-02-25-01KJBKEAR2QG5DWNKB93T6B2TV
Run Details
Model: Qwen 3.5 Flash
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 1m 8s
Cost: $0.004417
Provider: Alibaba
Evaluation summary
Score     Evaluator
0.00%     Matches word count
0.21%     Dialogue to Total Word Ratio
0.1060%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The rain hammered against the roof of the unmarked white van, muffling the noise of the city outside. The air inside smelled like stale coffee and desperation. Molly checked her watch for the twentieth time in five minutes. Her fingers tapped a frantic rhythm on the steering wheel. Bill sat in the passenger seat, staring intensely at a black plastic shopping bag. He was meticulously folding a balaclava into a perfect square. Bill looked at the bag nervously. The kitchen window was their only entry point. The alarm system was old, manufactured before Bill was even born. They needed to move fast. "We do not need money to buy candy," Molly said. "We do not need candy." "Stealing is morally suspect," Bill said. "This plan is ridiculous." "They are going to be lost forever in a drawer," Molly said. "It is a second chance for these toys." "But the parents will be sad," Bill added. "They love those specific items." "They will buy more," Molly said. "Money is not the problem." "The alarm system hummed loudly," Bill said. "It is too risky." "They are blind," Molly said. "The alarms are broken." "They are children," Bill said. "They do not know better." "We run fast," Bill said. "If the dog wakes." "Ten minutes," Molly said. "If the dog wakes, we run." "I am fired," Bill said. "I can pick the lock," Bill said. "With a hairpin." "Go now," Molly said. "Forget the ladder." "Go through the window," Bill said. "It is a balcony," Bill said. "It is a window," Molly insisted. "We are criminals," Bill said. "We are rescuers." "We have to go," Bill said. "We are rich." Molly opened the door. The heavy rain hit her face instantly. She stepped out, grabbing the plastic ladder. Bill stepped out. His clothes soaked with wetness. The porch light flickered on. A child screamed inside. They dashed across the wet pavement toward the house. Molly yelled. "Did you bring the lockpicks?" Bill's face went pale. "I left them in the oven." Molly stared at the door. She was angry. 
The execution was questionable. The plan was solid. They moved toward the house. They were criminals. They were rich.
created_at: "2026-02-25T23:51:41.066Z" (2/25/2026, 11:51:41 PM)
model: "qwen/qwen3.5-flash-20260224"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 453
moderation_latency: (null)
generation_time: 68338
tokens_prompt: 77
tokens_completion: 10958
native_tokens_prompt: 76
native_tokens_completion: 11023
native_tokens_completion_images: (null)
native_tokens_reasoning: 10514
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0044168
router: (null)
provider_responses:
  0:
    endpoint_id: "5f6b71b5-8672-4b27-8b2d-f674429f9e63"
    id: "chatcmpl-21be4f65-13dd-9fe1-9ad8-7caf5bb021d4"
    is_byok: false
    latency: 453
    model_permaslug: "qwen/qwen3.5-flash-20260224"
    provider_name: "Alibaba"
    status: 200
api_type: "completions"
id: "gen-1772063501-7fI1svR5usrBN8Zm0HHe"
upstream_id: "chatcmpl-21be4f65-13dd-9fe1-9ad8-7caf5bb021d4"
total_cost: 0.0044168
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
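A few derived figures can be read off the metadata above. The sketch below assumes that generation_time is in milliseconds (which is consistent with the reported duration of 1m 8s) and that native_tokens_reasoning is counted inside native_tokens_completion. Neither assumption is stated on this page, and the helper name derive_run_stats is hypothetical.

```python
def derive_run_stats(generation_time_ms, completion_tokens, reasoning_tokens):
    """Derive throughput and reasoning share from the run metadata.

    Assumptions (not confirmed by the page): generation_time_ms is in
    milliseconds, and reasoning tokens are part of the completion total.
    """
    seconds = generation_time_ms / 1000
    return {
        # Completion tokens generated per second of generation time
        "tokens_per_second": completion_tokens / seconds,
        # Fraction of completion tokens spent on reasoning
        "reasoning_share": reasoning_tokens / completion_tokens,
        # Tokens left over for the visible answer
        "visible_tokens": completion_tokens - reasoning_tokens,
    }


# Values copied from generation_time, native_tokens_completion,
# and native_tokens_reasoning in the metadata.
stats = derive_run_stats(68338, 11023, 10514)
print(f"{stats['tokens_per_second']:.1f} tokens/s")
print(f"{stats['reasoning_share']:.1%} of completion tokens were reasoning")
print(f"{stats['visible_tokens']} tokens remained for the visible response")
```

Under these assumptions, most of the completion budget went to reasoning rather than to the scene itself, which may help explain why a short response took over a minute to generate.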
Evaluation details
Result: 0.00%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 365

Result: 0.21%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 41.10%, Deviation: 11.10%
Meta Data: neededClean: false, wordsTotal: 365, wordsDialogue: 150

Overall: 0.1060%
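The reported "Ratio: 41.10%, Deviation: 11.10%" can be reproduced from the word counts in the metadata (150 dialogue words out of 365 total, against the scenario's 30% target). The sketch below is an illustration, not the benchmark's actual evaluator: it assumes that dialogue means text between straight double quotes, that words are whitespace-separated tokens, and that deviation is the absolute difference from the target. All function names are hypothetical, and how the deviation maps to the 0.21% score is not shown on this page, so it is not modeled here.

```python
import re

TARGET_RATIO = 0.30  # from the scenario: "Write 500 words with 30% dialogue"


def count_words(text):
    """Count whitespace-separated tokens (assumed word definition)."""
    return len(text.split())


def count_dialogue_words(text):
    """Count words inside straight double quotes (assumed dialogue rule)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in spans)


def ratio_and_deviation(words_dialogue, words_total, target=TARGET_RATIO):
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Word counts copied from the evaluator metadata (wordsDialogue, wordsTotal).
ratio, deviation = ratio_and_deviation(150, 365)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 41.10%, Deviation: 11.10%"
```

To run the counting helpers on raw text, pass the response string to count_dialogue_words and count_words and feed the results into ratio_and_deviation. The counts may differ from the evaluator's if it tokenizes or detects dialogue differently.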