NC Bench
Run: 2026-02-25-01KJBKK52V18G1ZYPDXSVTGRYP
Run Details
Model: Qwen 3.5 Flash
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 1m 55s
Cost: $0.005158
Provider: Alibaba
Evaluation summary
Matches word count: 68.10%
Dialogue to Total Word Ratio: 0.00%
Overall: 34.0511%
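The overall figure is consistent with a plain average of the two evaluator scores. The sketch below checks that arithmetic. It assumes an unweighted mean, which the page does not state, and the variable names are illustrative only.

```python
# Evaluator scores as displayed on the page, in percent.
scores = {
    "Matches word count": 68.10,
    "Dialogue to Total Word Ratio": 0.00,
}

# Assumption: the overall score is the unweighted mean of the evaluator scores.
overall = sum(scores.values()) / len(scores)
print(f"{overall:.2f}%")  # 34.05%

# Working backwards from the reported overall under the same assumption
# gives the unrounded word-count score.
reported_overall = 34.0511
implied_word_count_score = (
    reported_overall * len(scores) - scores["Dialogue to Total Word Ratio"]
)
print(f"{implied_word_count_score:.4f}%")  # 68.1022%
```

If the mean assumption holds, the displayed 68.10% is a rounding of roughly 68.1022%, which would explain why the overall shows as 34.0511% rather than 34.05%.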
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The old sedan was parked illegally on the curb, its engine idling with a cough that sounded suspiciously like a dying animal inside the garage. Molly adjusted her oversized sunglasses while staring at the blueprints of the Miller estate spread across the dashboard like a map to treasure. But the treasure was toys. The plans were drawn on a napkin with crayon and the corners were sticky with sugar. Bill stood on the sidewalk adjusting a fake mustache that was glued to his upper lip with adhesive that was probably expired. He smelled like cheap cologne and anxiety. The night air smelled of grilling burgers and burning matches. The streetlights flickered above, casting long, dancing shadows across the cracked pavement. Molly tapped her fingers on the steering wheel rhythmically. The tires spun slightly on the asphalt as they waited for the signal. Shadows stretched across the windshield, obscuring the view of the house. Molly checked her watch one last time. Her reflection stared back at her from the dark glass. It looked frightened. The wind picked up, rustling the leaves near the curb. A car drove by, its headlights sweeping across the interior. Silence hung heavy in the air like a wet blanket. They sat still, breathing in unison to avoid detection. The house lights flickered inside the dark. A dog barked in the distance. The silence broke. Bill shifted in his seat. The old leather creaked loudly. She looked at the fake mustache on his face. The glue looked like it was peeling. The moon was hidden behind thick gray clouds. The streetlamps buzzed with electricity. Her mind raced with the consequences of failure. The plan seemed solid, but the execution was risky. Bill looked at the map. It was blurry. He wiped his hands on his pants. Molly tightened her grip on the wheel. She feared the police would arrive. The house was a fortress of joy. The neighbors were asleep in their beds. They prepared to move. The time had come. We have five minutes, Molly said. 
Perfect time for robbery, Bill stated confidently. We are stealing from children, Molly corrected sternly. Theft does not discriminate between ages or income levels. Wait, the gate has a camera, Molly said. We will block the lens with paper. I cannot pick that lock with a rubber ball. We have to rely on stealth and deception. We are wearing clown noses, Molly pointed out. It works for disguises in the movies. Movies are not real life. Real life is a simulation often. Molly sighed heavily. We need to stop and leave. We need to finish the job. If we do not finish the job, we get fired. Bill laughed nervously. We are not employees. We are partners. Partners get fired. Yes, let us go. Run, Bill shouted as the engine turned over. Go now. Go. Molly opened the door. The engine roared. They drove away quickly.
created_at: "2026-02-25T23:54:59.913Z" (2/25/2026, 11:54:59 PM)
model: "qwen/qwen3.5-flash-20260224"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 451
moderation_latency: (null)
generation_time: 73398
tokens_prompt: 77
tokens_completion: 12614
native_tokens_prompt: 76
native_tokens_completion: 12876
native_tokens_completion_images: (null)
native_tokens_reasoning: 12275
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.005158
router: (null)
provider_responses:
  0:
    endpoint_id: "5f6b71b5-8672-4b27-8b2d-f674429f9e63"
    id: "chatcmpl-e862ec85-7640-9c0a-b74f-3a186112cbe8"
    is_byok: false
    latency: 451
    model_permaslug: "qwen/qwen3.5-flash-20260224"
    provider_name: "Alibaba"
    status: 200
api_type: "completions"
id: "gen-1772063699-HCCFfYIE4bSzWqihwJw7"
upstream_id: "chatcmpl-e862ec85-7640-9c0a-b74f-3a186112cbe8"
total_cost: 0.005158
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
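The token counters above suggest that most of the completion went to reasoning rather than to the visible scene. The sketch below works through that arithmetic. It assumes that native_tokens_reasoning is counted inside native_tokens_completion and that generation_time is in milliseconds; neither is stated on the page, and the derived names are illustrative only.

```python
# Values copied from the run metadata.
native_tokens_completion = 12876
native_tokens_reasoning = 12275
generation_time_ms = 73398  # assumption: generation_time is in milliseconds

# Assumption: reasoning tokens are a subset of the completion tokens,
# so the remainder is the visible response.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible tokens: {visible_tokens}")  # visible tokens: 601
print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput: {tokens_per_second:.0f} tokens/s")
```

Under these assumptions about 601 tokens carry the 486-word scene, and the rest of the completion budget is reasoning.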
Evaluation details
Matches word count: 68.10%
  Details: n/a
  neededClean: false
  words: 486
Dialogue to Total Word Ratio: 0.00%
  Details: Ratio: 0.00%, Deviation: 30.00%
  neededClean: false
  wordsTotal: 486
  wordsDialogue: 0
Overall: 34.0511%
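The evaluator reports wordsDialogue as 0 even though the response contains spoken lines, and the response uses no quotation marks. One plausible reading is that dialogue is detected by quote delimiters. The sketch below illustrates that reading only. The function dialogue_ratio and its quote-matching rule are hypothetical and are not the benchmark's actual evaluator.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) using quote-delimited spans."""
    total_words = len(text.split())
    # Assumption: only text inside straight or curly double quotes is dialogue.
    quoted_spans = re.findall(r'["“]([^"”]*)["”]', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# A line from the response as written, and the same line with quotation marks.
unquoted = "We have five minutes, Molly said."
quoted = '"We have five minutes," Molly said.'

print(dialogue_ratio(unquoted)[:2])  # (0, 6)
print(dialogue_ratio(quoted)[:2])    # (4, 6)

# Deviation from the 30% target for the reported counts (0 of 486 words).
target = 0.30
reported_ratio = 0 / 486
print(f"deviation: {abs(reported_ratio - target):.2%}")  # deviation: 30.00%
```

Under this reading, unquoted dialogue contributes nothing to the ratio, which matches the reported Ratio of 0.00% and Deviation of 30.00%.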