NC Bench
Run: 2026-02-25-01KJBF00SDZGACBMWXSXQ9BBCP
Run Details
Model: Qwen 3.5 122B
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 1m 5s
Cost: $0.029176
Provider: Alibaba
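As a rough cross-check, the duration shown for this run can be related to the token counters recorded in its metadata (generation_time 64963 ms, native_tokens_completion 9108, native_tokens_reasoning 8464). The sketch below is illustrative only: the helper name `derived_metrics` is hypothetical, and it assumes that the completion count includes the reasoning tokens, which the page does not state.

```python
# Values copied from this run's metadata.
GENERATION_TIME_MS = 64963
NATIVE_TOKENS_COMPLETION = 9108
NATIVE_TOKENS_REASONING = 8464


def derived_metrics(gen_ms: int, completion: int, reasoning: int) -> dict:
    """Derive a few convenience figures from raw run counters.

    Assumption (not confirmed by the page): `completion` already
    includes the `reasoning` tokens.
    """
    seconds = gen_ms / 1000
    minutes, rem_seconds = divmod(round(seconds), 60)
    return {
        "duration": f"{minutes}m {rem_seconds}s",
        "visible_tokens": completion - reasoning,
        "reasoning_share": reasoning / completion,
        "tokens_per_second": completion / seconds,
    }


metrics = derived_metrics(
    GENERATION_TIME_MS, NATIVE_TOKENS_COMPLETION, NATIVE_TOKENS_REASONING
)
for name, value in metrics.items():
    print(name, value)
```

Under that assumption, most of the completion budget went to reasoning, and only a few hundred tokens remain for the visible scene.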
Evaluation summary
Score      Evaluator
0.00%      Matches word count
11.77%     Dialogue to Total Word Ratio
5.8832%    Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the heavy night-vision goggles sliding down her nose, the rubber strap digging into her forehead. She peered through the tinted window of the van, scanning the quiet suburban street for any movement. “Are you sure? It feels wrong.” Bill checked his digital watch, tapping the screen aggressively with a grease-stained finger. “Positive. Seven-year-olds mean high-value toys.” “It’s a suburban house. We’re stealing plastic dinosaurs.” “Not just dinosaurs. The glow-in-the-dark ones.” Molly sighed, tapping the steering wheel rhythmically with her knuckles, the sound echoing in the small space. “We’re the Ghosts of the Gold Vault.” “Tonight we become the Ghosts of the Playroom.” “That sounds like a nursery rhyme.” “It’s catchy. Think of the branding.” Molly rubbed her temples, feeling a headache coming on behind her eyes, a familiar sensation. “The alarm is a motion sensor.” “I brought a laser cutter.” “For a wooden door?” “It makes a cool noise.” “What about the parents?” “Cake is involved. Cake is a distraction.” “What if they aren’t eating cake?” “Then we use the piñata strategy.” “There is no piñata strategy.” “We smash the piñata, they look up, we grab the loot.” Molly stared at him, her expression deadpan and unamused, waiting for a logical explanation. “You’re insane.” “I’m efficient. Target is the red truck.” “We can’t drive a truck out of a house.” “We can if we put it in a bag.” “That defeats the purpose.” “Details, Molly.” She grabbed the heavy duffel bag from the back seat, zipping it shut with a sharp sound. “We go in at midnight.” “Midnight is too late. Parents might wake.” “So when?” “During the dance party. They’re busy jumping.” “They’ll notice us with a pile of Legos.” “We’ll wear costumes.” “What kind?” “Clowns. Everyone loves clowns.” “Nobody loves clowns. They fear them.” “Exactly. Fear keeps them still.” Molly shook her head slowly, the van creaking around them in the stillness of the night. 
“This is the worst plan.” “But it’s the only one we have.” She looked at the house through the window. Lights flickered inside. Music played faintly. “If we get caught, I blame you.” “I’ll say you’re the mastermind.” “I am the mastermind.” “Sure thing, boss.” Molly opened the van door, the hinges creaking loudly into the silence, announcing their departure. “Let’s go get the toys.” Bill grabbed his laser cutter, checking the battery light, ensuring it was fully charged. “For the glory.” “For the rent money.” They stepped out into the cool night air, ready for the heist of the century.
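For readers who want to estimate the dialogue share of a response like the one above, here is a minimal sketch. It is not the benchmark's evaluator: the function name `dialogue_stats` is hypothetical, and it assumes that dialogue is exactly the text between curly double quotes and that words are whitespace-separated tokens. The evaluator's own counting rules may differ, so its totals need not match.

```python
import re


def dialogue_stats(text: str) -> dict:
    """Count words inside curly double quotes versus all words.

    Assumption: every quoted span is dialogue and every
    whitespace-separated token is one word.
    """
    spans = re.findall(r"“([^”]*)”", text)
    words_dialogue = sum(len(span.split()) for span in spans)
    words_total = len(text.split())
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "words_dialogue": words_dialogue,
        "words_total": words_total,
        "ratio": ratio,
    }


# Tiny made-up sample in the same style as the scene.
sample = "“Are you sure?” Molly asked. “Positive.”"
print(dialogue_stats(sample))
```

A whitespace split keeps punctuation attached to words, which is simple but can disagree with counters that treat hyphenated or contracted words differently.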
created_at: "2026-02-25T22:33:57.813Z" (2/25/2026, 10:33:57 PM)
model: "qwen/qwen3.5-122b-a10b-20260224"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 420
moderation_latency: (null)
generation_time: 64963
tokens_prompt: 77
tokens_completion: 8689
native_tokens_prompt: 76
native_tokens_completion: 9108
native_tokens_completion_images: (null)
native_tokens_reasoning: 8464
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.029176
router: (null)
provider_responses:
  0:
    endpoint_id: "33d466d1-d2f7-45ed-8ff3-fda7d2e32351"
    id: "chatcmpl-8c5af3f7-c35d-915a-952e-06d721fc8bc5"
    is_byok: false
    latency: 420
    model_permaslug: "qwen/qwen3.5-122b-a10b-20260224"
    provider_name: "Alibaba"
    status: 200
api_type: "completions"
id: "gen-1772058837-z6otoCI60hv3RQ16y5Ea"
upstream_id: "chatcmpl-8c5af3f7-c35d-915a-952e-06d721fc8bc5"
total_cost: 0.029176
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
Evaluation details
Result     Evaluator                       Details                            Meta Data
0.00%      Matches word count              n/a                                neededClean: false, words: 420
11.77%     Dialogue to Total Word Ratio    Ratio: 56.54%, Deviation: 6.54%    neededClean: false, wordsTotal: 428, wordsDialogue: 242
5.8832%    Overall
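The reported ratio and deviation follow directly from the word counts listed above, and the overall figure is consistent with a plain average of the two evaluator scores. The sketch below reproduces that arithmetic. The function names are hypothetical, the 50% target is taken from the scenario, and the unweighted mean is an assumption rather than documented behavior. How the 6.54% deviation maps to the 11.77% score is not shown on the page, so it is not modeled here.

```python
def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = 0.50) -> tuple:
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


def mean_score(scores: list) -> float:
    """Unweighted mean of evaluator scores (assumed aggregation rule)."""
    return sum(scores) / len(scores)


# Word counts from the evaluation details: wordsDialogue 242, wordsTotal 428.
ratio, deviation = ratio_and_deviation(242, 428)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 56.54%, Deviation: 6.54%"

# Averaging the two displayed scores gives about 5.885; the page shows
# 5.8832%, which suggests the ratio score is rounded for display.
print(mean_score([0.00, 11.77]))
```

Note that the two evaluators report different totals (420 words versus 428), so they appear to count words differently.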