NC Bench
Run: 2026-04-24-01KQ099MF1GB0RC80QRW74CN51
Run Details
Model: Xiaomi MIMO v2.5 Pro
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 21.0s
Cost: $0.002918
Provider: Xiaomi
Evaluation summary
Score | Evaluator
98.71% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.3562% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly’s apartment smelled of burnt toast and ambition. Bill, perched precariously on a wobbly stool, peered at the blueprint spread across her kitchen table. It was less a blueprint and more a crayon-scribbled map of the Henderson backyard, complete with a stick-figure pirate and a violently pink bouncy castle. “Operation Piñata,” Molly announced, tapping a rickety drawing of the garage. “Entry point: the service door behind the recycling bins. Code’s on the notepad by the fridge. It’s ‘1234’ because Mr. Henderson thinks he’s clever.” Bill squinted. “The whole payload is in the living room. Unsecured.” “Exactly. Which is why we go during the sing-along. Maximum distraction. I’ll handle the gift table; you’re on lookout and decoy duty.” “What’s the decoy?” Molly grinned, revealing a slightly crooked canine tooth. “You, dressed as the world’s most disoriented clown. There’s a costume bag by the door. Big shoes, red nose, the works. Your job is to cause a minor, non-traumatic incident near the fence to draw parental eyes.” Bill sighed, rubbing his temples. “I was hoping for laser grids and pressure plates. Not pretending to trip over a garden gnome.” “The gnome is key,” she said, with unnerving sincerity. “It’s plastic, but it *looks* heavy. Your stumble must convey comedic confusion, not malice. We’re thieves, not monsters.” “The presents,” Bill said, steering back. “There must be thirty of them. We can’t carry thirty wrapped boxes.” “We won’t.” Molly produced a giant, sparkly gift bag with a tissue-paper meteor erupting from the top. “This is our Trojan Horse. It’s mostly empty. We swap the real presents for identical bags filled with… these.” She held up a handful of gravel-stuffed balloons. “Noisier, sure, but by the time little Timmy unwraps a bag of rocks, we’re two towns over.” “Diabolical,” Bill admitted. “But the clown suit?” “A non-negotiable. My role requires stealth. Yours requires… flair.” She stood, all business. “The party starts at two. 
Sing-along at three-fifteen. We breach at three-sixteen. Simple.” Bill looked at the crayon map, then at the glittering, deflated bag of deceit. He picked up a single, wrapped present from the table—a practice prop. It felt strangely weighty. “What if there’s a guard dog?” “Henderson has a hamster named Captain Fluffington. He’s in the living room, on a wheel. He might slow us down with his tiny judgmental eyes, but I’ve planned for that too.” She pointed to a small Ziploc bag of sunflower seeds in her pocket. As Molly began assembling her tools with the quiet precision of a watchmaker, Bill couldn’t help but feel a pang of something between awe and profound existential dread. They were about to ruin a seven-year-old’s birthday for a haul of action figures and board games, and she was treating it like the Caper of the Century. He picked up the clown nose. It squeaked. The plan was insane, but the rent was due, and Captain Fluffington’s stern gaze was a risk he was, apparently, willing to take.
created_at
"2026-04-24T17:41:11.801Z"
4/24/2026, 5:41:11 PM
model
"xiaomi/mimo-v2.5-pro-20260422"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1434
moderation_latency
(null)
generation_time
20940
tokens_prompt
77
tokens_completion
1108
native_tokens_prompt
315
native_tokens_completion
936
native_tokens_completion_images
(null)
native_tokens_reasoning
222
native_tokens_cached
256
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0029182
router
(null)
provider_responses
0
endpoint_id
"5c24ecc9-5ba1-4af0-8877-3402889e7d0a"
id
"df2477c4d8c54fdcb7c38e4e056997bb"
is_byok
false
latency
1367
model_permaslug
"xiaomi/mimo-v2.5-pro-20260422"
provider_name
"Xiaomi"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777052471-0TdD4xVM5qEb1QGdNmLp"
session_id
(null)
api_type
"completions"
id
"gen-1777052471-7GKUv6IIrc9z9BM0Dnrx"
upstream_id
"df2477c4d8c54fdcb7c38e4e056997bb"
total_cost
0.0029182
cache_discount
0.0002048
upstream_inference_cost
0
provider_name
"Xiaomi"
response_cache_source_id
(null)
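The timing and token fields above can be combined into an approximate output rate. The sketch below is illustrative only: it assumes `generation_time` is reported in milliseconds and that `native_tokens_completion` counts every token produced during that window; neither assumption is stated on this page. The helper name `tokens_per_second` is hypothetical.

```python
# Values copied from the Meta panel; the unit (milliseconds) is an assumption.
generation_time_ms = 20940
native_tokens_completion = 936


def tokens_per_second(tokens: int, elapsed_ms: float) -> float:
    """Average output rate over the elapsed time (hypothetical helper)."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return tokens / (elapsed_ms / 1000.0)


throughput = tokens_per_second(native_tokens_completion, generation_time_ms)

print(f"generation time: {generation_time_ms / 1000:.2f} s")
print(f"throughput: {throughput:.1f} tokens/s")
```

Under these assumptions the generation time of about 20.94 s is close to the 21.0s Duration shown in Run Details, though the page does not say how Duration is derived.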
Evaluation details
Result | Evaluator | Details | Meta Data
98.71% | Matches word count | n/a | neededClean: false; words: 494
0.00% | Dialogue to Total Word Ratio | Ratio: 50.99%, Deviation: 20.99% | neededClean: false; wordsTotal: 506; wordsDialogue: 258
49.3562% | Overall | |
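The ratio and deviation reported for the dialogue evaluator can be reproduced from its metadata. The sketch below assumes the ratio is `wordsDialogue / wordsTotal` and the deviation is the absolute distance from the 30% target named in the scenario; the function names are hypothetical, and the rule that maps a 20.99% deviation to a 0.00% score is not shown on this page, so it is not modeled here. The last lines check whether the Overall figure is consistent with an unweighted mean of the two evaluator scores, which is also an assumption.

```python
TARGET_RATIO = 0.30  # "30% dialogue" from the scenario


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of all words that fall inside dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation(ratio: float, target: float = TARGET_RATIO) -> float:
    """Absolute distance between the measured ratio and the target."""
    return abs(ratio - target)


# Values copied from the evaluator metadata.
ratio = dialogue_ratio(258, 506)
dev = deviation(ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {dev:.2%}")

# Assumed aggregation: unweighted mean of the displayed evaluator scores.
overall = (98.71 + 0.00) / 2
print(f"Mean of displayed scores: {overall:.3f}%")
```

The mean of the displayed scores is 49.355%, which matches the reported 49.3562% if the word count score is slightly above 98.71% before rounding. The two evaluators also report different totals (494 versus 506 words), which suggests they count words differently.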