NC Bench
Run: 2026-04-24-01KQ03SMRP236X8KSK9S6KTJQD
Run Details
Model: Xiaomi MIMO v2.5 Pro
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 26.4s
Cost: $0.003506
Provider: Xiaomi
Evaluation summary
Score | Evaluator
0.00% | Matches word count
1.68% | Dialogue to Total Word Ratio
0.8387% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly unfolded the blueprint across the hood of their dusty van, the paper fluttering in the chilly evening breeze. Below them, the brightly lit windows of the Henderson house glowed like a constellation of chaotic joy. The muffled sounds of "Happy Birthday" being butchered by a roomful of six-year-olds drifted up the hill.
"Target rich environment," Molly murmured, tapping a pristine square labeled 'Gift Mountain.' "Estimated haul: two dozen wrapped boxes, ranging from Lego Death Stars to, if our intel is correct, a genuine vintage Easy-Bake Oven."
Bill, polishing a tiny grappling hook with his sleeve, snorted. "Our intel is a five-year-old named Timmy you paid in lollipops."
"Timmy is a reliable source," Molly said, her tone clipped and professional. "He confirmed the parents are distracted, the dog is friendly, and the back door sticks. We go in the back, quiet as ghosts, clean out the pile before the cake is even cut."
"See, that's where you lose me," Bill said, holding the hook up to the van's dome light. "All this stealth. It's a birthday party, not a museum heist. Why don't we just send in a clown? Distract 'em with balloon animals and walk out the front door with the goods."
"Because, Bill, we have dignity," Molly said. "And a criminal record. A clown application asks for references." She pointed to a dotted line on the blueprint. "Window. Bedroom. Unlocked according to Timmy. We enter here, move through the house to the living room, and extract. Simple."
"What if someone needs the bathroom?" Bill asked, suddenly serious.
"Then they use the hall bathroom, which is on the opposite side of the house," Molly replied, patient as a saint. "We have a three-minute window during 'Pin the Tail on the Donkey.' All eyes on the blindfolded kid."
Bill finally looked down at the house, a grin spreading across his face. "Alright. But if I see an Easy-Bake Oven, I'm calling dibs."
"Negative. All proceeds are split equally after fencing," Molly said, rolling up the blueprint. "This is a job, not a grab. Remember the plan."
Ten minutes later, the bedroom window slid up with a faint groan. Molly slipped inside first, landing silently on a rug shaped like a turtle. Bill followed, knocking a stack of picture books to the floor with a muffled thump. They froze. From below, the song changed to a riotous rendition of 'If You're Happy and You Know It.' They were clear.
Molly motioned, and they crept into the dim hallway. The air smelled of frosting and old pennies. As they neared the living room archway, the scene unfolded before them: a whirlwind of children, a table groaning under a cake shaped like a volcano, and there, in the corner, a glorious, mountainous pile of colorfully wrapped boxes.
Bill’s eyes widened. "Look at the size of that stack."
"Focus," Molly whispered. "On my signal. They're about to—"
A shriek from the living room cut her off. A child in a donkey tail costume had accidentally stuck the pin to the birthday boy’s new sweater. Pandemonium erupted. Parents swarmed. It was chaos.
"Now," Molly hissed.
They darted forward, two shadows against the bright room. In thirty seconds of swift, silent work, they had the pile, an armful each of crinkly paper and potential toys. They were back in the hallway, hearts pounding, when a small voice spoke from the top of the stairs.
"Did you bring more presents?"
They looked up. Timmy, the informant, stood in dinosaur pajamas, holding a half-eaten cookie.
Molly froze. Bill shifted his load nervously. "Uh, no, buddy. We're just... taking them for cleaning."
Timmy nodded, accepting this. "Can I have my lollipop now?"
"Later," Molly breathed, already moving toward the bedroom. "We have to go. The plan is blown."
"But we got the stuff," Bill whispered, struggling with a box that had a bicycle wheel.
As they scrambled back out the window, the sound of the party swelling behind them, Bill glanced at Molly. "Dibs on the oven."
"We'll see," she said, and for the first time that night, she smiled.
created_at: "2026-04-24T16:05:04.937Z" (4/24/2026, 4:05:04 PM)
model: "xiaomi/mimo-v2.5-pro-20260422"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1633
moderation_latency: (null)
generation_time: 26303
tokens_prompt: 77
tokens_completion: 1297
native_tokens_prompt: 315
native_tokens_completion: 1132
native_tokens_completion_images: (null)
native_tokens_reasoning: 197
native_tokens_cached: 256
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0035062
router: (null)
provider_responses:
  0:
    endpoint_id: "5c24ecc9-5ba1-4af0-8877-3402889e7d0a"
    id: "a11cea577e684643a7fdca240a40d48d"
    is_byok: false
    latency: 1552
    model_permaslug: "xiaomi/mimo-v2.5-pro-20260422"
    provider_name: "Xiaomi"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777046704-7VHo2scfIvscTx3T1nyI"
session_id: (null)
api_type: "completions"
id: "gen-1777046704-cegYff5YOjUUW8X2SyWD"
upstream_id: "a11cea577e684643a7fdca240a40d48d"
total_cost: 0.0035062
cache_discount: 0.0002048
upstream_inference_cost: 0
provider_name: "Xiaomi"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 681
1.68% | Dialogue to Total Word Ratio | Ratio: 39.04%, Deviation: 9.04% | neededClean: false; wordsTotal: 689; wordsDialogue: 269
0.8387% | Overall
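The reported ratio and deviation can be reproduced from the metadata counts (wordsDialogue 269, wordsTotal 689) against the 30% target in the scenario. The sketch below is illustrative only: the page does not document how the evaluator counts words or maps deviation to a score, so `count_words` and `count_dialogue_words` are hypothetical helpers that assume whitespace-separated words and dialogue delimited by double quotes. No scoring formula is attempted.

```python
import re

TARGET_RATIO = 0.30  # from the scenario: "Write 500 words with 30% dialogue"


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside straight or curly double quotes (an assumed rule)."""
    spans = re.findall(r'["“]([^"”]*)["”]', text)
    return sum(count_words(span) for span in spans)


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of all words that are dialogue."""
    return words_dialogue / words_total


def deviation(ratio: float, target: float = TARGET_RATIO) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Counts taken from the evaluation metadata on this page.
ratio = dialogue_ratio(269, 689)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation(ratio):.2%}")
# prints "Ratio: 39.04%, Deviation: 9.04%"
```

The two evaluators report different totals (681 words versus 689), which suggests they tokenize the response differently; the helpers above are not expected to match either count exactly. The Overall value of 0.8387% is close to the mean of the two evaluator scores, though the page does not state the aggregation rule.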