NC Bench
Run: 2026-04-24-01KQ03TPBVTZS1TQ60SFF0WPAT
Run Details
Model: Xiaomi MIMO v2.5 Pro
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 19.9s
Cost: $0.002792
Provider: Xiaomi
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0008% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
The garage smelled of motor oil and desperate ambition. Spread across the workbench, between a coil of extension cords and a box of last year’s Christmas lights, was a sprawling, crayon-colored diagram labeled “Operation Tinsel.” “Are we sure about this, Molly?” Bill peered at the map, his brow furrowed. He traced a thick, wobbly line with his finger. “The ‘Parental Distraction Zone’… it’s just a circle drawn around a bowl of guacamole.” Molly didn’t look up from adjusting her binoculars. “It’s not about the guacamole, Bill. It’s about the *concept*. Mrs. Henderson makes her seven-layer dip with suspiciously fresh avocados. It’s a beacon. They’ll swarm. While they’re debating the cilantro ratio, we secure the primary objective.” “The primary objective,” Bill repeated, his voice hollow. “The pile of presents.” “Exactly. In and out. Clean. We’re ghosts. We’re yuletide whispers.” Bill sighed, poking at a crudely drawn cartoon penguin on the map. “It’s a five-year-old’s birthday party. For a child named Timmy. Do you think his parents will call the CIA?” “Never underestimate a parent with a gift for their only child,” Molly said, finally turning. Her eyes were alight with the thrill of the chase. “This is about professionalism. Efficiency. We’ve cased the location. We know the cake arrives at 3:00 PM, causing a strategic sugar-rush at approximately 3:45. The piñata—a donkey, likely—will be deployed around 4:00, creating a localized frenzy. Our window is the four-minute chaos between ‘crack!’ and ‘got a gummy bear!’ That’s when we make the switch.” “The switch,” Bill murmured, looking at the third part of the plan. It showed two large, innocuous-looking gift bags. “We’re not just stealing them. We’re leaving decoy bags full of… of broccoli?” “Brussels sprouts,” Molly corrected. “Individually wrapped in shiny paper. It’s a psychological blow. They expect toy cars and dinosaurs. They get a healthy cruciferous vegetable. It demoralizes. It creates confusion. 
It buys us valuable egress time.” Bill picked up a single, shiny brussels sprout from a nearby bowl. “This is cruel.” “This is art,” Molly countered, packing her gear with practiced movements. “Now, your role is crucial. You are the ‘Well-Meaning Uncle.’ You circulate with the punch bowl. Your job is to look harmless, engage in small talk about lawnmowers, and, when the piñata falls, you shout, ‘Oh, what a mess!’ as loudly as possible. Can you do that?” Bill puffed out his chest. “I can say ‘lawnmower’ with the best of them.” “I know you can.” Molly clapped him on the shoulder. “Remember, Bill. No presents under the tree tomorrow morning. Just the haunting memory of sprouts.” They synchronized their watches. The clock above the door buzzed. It was time to go steal Christmas from a kindergartener.
created_at: "2026-04-24T16:05:39.333Z" (4/24/2026, 4:05:39 PM)
model: "xiaomi/mimo-v2.5-pro-20260422"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1251
moderation_latency: (null)
generation_time: 19835
tokens_prompt: 77
tokens_completion: 1028
native_tokens_prompt: 315
native_tokens_completion: 894
native_tokens_completion_images: (null)
native_tokens_reasoning: 212
native_tokens_cached: 256
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0027922
router: (null)
provider_responses:
  0:
    endpoint_id: "5c24ecc9-5ba1-4af0-8877-3402889e7d0a"
    id: "3e919ba7e69e48e3a09cfd39af6a1bac"
    is_byok: false
    latency: 1242
    model_permaslug: "xiaomi/mimo-v2.5-pro-20260422"
    provider_name: "Xiaomi"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777046739-15oONwn8PED5xmY4qqxa"
session_id: (null)
api_type: "completions"
id: "gen-1777046739-CKvA67LtEwIN9wlzrS6l"
upstream_id: "3e919ba7e69e48e3a09cfd39af6a1bac"
total_cost: 0.0027922
cache_discount: 0.0002048
upstream_inference_cost: 0
provider_name: "Xiaomi"
response_cache_source_id: (null)
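The timing and token fields above can be combined into a rough throughput figure. The following is a minimal sketch, not part of the benchmark itself. It assumes that generation_time is in milliseconds (the page does not state units, although the 19.9s Duration is consistent with that reading) and that native_tokens_completion is the right token count to divide by. The function name tokens_per_second is hypothetical.

```python
# Values copied from the run metadata on this page.
GENERATION_TIME_MS = 19835        # assumed to be milliseconds
NATIVE_TOKENS_COMPLETION = 894


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Return generated tokens per second for a given elapsed time in ms."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return tokens / (elapsed_ms / 1000)


throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
print(f"{throughput:.1f} tokens/s")
```

Whether generation_time already includes the initial latency is not stated on the page, so the result is best read as an approximate figure.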
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 451
0.00% | Dialogue to Total Word Ratio | Ratio: 64.85%, Deviation: 14.85% | neededClean: false, wordsTotal: 458, wordsDialogue: 297
0.0008% | Overall
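The reported ratio and deviation follow directly from the wordsDialogue and wordsTotal values. The sketch below reproduces that arithmetic and adds a hypothetical count_dialogue_words helper showing one way such counts might be obtained. The helper is an assumption: it treats text inside curly double quotes as dialogue and splits on whitespace, which is not necessarily how the evaluator counts. The page does not say how the 0.00% scores are derived from the deviation, so no scoring step is shown.

```python
import re

TARGET_RATIO = 0.50  # the scenario asks for 50% dialogue


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that are dialogue."""
    return words_dialogue / words_total


def deviation(ratio: float, target: float = TARGET_RATIO) -> float:
    """Absolute distance of the ratio from the target."""
    return abs(ratio - target)


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Hypothetical counter: (words inside curly double quotes, all words).

    Words are whitespace-separated tokens; the real evaluator may differ.
    """
    spoken = re.findall(r"“([^”]*)”", text)
    words_dialogue = sum(len(segment.split()) for segment in spoken)
    words_total = len(text.split())
    return words_dialogue, words_total


# Values copied from the evaluation details on this page.
ratio = dialogue_ratio(297, 458)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation(ratio):.2%}")
# prints "Ratio: 64.85%, Deviation: 14.85%"
```

The two evaluators report different totals for the same response (451 and 458 words), which suggests that the tokenization rule affects the result; a whitespace split like the one above may give yet another figure.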