NC Bench
Run: 2026-04-24-01KQ03F556158YE6T7JDR3VGXB
Run Details
Model: Xiaomi MIMO v2.5 Pro
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 26.3s
Cost: $0.003020
Provider: Xiaomi
Evaluation summary
Score      Evaluator
100.00%    Matches word count
0.02%      Dialogue to Total Word Ratio
50.0090%   Overall
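The Overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so the following Python sketch treats that as an assumption; the function name `overall_score` is hypothetical.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores, in percent.

    Assumption: the benchmark averages evaluator scores equally.
    The page itself does not document how Overall is computed.
    """
    if not scores:
        raise ValueError("at least one score is required")
    return sum(scores) / len(scores)


# Scores as displayed on this page (already rounded to two decimals).
displayed = [100.00, 0.02]
print(f"{overall_score(displayed):.2f}%")  # prints 50.01%
```

The page reports 50.0090% rather than 50.01%, which would fit an average taken over unrounded evaluator scores; the unrounded values are not shown here.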
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The basement smelled of damp concrete and broken dreams. Molly pinned a map of the community center to the corkboard, the ‘Party Room’ circled in red glitter pen. “Alright, Bill. Operation Birthday Smash. We go in during the chaos of the piñata. The presents will be piled on this table here.” She tapped the map. “We need a distraction big enough to make twenty kids, three parents, and a clown named Giggles look away for ninety seconds.” Bill, polishing a magnifying glass he’d ‘borrowed’ from a museum, sniffed. “What’s the haul?” “Specs suggest a haul of unprecedented mediocrity. A ‘My First Chemistry Set’—already opened, a ‘World’s Okayest Gamer’ t-shirt, and approximately thirty-seven small, wrapped parcels that will contain either socks or crayons. But it’s the principle, Molly. We are the apex predators of this plastic jungle.” “The clown is the variable,” Molly said, chewing her lip. “Giggles has been professionally trained. He has a seltzer bottle that’s registered as a lethal weapon in three states.” “Right. So the classic ‘falling down the stairs’ gag is out.” Bill snapped his fingers. “What about the bounce house? We get everyone outside. I’ll sabotage the air pump. A slow, tragic deflation. It’ll be like watching the Hindenburg, but with more screaming and unicorn patterns.” “Too messy. The presents are inside.” Molly paced. “We need to lure the clown away. What’s his weakness?” “Children,” Bill said flatly. “They’re his whole deal.” “Exactly! So we give him the ultimate challenge. A child he can’t possibly entertain.” She grinned. “You, Bill. You will be our Trojan Horse.” Bill blinked. “I’m forty-two.” “You’re forty-two with a heart of gold and a face that’s… uniquely aerodynamic. We dress you in a diaper. We put a pacifier in your mouth. You are a giant, babbling baby wrecking the party. He won’t be able to resist you.” “This is a new low, Molly. And we once stole a statue of a badger from a mayor’s garden.” “Exactly. 
This is our magnum opus.” She handed him a plush bunny. “Practice your ‘goo-goo’s. The skill set is non-negotiable.” “And the exit?” “Same as always. We slip out with the presents in the giant ‘Pin the Tail on the Donkey’ poster tube. The birthday boy gets the blame. A tale as old as time.” Bill sighed, tucking the bunny under his arm. “I’ll need extra padding for the diaper. My knees aren’t what they were.” “Deal. Now, remember the code words. ‘I need more cake’ means the coast is clear. ‘Does anyone have a wet wipe?’ means we abort.” “And ‘The piñata is a lie’?” “That’s just a general life observation,” Molly said, clapping him on the shoulder. “Get some rest, big baby. Tomorrow, we make history. Or at least, we make off with twelve dollars worth of used toys.” Bill looked at the map, then at his reflection in a dusty shard of mirror. “Just so we’re clear, if Giggles tries to change me, I’m out.” “Noted. Now go practice your tantrum.”
created_at
"2026-04-24T15:59:21.266Z"
4/24/2026, 3:59:21 PM
model
"xiaomi/mimo-v2.5-pro-20260422"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
2625
moderation_latency
(null)
generation_time
26164
tokens_prompt
77
tokens_completion
1148
native_tokens_prompt
315
native_tokens_completion
970
native_tokens_completion_images
(null)
native_tokens_reasoning
254
native_tokens_cached
256
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0030202
router
(null)
provider_responses
0
endpoint_id
"5c24ecc9-5ba1-4af0-8877-3402889e7d0a"
id
"1fec6e061afc4753bcd0cc5e32ec8d59"
is_byok
false
latency
2622
model_permaslug
"xiaomi/mimo-v2.5-pro-20260422"
provider_name
"Xiaomi"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777046361-rw9u5p953395te29isIM"
session_id
(null)
api_type
"completions"
id
"gen-1777046361-aSyAmnzqfxpB7RaBsFcs"
upstream_id
"1fec6e061afc4753bcd0cc5e32ec8d59"
total_cost
0.0030202
cache_discount
0.0002048
upstream_inference_cost
0
provider_name
"Xiaomi"
response_cache_source_id
(null)
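A rough throughput figure can be derived from the metadata above. This is only a sketch: it assumes `generation_time` is in milliseconds (26164 ms is close to the 26.3s Duration shown in Run Details) and that `native_tokens_completion` is the right token count to divide by. The helper name `tokens_per_second` is hypothetical.

```python
def tokens_per_second(completion_tokens, generation_time_ms):
    """Completion tokens generated per second.

    Assumption: generation_time_ms is expressed in milliseconds.
    """
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# native_tokens_completion = 970, generation_time = 26164 (from the metadata)
rate = tokens_per_second(970, 26164)
print(f"{rate:.1f} tokens/s")  # prints 37.1 tokens/s
```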
Evaluation details
Result     Evaluator                      Details                            Meta Data
100.00%    Matches word count             n/a                                neededClean: false, words: 500
0.02%      Dialogue to Total Word Ratio   Ratio: 83.13%, Deviation: 13.13%   neededClean: false, wordsTotal: 504, wordsDialogue: 419
50.0090%   Overall
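The reported ratio and deviation can be reproduced from the word counts in the metadata. The Python sketch below assumes the deviation is the absolute distance from the 70% target named in the scenario. The helper `count_dialogue_words` is a hypothetical illustration of counting words inside curly double quotes and is not the benchmark's actual tokenizer. How the deviation is mapped to the 0.02% score is not shown on the page, so that step is left out.

```python
import re

TARGET_RATIO = 70.0  # percent, from the scenario "Write 500 words with 70% dialogue"


def dialogue_ratio(words_dialogue, words_total):
    """Share of dialogue words in the total word count, in percent."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation(ratio, target=TARGET_RATIO):
    """Absolute distance from the target ratio (assumed definition)."""
    return abs(ratio - target)


def count_dialogue_words(text):
    """Rough count of whitespace-separated words inside curly double quotes.

    Hypothetical helper: the benchmark's own counting rules are not documented here.
    """
    spans = re.findall(r"“([^”]*)”", text)
    return sum(len(span.split()) for span in spans)


# wordsDialogue = 419, wordsTotal = 504 (from the Meta Data column)
ratio = dialogue_ratio(419, 504)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation(ratio):.2f}%")
# prints Ratio: 83.13%, Deviation: 13.13%
```

Note that the two evaluators report different totals (500 and 504 words), which suggests they split words slightly differently.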