NC Bench
Run: 2026-04-24-01KQ02GQHPJVW7DDMC3HGMT2X6
Run Details
Model: Xiaomi MIMO v2.5 Pro
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 10.5s
Cost: $0.001403
Provider: Xiaomi
Evaluation summary
Score      Evaluator
60.28%     Matches word count
36.79%     Dialogue to Total Word Ratio
48.5316%   Overall
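The Overall figure appears to be the unweighted mean of the two evaluator scores. This is an inference from the displayed numbers, not something the page states. The sketch below checks it using the rounded percentages shown in the summary, so it can only agree with the reported value to about two decimal places. The function name `overall_score` is hypothetical.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores.

    Assumed aggregation rule; the page does not document how Overall is computed.
    """
    if not scores:
        raise ValueError("at least one score is required")
    return sum(scores) / len(scores)


# Rounded percentages as displayed in the evaluation summary.
displayed = [60.28, 36.79]
overall = overall_score(displayed)
print(f"{overall:.3f}%")
```

The result lands within 0.01 percentage points of the reported 48.5316%; the small gap is what one would expect if the page averages unrounded scores.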
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"The genius, Bill, is we don't need a getaway car. We blend in." "Blend in? We're stealing presents from a six-year-old. Our disguises are... clown noses?" "Exactly. No one suspects the clowns. Operation: Sugar Rush is a go." "Operation Sugar Rush? You named it?" "It builds team cohesion. Now, the presents are in a pile by the balloon arch. I create a diversion—I'll start a glitter-bombing incident. You, my friend, are the 'Sleigh of Shadows.'" "Please stop calling me that." "You wheel the presents out in a wagon. Covered with a blanket labeled 'Leftover Cake.' No one looks twice at leftover cake." "What if someone asks for a slice?" "Then you give them the stale donut in your pocket and run. It's flawless." "My pocket donut is for emergencies!" "This is an emergency, Bill. An emergency of injustice. Those kids didn't earn those Lego sets. We did." "...How did we earn them?" "Through superior planning. And my new clown shoes. They squeak. It's disarming." "This is the worst heist ever." "Which is why it's perfect. Now, to the party! Try not to look guilty."
created_at: "2026-04-24T15:42:44.288Z" (4/24/2026, 3:42:44 PM)
model: "xiaomi/mimo-v2.5-pro-20260422"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1240
moderation_latency: (null)
generation_time: 10363
tokens_prompt: 77
tokens_completion: 498
native_tokens_prompt: 315
native_tokens_completion: 431
native_tokens_completion_images: (null)
native_tokens_reasoning: 168
native_tokens_cached: 256
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0014032
router: (null)
provider_responses:
  0:
    endpoint_id: "5c24ecc9-5ba1-4af0-8877-3402889e7d0a"
    id: "fa4b9bc3cdd0495e9b161bd514ba2f7a"
    is_byok: false
    latency: 1158
    model_permaslug: "xiaomi/mimo-v2.5-pro-20260422"
    provider_name: "Xiaomi"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777045364-3jUa6ji7MtiA61LsH3mL"
session_id: (null)
api_type: "completions"
id: "gen-1777045364-I4e3mSpi68hPmlSabiN3"
upstream_id: "fa4b9bc3cdd0495e9b161bd514ba2f7a"
total_cost: 0.0014032
cache_discount: 0.0002048
upstream_inference_cost: 0
provider_name: "Xiaomi"
response_cache_source_id: (null)
Evaluation details
Result     Evaluator                      Details                              Meta Data
60.28%     Matches word count             n/a                                  neededClean: false; words: 185
36.79%     Dialogue to Total Word Ratio   Ratio: 100.00%, Deviation: 10.00%    neededClean: false; wordsTotal: 188; wordsDialogue: 188
48.5316%   Overall
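The Details entry for the dialogue evaluator can be reproduced from its metadata: the ratio is wordsDialogue divided by wordsTotal, and the deviation is the distance from the 90% target named in the scenario. This is a minimal sketch, assuming the deviation is the absolute difference between the two ratios. The function name `dialogue_ratio_details` is hypothetical, and the step that turns the deviation into the 36.79% score is not shown on the page, so it is left out.

```python
def dialogue_ratio_details(words_dialogue, words_total, target_ratio=0.90):
    """Return (ratio, deviation) for a dialogue-share check.

    Assumption: deviation is |ratio - target_ratio|; the evaluator's own
    definition is not documented on the page.
    """
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    ratio = words_dialogue / words_total
    deviation = abs(ratio - target_ratio)
    return ratio, deviation


# Values taken from the evaluator's metadata for this run.
ratio, deviation = dialogue_ratio_details(words_dialogue=188, words_total=188)
summary = f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}"
print(summary)  # Ratio: 100.00%, Deviation: 10.00%
```

The response is entirely dialogue, so it overshoots the 90% target by ten points. The word-count evaluator reports 185 words while this one reports 188, which suggests the two evaluators split words differently.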