NC Bench
Run: 2026-02-17-01KHNETT9DH7CHWBG5KNVY5SJR
Run Details
Model:
Minimax M2.5
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
2m 6s
Cost:
$0.016061
Provider:
SiliconFlow
Evaluation summary
Score      Evaluator
0.00%      Matches word count
66.69%     Dialogue to Total Word Ratio
33.3452%   Overall
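The overall figure is consistent with an unweighted mean of the two evaluator scores. That is an inference from the numbers shown, not something the page states, and the displayed 33.3452% suggests the ratio score is slightly above the rounded 66.69%. A minimal sketch under that assumption (the function name `overall_score` is hypothetical):

```python
from statistics import mean

def overall_score(evaluator_scores):
    """Unweighted mean of per-evaluator scores (assumed aggregation rule)."""
    return mean(list(evaluator_scores))

# Rounded scores as displayed in the evaluation summary.
scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 66.69,
}

overall = overall_score(scores.values())
print(f"Overall (from rounded inputs): {overall:.3f}%")
```

Because the inputs here are the rounded values, the result can differ from the page's 33.3452% in the last decimal places.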
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Outside the modest duplex, streetlights sputtered like tired fireflies, casting erratic shadows on the cracked sidewalk. Inside, the cramped living room reeked of instant coffee and ambition, as Molly and Bill hunched over a folding table covered in flyers, a map, and a half‑eaten pizza. The Hendersons' annual children's birthday bash was the perfect camouflage: a clown, a magician, a cascade of balloons, and a mountain of brightly wrapped presents stacked beneath a glittering banner. Sir Bark‑a‑lot, the neighbor's yappy terrier, would be too busy chasing squirrels to notice a couple of extra ‘entertainers’ slipping through the side gate. The security camera at the front gate had a blind spot near the inflatable obstacle course, offering a narrow window of opportunity before the parents called the kids to the cake. Molly tapped the map with a red pen, circling the pile of gifts, while Bill practiced his best innocent grin in the dusty mirror.
Molly: "Alright, Bill, here's the plan. We pose as the entertainment."
Bill: "You want me to be a clown again? Remember the incident with the glitter bomb."
Molly: "No, we'll be magicians. Kids love a good rabbit-out-of-hat."
Bill: "And the parents? They hover like vultures around the cake."
Molly: "We'll slip a harmless dye in the lemonade. It’ll turn their tongues blue, and they'll run to the bathroom."
Bill: "Blue tongues, really? That’s the most ridiculous distraction I’ve ever heard."
Molly: "It’ll buy us five minutes. While they’re freaking out, we grab the gift pile."
Bill: "What about Sir Bark‑a‑lot? That terrier has a sixth sense for trouble."
Molly: "We’ll toss a squeaky toy to the left. He’ll chase it, and we slip right."
Bill: "The security camera at the front gate—any blind spots?"
Molly: "There’s a gap near the bouncy castle. The angle is too wide for the lens."
Bill: "Perfect. We’ll slip in through the side gate, act casual, and head straight for the presents."
Molly: "I’ll wear the magician’s cape. It has extra pockets for quick grabs."
Bill: "And I’ll bring a fake ‘gift’ box with a false bottom. We can stash the loot inside."
Molly: "We need a signal. When I pull the rabbit from the hat, that’s our cue."
Bill: "What if the rabbit actually jumps? It might cause a panic."
Molly: "That’s the panic we need. While everyone’s laughing, we swipe."
Bill: "And the kids? We don’t want any tears. We’ll leave a glittery ‘thank you’ note."
Molly: "Exactly. They’ll think the presents were a gift from the magician. Pure generosity."
Bill: "Alright, let’s rehearse the swap. I’ll pretend to hand you a wand, you grab a box."
Molly: "Remember, light fingers, not heavy hands. The gifts are already wrapped, so don’t crush them."
Bill: "Got it. We’ll be as gentle as a ninja stealing a cookie from a toddler."
Molly: "The time is set. We go in at 3:15, just as the cake is being cut."
Bill: "And we’ll be out by 3:40, before the parents start the photo frenzy."
Molly: "Let’s do a quick walk‑through tomorrow. Check the gate, the camera, the dog."
Bill: "And grab a coffee. I’ll need my energy for the heist of the century."
created_at
"2026-02-17T09:29:56.81548+00:00"
2/17/2026, 9:29:56 AM
model
"minimax/minimax-m2.5-20260211"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
786
moderation_latency
(null)
generation_time
125909
tokens_prompt
77
tokens_completion
13230
native_tokens_prompt
98
native_tokens_completion
13360
native_tokens_completion_images
(null)
native_tokens_reasoning
12596
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0160614
router
(null)
provider_responses
0
id
"019c6aed6af2823628eda0e8c2d2a760"
status
200
is_byok
false
latency
786
endpoint_id
"e89b11ef-ca0e-4cc1-9333-69343ec25819"
provider_name
"SiliconFlow"
model_permaslug
"minimax/minimax-m2.5-20260211"
api_type
"completions"
id
"gen-1771320469-jI8L9UmVJmvtGrnkoctF"
upstream_id
"019c6aed6af2823628eda0e8c2d2a760"
total_cost
0.0160614
cache_discount
(null)
upstream_inference_cost
0
provider_name
"SiliconFlow"
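A few derived figures make the metadata above easier to read. The sketch below assumes that `generation_time` is in milliseconds (125909 ms matches the 2m 6s duration shown in the run details) and that `native_tokens_completion` includes the reasoning tokens. Neither is stated on the page, and the helper name `derive_metrics` is hypothetical.

```python
# Values copied from the metadata fields listed above.
meta = {
    "generation_time": 125909,          # assumed to be milliseconds
    "native_tokens_completion": 13360,  # assumed to include reasoning tokens
    "native_tokens_reasoning": 12596,
}

def derive_metrics(meta):
    """Compute visible-token count, reasoning share, and throughput."""
    seconds = meta["generation_time"] / 1000
    completion = meta["native_tokens_completion"]
    reasoning = meta["native_tokens_reasoning"]
    return {
        "visible_tokens": completion - reasoning,
        "reasoning_share": reasoning / completion,
        "tokens_per_second": completion / seconds,
    }

metrics = derive_metrics(meta)
for name, value in metrics.items():
    print(f"{name}: {value:.4g}")
```

Under these assumptions, roughly 94% of the completion tokens were spent on reasoning, leaving 764 tokens for the visible 534-word scene.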
Evaluation details
Result     Evaluator                      Details                            Meta Data
0.00%      Matches word count             n/a                                neededClean: false, words: 534
66.69%     Dialogue to Total Word Ratio   Ratio: 72.85%, Deviation: 2.85%    neededClean: false, wordsTotal: 534, wordsDialogue: 389
33.3452%   Overall
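The reported ratio and deviation follow directly from the word counts: 389 dialogue words out of 534 total is 72.85%, which is 2.85 points off the 70% target. The sketch below reproduces that arithmetic and shows one way such counts could be obtained. The counting rule (whitespace-separated words, dialogue being anything between straight double quotes) is an assumption, since the page does not describe the benchmark's tokenizer, and `dialogue_stats` is a hypothetical name. How the deviation maps to the 66.69% score is not shown on the page and is not modeled here.

```python
import re

def dialogue_stats(text, target_ratio=0.70):
    """Return (total words, dialogue words, ratio, absolute deviation).

    Assumed rule: words are whitespace-separated tokens, and dialogue is
    the text between straight double quotes.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return total, dialogue, ratio, abs(ratio - target_ratio)

# Small illustrative sample, not the full model response.
sample = 'Molly: "Alright, Bill, here\'s the plan." Bill grinned. "Got it."'
sample_total, sample_dialogue, sample_ratio, _ = dialogue_stats(sample)

# Arithmetic on the counts reported in the evaluation details.
reported_ratio = 389 / 534
reported_deviation = abs(reported_ratio - 0.70)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {reported_deviation:.2%}")
```

Note that under this rule speaker tags such as `Molly:` count toward the total but not toward dialogue, which lowers the measured ratio for script-style output.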