NC Bench
Run: 2026-06-01-01KT1HMKG3GZDR49CSFNXBKHQ8
Run Details
Model: MiniMax M3
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 2m 2s
Cost: $0.002064
Provider: Minimax
Evaluation summary
Score | Evaluator
0.01% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0049% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the garden gnome, binoculars raised. Bill adjusted his fake mustache and dropped down beside her, his elbow sinking into a flower bed. "Okay, Bill. Talk to me about the target." "It's a birthday party, Molly. Six-year-olds. Hot dogs. The works." "And the perimeter?" "Pink and purple balloons tied to the mailbox. One piñata shaped like a dinosaur that's already missing an eye." "Tragic backstory?" "Probably got it at Party City on clearance." "Hostiles?" "Three parents. One is wearing a shirt that says 'Birthday Star Dad' and looks like he could benchpress the bouncy castle." "Civilians?" "Twenty-seven. Most under four feet tall. High risk of sticky fingers and sudden emotional outbursts." "Entry points?" "Back gate. Kitchen window. Or we just walk in the front like we belong, because, you know, we do look like parents." "Too risky. You forgot the safe word." "Macaroni art." "Correct. Now, the objective. We're not here for the cake." "Nope." "Not here for the goody bags." "Definitely not." "We're here for the haul. The big one." "The presents table." "Twenty-three unwrapped gifts and counting, Bill. Unwrapped, Bill. It's going to be a candy store for a professional like you." "What about the entertainment? That one clown with the accordion?" "Baby Shark on repeat for forty-five minutes. It's our cover. By the third loop, every adult here will be brain-dead. We move during the cake cutting." "And if the birthday kid cries?" "We pivot to Plan B." "What's Plan B?" "We don't have a Plan B, Bill. I just like saying it. Makes me feel like a real criminal. Okay, what about the getaway?" "I've got the van idling two blocks away. License plate says 'BIRTHDAY BOY' because the guy who sold it to me is a monster." "Backup?" "My mom. She's in the car with a crossbow and a thermos of chamomile tea." "Is she on board?" "She's on board with whatever I'm on board with. That's the deal." "Good. Now, the disguise. 
You're wearing the 'Hi, I'm A Parent' lanyard, right?" "And the polo with the popped collar. I look like a divorced dad from the 90s." "Perfect. Unapproachable. Invisible." "What's my role once inside?" "You distract. You corner the nearest parent and start talking about cryptocurrency. Adults can't stand it. They'll throw their kids at you just to get away." "And you?" "I mingle. I compliment the fruit platter. I ask the host where she got the bunting. Meanwhile, the presents slide into a diaper bag I sewed myself. Five pockets, Bill. Reinforced stitching. It can hold a Lego Millennium Falcon and a small dog." "Stitching rated for pressure?" "For a 7-year-old tugging on it, absolutely." "What's the exit?" "We leave the second someone yells 'piñata time.' The chaos will be our friend." "How do I know when to jump in?" "Bill, you're a professional." "I'm a physical therapist who got fired for stealing tongue depressors, Molly." "Exactly. You're a professional. Now, one more time. What's the plan?" "We hit the party like it's the Louvre. Confident, stylish, and we absolutely do not accept juice boxes from any child under ten." "And if they offer?" "We decline. It's how they get you."
created_at
"2026-06-01T12:14:13.158Z"
6/1/2026, 12:14:13 PM
model
"minimax/minimax-m3-20260531"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
939
moderation_latency
(null)
generation_time
68853
tokens_prompt
77
tokens_completion
2037
native_tokens_prompt
236
native_tokens_completion
1705
native_tokens_completion_images
(null)
native_tokens_reasoning
1160
native_tokens_cached
222
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
service_tier
(null)
usage
0.00206352
router
(null)
provider_responses
0
endpoint_id
"6851b352-a7e2-45ad-a69e-67ed1208a416"
id
"066caa958bbe4768f23a017cefd8ebf9"
is_byok
false
latency
938
model_permaslug
"minimax/minimax-m3-20260531"
provider_name
"Minimax"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780316053-bNYFqCEp8gcCwmBzNEve"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1780316053-fc6WAvxSN4EyiUjnxRHY"
upstream_id
"066caa958bbe4768f23a017cefd8ebf9"
total_cost
0.00206352
cache_discount
0.00010656
upstream_inference_cost
0
provider_name
"Minimax"
response_cache_source_id
(null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.01% | Matches word count | n/a | neededClean: false, words: 531
0.00% | Dialogue to Total Word Ratio | Ratio: 95.36%, Deviation: 25.36% | neededClean: false, wordsTotal: 539, wordsDialogue: 514
0.0049% | Overall
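The ratio and deviation reported for the dialogue evaluator can be reproduced from the recorded word counts. The sketch below is only an illustration of that arithmetic: the function names are hypothetical, the 70% and 500-word targets are taken from the scenario text, and treating the deviation as an absolute difference in percentage points is an assumption. How the benchmark turns these deviations into the 0.00% and 0.01% scores is not shown on this page, so it is not modelled here.

```python
# Hypothetical helpers; NC Bench's actual evaluator code is not shown in this report.

def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return the share of dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation(actual_pct: float, target_pct: float) -> float:
    """Absolute distance from the target, in percentage points (assumed definition)."""
    return abs(actual_pct - target_pct)


# Values recorded in the evaluation details above.
WORDS_DIALOGUE = 514
WORDS_TOTAL = 539
TARGET_DIALOGUE_PCT = 70.0  # from the scenario: "70% dialogue"

ratio = dialogue_ratio(WORDS_DIALOGUE, WORDS_TOTAL)
dev = deviation(ratio, TARGET_DIALOGUE_PCT)
summary = f"Ratio: {ratio:.2f}%, Deviation: {dev:.2f}%"
print(summary)  # Ratio: 95.36%, Deviation: 25.36%

# Word-count overshoot against the 500-word target, using the
# "Matches word count" evaluator's own count of 531 words.
WORDS_COUNTED = 531
TARGET_WORDS = 500
extra_words = WORDS_COUNTED - TARGET_WORDS
overshoot_pct = 100.0 * extra_words / TARGET_WORDS
print(f"{extra_words} words over target ({overshoot_pct:.1f}%)")
```

Note that the two evaluators report different totals (531 and 539 words), which suggests they tokenize the response differently; the sketch keeps each evaluator's own count rather than reconciling them.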