NC Bench
Run: 2026-02-17-01KHNEV5H8M52J7GPPN2D83RE9
Run Details
Model: Minimax M2.5
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 6m 51s
Cost: $0.027484
Provider: Novita
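The duration and cost above can be related to the token counts recorded in the run metadata further down this page. The following is a rough sketch only: the variable names mirror the metadata keys, and it assumes `generation_time` is in milliseconds (consistent with the 6m 51s duration) and that a blended per-token cost is a meaningful summary. It is not a description of how the provider actually bills.

```python
# Values copied from this run's metadata (see the Meta fields on this page).
native_tokens_prompt = 101
native_tokens_completion = 22878
native_tokens_reasoning = 22190
generation_time_ms = 410503  # assumption: milliseconds
total_cost = 0.0274839       # USD

# Share of completion tokens spent on reasoning rather than the visible scene.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Completion throughput over the whole generation.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Blended cost per million native tokens (prompt and completion together).
total_tokens = native_tokens_prompt + native_tokens_completion
cost_per_million = total_cost / total_tokens * 1_000_000

print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
print(f"blended cost: ${cost_per_million:.2f} per million tokens")
```

Under these assumptions, most of the completion budget went to reasoning tokens, which is why a roughly 500-word scene took several minutes to produce.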
Evaluation summary
Score | Evaluator
98.71% | Matches word count
1.99% | Dialogue to Total Word Ratio
50.3488% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly leaned against the community center's peeling brick, neon sign flickering above weekend events. She tapped a glossy flyer advertising 'Totally Awesome Super Kidz Party – Bring a Present, Get a Cake!' She smirked, then turned to Bill, who hunched over a hand‑drawn map, a half‑eaten donut in his other hand. The hallway was strung with streamers and a balloon arch that looked like a rainbow exploded. A banner shouted “Happy Birthday, Emma!” in glittery letters that cost more than gifts they planned to swipe. Schedule said cake at three, presents piled like a mountain ready to be plundered. Molly checked her watch and muttered, 'We have exactly ninety minutes to get in, grab the loot, and get out before the kids start crying.' Bill licked powdered sugar from his fingers and whispered, 'The back door’s alarm is a joke, I can disable it with a paperclip and a prayer.' "Molly: If we’re going in, we need a disguise that makes the kids think we belong, not a cop with a badge." "Bill: Clowns are out—those painted faces scream ‘suspicious,’ plus the last birthday party we crashed ended in a pie‑filled riot." "Molly: What if we pretend to be the entertainment? A quick puppet show could let us slide in unnoticed." "Bill: I love the puppet angle. We can stash a duffel under the stage while the toddlers giggle at the squeaky mouse." "Molly: Perfect. Now we need a diversion—the moment the cake appears, the kids will swarm like bees to honey." "Bill: I’ve got a plan for that. We rig a confetti cannon inside the frosting—once it pops, chaos reigns." "Molly: While they’re scrambling for glitter, we swoop in, grab the presents, and slip through the kitchen door." "Bill: The kitchen door’s alarm is a joke—I can disable it with a paperclip and a prayer, just like I said." "Molly: Then we head to the back alley where the old ice‑cream truck waits, disguised as a delivery van." "Bill: Perfect. 
We'll toss the loot into the van, hop in, and speed off before anyone notices the presents vanished." "Molly: First, we grab the biggest gift—the shiny red bike—then the game console, and don’t forget the goody bags." "Bill: I call dibs on the console, but I’ll settle for the stack of board games if the bike is too heavy." "Molly: We split the loot 70/30—my crowbar does the heavy lifting, you get the easy haul." "Bill: Fine, 70/30. Meet me at the back door at 2:45, right after the cake cutting, when the kids are sugar‑crazed." "Molly: If anything goes sideways, we have a signal—shout ‘Pizza!’ and we’ll abort, pretending it’s a treat." "Bill: That’s the dumbest escape plan I’ve ever heard, but it might just work—let’s roll." "Bill: Ready when you are." They slipped on fake mustaches, checked the timing on a battered phone, and slipped into the shadows, their hearts thudding like drumbeats in a cartoon heist. The streetlights buzzed like eager audience members in the dark.
created_at: "2026-02-17T09:34:52.957165+00:00" (2/17/2026, 9:34:52 AM)
model: "minimax/minimax-m2.5-20260211"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2920
moderation_latency: (null)
generation_time: 410503
tokens_prompt: 77
tokens_completion: 21351
native_tokens_prompt: 101
native_tokens_completion: 22878
native_tokens_completion_images: 0
native_tokens_reasoning: 22190
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0274839
router: (null)
provider_responses:
  0:
    id: "8f543901987c03726651c52ecf6cea8d"
    status: 200
    is_byok: false
    latency: 2920
    endpoint_id: "05387b87-6e10-4520-87e9-be7c49693ba1"
    provider_name: "Novita"
    model_permaslug: "minimax/minimax-m2.5-20260211"
api_type: "completions"
id: "gen-1771320481-DxxfBlGqSBZDqx9GjjJY"
upstream_id: "8f543901987c03726651c52ecf6cea8d"
total_cost: 0.0274839
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Novita"
Evaluation details
Result | Evaluator | Details | Meta Data
98.71% | Matches word count | n/a | neededClean: false; words: 506
1.99% | Dialogue to Total Word Ratio | Ratio: 78.85%, Deviation: 8.85% | neededClean: false; wordsTotal: 506; wordsDialogue: 399
50.3488% | Overall
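The ratio and deviation reported above follow directly from the word counts in the evaluator metadata. The sketch below reproduces that arithmetic. The 70% target comes from the scenario prompt. The final step assumes the overall score is an unweighted mean of the two evaluator scores, which matches the reported 50.3488% only approximately, presumably because the displayed scores are rounded. How each evaluator converts a deviation into its percentage score is not stated on this page, so it is not modeled here.

```python
TARGET_RATIO = 0.70   # from the scenario: 70% of words should be dialogue
words_total = 506     # wordsTotal from the evaluator metadata
words_dialogue = 399  # wordsDialogue from the evaluator metadata

# Fraction of the scene's words that the evaluator counted as dialogue.
ratio = words_dialogue / words_total

# Absolute distance from the requested ratio.
deviation = abs(ratio - TARGET_RATIO)

# Assumption: overall score is the plain average of the evaluator scores.
evaluator_scores = [98.71, 1.99]
overall = sum(evaluator_scores) / len(evaluator_scores)

print(f"ratio: {ratio:.2%}")
print(f"deviation: {deviation:.2%}")
print(f"overall (assumed mean): {overall:.2f}%")
```

The response overshoots the dialogue target by almost nine percentage points, which is what drives the low score on the ratio evaluator despite the near-exact word count.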