NC Bench
Run: 2026-02-17-01KHNFNHETXCP0T4BA1SPZCA7Z
Run Details
Model: Minimax M2.5
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 2m 9s
Cost: $0.012568
Provider: Fireworks
Evaluation summary
Score      Evaluator
100.00%    Matches word count
51.89%     Dialogue to Total Word Ratio
75.9435%   Overall
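The overall figure is consistent with an unweighted mean of the two evaluator scores. The sketch below checks that arithmetic; the averaging rule and the name `overall_score` are assumptions inferred from the numbers, not a documented NC Bench formula. The displayed 51.89% is rounded, so the result agrees with 75.9435% only to within rounding.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores (assumed aggregation rule)."""
    return sum(scores) / len(scores)


# Scores as displayed in the evaluation summary (percentages).
scores = [100.00, 51.89]
overall = overall_score(scores)

# The page reports 75.9435%; the small gap comes from 51.89 being rounded.
print(f"computed overall: {overall:.4f}%  reported overall: 75.9435%")
```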
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly hunched over a crumpled map of the community center, tracing the path to the birthday ballroom. The room sparkled with streamers, a cake, and a bouncy castle that seemed to groan under the energy of six‑year‑olds. Bill twirled a sword he had lifted from the loot table, grinning like a kid who’d just unlocked a cheat code. He whispered that they would blend in with the mascots, pointing at a rubber duck wobbling near the snack table. He mentioned the clown held a key to the supply closet, and the DJ was looping a “Baby Shark” track. He noted the piñata would fall three minutes after the cake cutting, when children would swarm for candy and adults would chase crumbs. Bill nodded, already picturing a haul of action figures and gift bags. They practiced a switch: swapping empty boxes for real presents, slipping through the kitchen, and slipping out via the service elevator that smelled of frosting and mischief. The scheme was absurd, but the payoff was a mountain of toys that would make any collector weep with joy. Molly: I'll slip past the magician while he fumbles slowly. Bill: I'll grab loot when the kids rush for cake.
created_at: "2026-02-17T09:44:35.473554+00:00" (2/17/2026, 9:44:35 AM)
model: "minimax/minimax-m2.5-20260211"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2004
moderation_latency: (null)
generation_time: 129307
tokens_prompt: 77
tokens_completion: 10865
native_tokens_prompt: 97
native_tokens_completion: 10449
native_tokens_completion_images: (null)
native_tokens_reasoning: 10449
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0125679
router: (null)
provider_responses:
  0:
    id: "af0b22e3-7b63-4ecc-b187-0750514723e5"
    status: 200
    is_byok: false
    latency: 2004
    endpoint_id: "c61d37ec-745c-4efd-8129-a68371e056b2"
    provider_name: "Fireworks"
    model_permaslug: "minimax/minimax-m2.5-20260211"
api_type: "completions"
id: "gen-1771321345-URNBO66pzZ1IHhO8AJ2x"
upstream_id: "af0b22e3-7b63-4ecc-b187-0750514723e5"
total_cost: 0.0125679
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Fireworks"
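The Duration and Cost shown under Run Details line up with the raw metadata if `generation_time` is read as milliseconds and `usage` as US dollars. Both units are assumptions, since the page does not state them, and the helper names below are hypothetical. A minimal sketch of the conversion:

```python
def format_duration(ms):
    """Render a millisecond count as 'Xm Ys', dropping fractional seconds."""
    total_seconds = ms // 1000
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


def format_cost(usd):
    """Render a dollar amount rounded to six decimal places."""
    return f"${usd:.6f}"


# Values taken from the metadata above.
duration_label = format_duration(129307)  # generation_time, assumed ms
cost_label = format_cost(0.0125679)       # usage, assumed USD
print(duration_label, cost_label)
```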
Evaluation details
Evaluator: Matches word count
Result: 100.00%
Details: n/a
Meta Data: neededClean: false, words: 200

Evaluator: Dialogue to Total Word Ratio
Result: 51.89%
Details: Ratio: 1.00%, Deviation: 9.00%
Meta Data: neededClean: false, wordsTotal: 200, wordsDialogue: 2

Overall: 75.9435%
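The reported ratio and deviation follow directly from the metadata: 2 dialogue words out of 200 total is 1.00%, which is 9.00 percentage points below the 10% target. The count of 2 suggests that only text inside quotation marks was treated as dialogue, which would pick up “Baby Shark” but not the script-style lines at the end of the response. That reading is an assumption, as are the function names and the regular expression below; the mapping from a 9.00% deviation to the 51.89% score is not stated on the page and is not modeled here.

```python
import re

# Matches text between straight or curly double quotes (assumed dialogue rule).
QUOTED = re.compile(r'["“]([^"”]*)["”]')


def count_words(text):
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_words(text):
    """Count words that appear inside double-quoted spans."""
    return sum(count_words(span) for span in QUOTED.findall(text))


def ratio_and_deviation(words_dialogue, words_total, target_pct=10.0):
    """Return (dialogue ratio in %, absolute deviation from target in points)."""
    ratio_pct = 100.0 * words_dialogue / words_total
    return ratio_pct, abs(target_pct - ratio_pct)


# Excerpt from the model response containing its only quoted span.
excerpt = 'the DJ was looping a “Baby Shark” track.'
quoted_count = dialogue_words(excerpt)

# Figures from the evaluator metadata: wordsDialogue = 2, wordsTotal = 200.
ratio, deviation = ratio_and_deviation(2, 200)
print(quoted_count, ratio, deviation)
```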