NC Bench
Run: 2026-02-20-01KHXPFEJY41JV9HAC2S2YA5Q3
Run Details
Model: Mistral Large 3
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 15.8s
Cost: $0.001140
Provider: Mistral
Evaluation summary
Score | Evaluator
98.71% | Matches word count
41.57% | Dialogue to Total Word Ratio
70.1433% | Overall
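The overall figure is consistent with an unweighted mean of the two evaluator scores. The sketch below checks that arithmetic. The averaging rule is an assumption inferred from the displayed numbers, not a documented NC Bench formula, and `overall_score` is a hypothetical helper name.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores (assumed aggregation rule)."""
    return sum(scores) / len(scores)

# Evaluator scores as displayed in the summary (already rounded to 2 decimals).
displayed = [98.71, 41.57]
overall = overall_score(displayed)
print(f"{overall:.2f}%")
```

The result differs from the reported 70.1433% only in the later decimals, which is what one would expect if the page averages the unrounded evaluator scores.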
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the glittery unicorn horn strapped to her forehead and squinted at the inflatable bounce house sagging under the weight of a dozen sugar-crazed six-year-olds. "You’re telling me we’re hitting a *princess-themed* heist? I didn’t sign up to be outshone by a five-year-old in a tutu." Bill, already halfway into a neon pink "Happy Birthday, Emma!" T-shirt two sizes too small, shot her a look. "You’re the one who said we needed a low-risk job. What’s lower risk than a bunch of kids who think we’re just the ‘fun aunt and uncle’ here to pass out juice boxes?" Molly yanked the shirt over her head, wincing as the sequins scratched her neck. "Low risk? Bill, we’re about to commit grand larceny in broad daylight while dressed like rejected extras from a My Little Pony reboot. If this goes south, we’re getting arrested by a cop who’s *also* in a princess costume." "Relax. Kids are the best alibis. No one suspects the guy handing out balloon animals." Bill twisted a long, twisty balloon into what was supposed to be a poodle but looked more like a sad, deflating slug. "Besides, have you *seen* the loot? That pile of presents is bigger than my last three jobs combined." Molly eyed the gift table, where a mountain of beribboned boxes teetered precariously. "Yeah, and half of it’s probably finger paintings and homemade slime. What’s the resale value on a macaroni necklace?" "Focus on the big-ticket items. The parents went all out—iPads, gift cards, that drone I saw in the corner. We grab the electronics, ditch the glitter bombs, and we’re out before the piñata even gets cracked." Molly groaned as a toddler in a dinosaur costume toddled past, clutching a half-eaten cupcake. "I hate kids. They’re sticky, they cry for no reason, and they *always* ask if I’m a real princess." "Just bat your eyelashes and say ‘of course, darling’ in a British accent. Works every time." Bill tossed her a handful of temporary tattoos. "Now slap these on and start mingling. 
We’ve got twenty minutes before the magician shows up and ruins our distraction." Molly peeled the backing off a glittery butterfly tattoo and pressed it onto her forearm. "You realize if we get caught, our mugshots are gonna be *adorable*, right? ‘Wanted: Two Career Criminals Last Seen in Tiaras.’" Bill grinned, adjusting the oversized bowtie choking his neck. "Hey, if we’re going down, we’re going down in style. Now go charm the birthday girl before she starts demanding we do the Macarena." Molly sighed, smoothing down her tulle skirt. "Fine. But if I have to sing ‘Let It Go’ one more time, I’m taking the drone *and* your cut." Bill clapped her on the back. "That’s the spirit. Now move—here comes the mom with the cake." Molly pasted on a smile as a harried woman in a "World’s Best Mom" sash approached, holding a sparkler-topped monstrosity. "Happy birthday, sweetie! Who’s ready for the *best day ever*?"
created_at: "2026-02-20T14:15:21.451Z" (2/20/2026, 2:15:21 PM)
model: "mistralai/mistral-large-2512"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 447
moderation_latency: (null)
generation_time: 15685
tokens_prompt: 77
tokens_completion: 770
native_tokens_prompt: 69
native_tokens_completion: 737
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 48
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00114
router: (null)
provider_responses:
  0:
    endpoint_id: "fa718841-07ea-4b7e-b8a3-6069545c7e6a"
    id: "c276bccf7e3647dd846f92e7ba71e8d8"
    is_byok: false
    latency: 447
    model_permaslug: "mistralai/mistral-large-2512"
    provider_name: "Mistral"
    status: 200
api_type: "completions"
id: "gen-1771596921-FXqNzlJRonoHvVBFBZ2Z"
upstream_id: "c276bccf7e3647dd846f92e7ba71e8d8"
total_cost: 0.00114
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Mistral"
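As a rough reading of the metadata above, generation throughput can be derived from `generation_time` and `native_tokens_completion`. This is a minimal sketch that assumes `generation_time` is reported in milliseconds (the page does not state the unit); `tokens_per_second` is a hypothetical helper.

```python
def tokens_per_second(tokens, elapsed_ms):
    """Completion tokens generated per second of generation time."""
    return tokens / (elapsed_ms / 1000.0)

# Values copied from the run metadata.
generation_time_ms = 15685        # assumed unit: milliseconds
native_tokens_completion = 737

throughput = tokens_per_second(native_tokens_completion, generation_time_ms)
print(f"{throughput:.1f} tokens/s")
```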
Evaluation details
Result | Evaluator | Details | Meta Data
98.71% | Matches word count | n/a | neededClean: false, words: 506
41.57% | Dialogue to Total Word Ratio | Ratio: 65.81%, Deviation: 4.19% | neededClean: false, wordsTotal: 506, wordsDialogue: 333
70.1433% | Overall
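The ratio and deviation reported for the dialogue evaluator follow directly from the word counts in its metadata and the 70% target in the scenario. The sketch below reproduces them; `dialogue_ratio` is a hypothetical helper, and the mapping from deviation to the 41.57% score is not shown on the page, so it is not reproduced here.

```python
def dialogue_ratio(words_dialogue, words_total):
    """Share of words that fall inside dialogue, as a percentage."""
    return 100.0 * words_dialogue / words_total

target = 70.0                       # from the scenario: 70% dialogue
ratio = dialogue_ratio(333, 506)    # wordsDialogue, wordsTotal
deviation = abs(target - ratio)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# Ratio: 65.81%, Deviation: 4.19%
```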