NC Bench
Run: 2026-02-20-01KHXPNA6CTGDEMKDKQ9FQR6VS
Run Details
Model: Mistral Large 3
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 15.2s
Cost: $0.001182
Provider: Mistral
Evaluation summary
Score | Evaluator | Details Link
20.19% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
10.0948% | Overall
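The Overall figure above is consistent with an unweighted mean of the two evaluator scores. The page does not state the aggregation formula, so the sketch below is an assumption, and the function name `overall_score` is hypothetical.

```python
def overall_score(scores):
    """Unweighted mean of per-evaluator scores (assumed aggregation, not confirmed by the page)."""
    if not scores:
        raise ValueError("at least one evaluator score is required")
    return sum(scores) / len(scores)


# Scores as displayed in the summary: word count match and dialogue ratio.
displayed_scores = [20.19, 0.00]
print(f"Overall: {overall_score(displayed_scores):.4f}%")
```

The displayed 20.19% is itself rounded, which would explain why the mean of the displayed values differs from 10.0948% in the last digits.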
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The fluorescent lights of the diner buzzed like a swarm of angry bees, casting a sickly glow over the chipped Formica table where Molly and Bill sat. A half-eaten plate of fries congealed between them, forgotten in favor of the blueprint spread across the table. Molly tapped a manicured nail against the paper, her lips pursed in concentration. "You’re telling me we’re hitting a *kids’ party*?" Bill said, rubbing his temples like he was trying to erase the idea from his brain. "I’ve robbed banks, I’ve lifted jewels from under the noses of guys with guns, but this? This is where I draw the line." Molly smirked, flipping her ponytail over her shoulder. "Oh, come on. It’s perfect. No security, no alarms, just a bunch of sugar-high six-year-olds and a clown who’s probably too busy crying in the bathroom to notice us." Bill groaned, slumping back in the booth. "I *hate* clowns. And kids. And clowns who work kids’ parties. They’re like the unholy offspring of a mime and a tax auditor." "Focus, Bill." Molly leaned in, her voice dropping to a conspiratorial whisper. "The birthday girl’s dad is some tech bro who just sold his app for, like, a billion dollars. The presents are gonna be *insane*. I’m talking iPads, designer sneakers, maybe even a pony." Bill’s eyes narrowed. "A pony?" "Okay, probably not a pony. But still. We’re talking high-end loot, and the best part? No one’s gonna suspect a thing. We just walk in, grab the gifts, and walk out. Easy." Bill picked up a fry, examined it like it held the secrets of the universe, then tossed it back onto the plate. "And how, exactly, do you propose we *walk in*? We can’t just stroll through the front door. ‘Oh hey, sorry to interrupt the magic show, just here to liberate some Barbie Dreamhouses.’" Molly grinned, pulling a crumpled flyer from her pocket. "We’re caterers." Bill blinked. "We’re *what* now?" "Caterers. Look." She smoothed out the flyer on the table. 
It was for *Sugar Rush Catering*, complete with a cartoon cupcake wearing sunglasses. "I swiped this from the community board at the grocery store. All we gotta do is show up in aprons, act like we belong, and boom—access granted." Bill stared at her. "You want us to *dress up as caterers*." "Yes." "And then just… take the presents." "Yes." "And what if someone *notices*? Like, say, the actual caterers?" Molly waved a hand. "Please. Have you ever met a caterer at a kids’ party? They’re too busy dodging flying juice boxes and pretending not to hear the parents complain about the gluten-free options. We’ll be in and out before anyone even realizes the cake’s missing." Bill sighed, rubbing his face. "I can’t believe I’m saying this, but… fine. But if we get arrested, I’m telling the cops you forced me into this." Molly clapped her hands together. "That’s the spirit! Now, first order of business—we need aprons. And maybe some hairnets. Nothing says ‘trustworthy’ like a hairnet." Bill groaned again, louder this time. "I’d rather get caught."
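For readers who want to check the dialogue share of a response like the one above, here is a minimal sketch of one way to count it. It assumes dialogue means text between straight double quotes and that words are whitespace-separated tokens. The evaluator's actual rules are not shown on this page, so `count_dialogue_words` is a hypothetical helper and its counts may differ from the reported ones.

```python
import re


def count_dialogue_words(text):
    """Return (dialogue_words, total_words) for text using straight double quotes.

    Assumption: every "..." span is dialogue and words are split on whitespace.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    total_words = len(text.split())
    return dialogue_words, total_words


sample = 'Bill blinked. "We are what now?" Molly grinned. "Caterers. Look."'
dialogue, total = count_dialogue_words(sample)
print(dialogue, total, f"{dialogue / total:.0%}")
```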
created_at: "2026-02-20T14:18:33.559Z"
model: "mistralai/mistral-large-2512"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 349
moderation_latency: (null)
generation_time: 15129
tokens_prompt: 77
tokens_completion: 805
native_tokens_prompt: 69
native_tokens_completion: 765
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.001182
router: (null)
provider_responses:
  0:
    endpoint_id: "fa718841-07ea-4b7e-b8a3-6069545c7e6a"
    id: "fb223656877343d3981d49429614db53"
    is_byok: false
    latency: 349
    model_permaslug: "mistralai/mistral-large-2512"
    provider_name: "Mistral"
    status: 200
api_type: "completions"
id: "gen-1771597113-yK6IxvUg7b3ilsI5Vu3c"
upstream_id: "fb223656877343d3981d49429614db53"
total_cost: 0.001182
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Mistral"
Evaluation details
Result | Evaluator | Details | Meta Data
20.19% | Matches word count | n/a | neededClean: false; words: 520
0.00% | Dialogue to Total Word Ratio | Ratio: 65.19%, Deviation: 35.19% | neededClean: false; wordsTotal: 520; wordsDialogue: 339
10.0948%
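The reported ratio and deviation can be reproduced from the word counts in the meta data. The sketch below takes the 30% target from the scenario and assumes the deviation is the absolute difference from that target. How the deviation maps to the 0.00% score is not shown on the page and is not modelled here. The function names are hypothetical.

```python
TARGET_RATIO = 0.30  # from the scenario: "Write 500 words with 30% dialogue"


def dialogue_ratio(words_dialogue, words_total):
    """Share of all words that fall inside dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation_from_target(ratio, target=TARGET_RATIO):
    """Absolute distance between the measured ratio and the target (assumed definition)."""
    return abs(ratio - target)


ratio = dialogue_ratio(339, 520)  # wordsDialogue, wordsTotal
deviation = deviation_from_target(ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 65.19%, Deviation: 35.19%"
```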