NC Bench
Run: 2026-02-20-01KHXPKKYTG6ZC8YCF43Q9GNF7
Run Details
Model:
Mistral Large 3
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
15.4s
Cost:
$0.001123
Provider:
Mistral
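The duration above can be cross-checked against the raw run metadata, and combined with the completion token count to estimate throughput. The following is a minimal sketch, not part of NC Bench itself: the variable names are illustrative, and it assumes that `generation_time` is reported in milliseconds and that `native_tokens_completion` is the right token count to divide by.

```python
# Values copied from the run metadata recorded for this run.
generation_time_ms = 15361        # "generation_time" (assumed to be milliseconds)
native_tokens_completion = 726    # "native_tokens_completion"

# Convert to seconds; this should agree with the displayed Duration of 15.4s.
generation_time_s = generation_time_ms / 1000

# Rough throughput estimate: completion tokens produced per second of generation.
tokens_per_second = native_tokens_completion / generation_time_s

print(f"duration: {generation_time_s:.1f}s")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumptions the run works out to roughly 47 completion tokens per second.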
Evaluation summary
Score | Evaluator
43.38% | Matches word count
46.51% | Dialogue to Total Word Ratio
44.9420% | Overall
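The page does not state how the overall score is aggregated. It is close to the unweighted mean of the two evaluator scores, so the sketch below checks that reading. The equal weighting is an assumption, and the small gap from the reported value is presumably because the displayed per-evaluator scores are rounded.

```python
# Displayed evaluator scores from the summary, in percent.
scores = {
    "Matches word count": 43.38,
    "Dialogue to Total Word Ratio": 46.51,
}

# Assumption: the overall score is an unweighted mean of the evaluator scores.
overall_estimate = sum(scores.values()) / len(scores)

# Overall score as reported on the page.
reported_overall = 44.9420

print(f"estimated {overall_estimate:.4f}% vs reported {reported_overall:.4f}%")
```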
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the glittery unicorn horn strapped to her forehead and squinted at the inflatable bounce house sagging under the weight of a dozen sugar-crazed six-year-olds. "You’re telling me we’re hitting a *princess-themed* heist? I didn’t sign up to be outshone by a five-year-old in a tutu." Bill, currently wrestling a piñata shaped like a dragon into the back of their van, shot her a look. "You’re the one who said we needed a ‘low-risk, high-reward’ job. Well, congratulations, Mol. This is *exactly* that. No guards, no lasers, just a bunch of toddlers who think we’re the entertainment." Molly plucked a cupcake from a nearby table and peeled off the wrapper. "I said *low-risk*, not *humiliating*. We’re dressed like a rejected My Little Pony spin-off. I look like I lost a fight with a glitter factory." Bill yanked the cupcake out of her hand and took a bite. "You look *festive*. And stop eating the evidence. We’re supposed to be stealing the presents, not the snacks." "Says the guy who just inhaled half a frosting rose." Molly snatched the cupcake back. "Also, I *told* you we should’ve gone with the clown outfits. At least then we could’ve blamed the theft on ‘the magic of misdirection.’" Bill wiped frosting off his chin. "Clowns are *way* more suspicious. Parents see a clown at a kid’s party, they start Googling ‘how to perform an exorcism.’ No, this is perfect. We’re the *magical unicorn and dragon duo*. We’re *trusted*. We’re *beloved*." Molly groaned. "We’re *one wrong move away from being on a milk carton*." Bill ignored her and pulled out a walkie-talkie. "Alright, phase one: distraction. You get the kids hyped up on juice boxes while I ‘accidentally’ knock over the gift table. Classic misdirection." Molly stared at him. "You want me to *entertain* a bunch of children? Bill, I once threatened a guy with a stapler for looking at me funny. I am *not* qualified for this." "Just do the thing where you make the balloon animals. You’re good at that." 
"I make *one* balloon sword for a kid’s birthday and suddenly I’m the next Picasso of twisted latex? No. Absolutely not. I refuse to be remembered as ‘that scary lady who made a balloon poodle that looked like a crime against nature.’" Bill sighed. "Fine. Then just stand there and look pretty. Wave your horn around. Do *something* so they’re not staring at me while I ‘reorganize’ the presents." Molly adjusted her horn again. "You realize if we get caught, our mugshots are gonna be *adorable*, right? ‘Local Criminals Apprehended in Sparkly Heist Gone Wrong.’ We’ll be memes before we even hit booking." Bill grinned. "That’s the spirit. Now come on, *Princess Sparklehoof*. Let’s go rob some kids blind." Molly muttered something unprintable and followed him toward the party, already plotting how to frame the nearest clown.
created_at
"2026-02-20T14:17:38.023Z"
2/20/2026, 2:17:38 PM
model
"mistralai/mistral-large-2512"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
642
moderation_latency
(null)
generation_time
15361
tokens_prompt
77
tokens_completion
756
native_tokens_prompt
69
native_tokens_completion
726
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0011235
router
(null)
provider_responses
0
endpoint_id
"fa718841-07ea-4b7e-b8a3-6069545c7e6a"
id
"baac515fe0114d90aa9a323034f1120a"
is_byok
false
latency
642
model_permaslug
"mistralai/mistral-large-2512"
provider_name
"Mistral"
status
200
api_type
"completions"
id
"gen-1771597058-tULVDXFKDl1UWFBhoe9G"
upstream_id
"baac515fe0114d90aa9a323034f1120a"
total_cost
0.0011235
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Mistral"
Evaluation details
Result | Evaluator | Details | Meta Data
43.38% | Matches word count | n/a | neededClean: false, words: 483
46.51% | Dialogue to Total Word Ratio | Ratio: 73.91%, Deviation: 3.91% | neededClean: false, wordsTotal: 483, wordsDialogue: 357
44.9420% | Overall
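The ratio and deviation reported for the dialogue evaluator follow directly from the two word counts in its metadata and the 70% target in the scenario. A minimal sketch of that arithmetic is shown below. The variable names are illustrative, and the mapping from deviation to the 46.51% score is not documented on this page, so it is not reproduced here.

```python
# Targets taken from the scenario: "Write 500 words with 70% dialogue".
TARGET_WORDS = 500
TARGET_DIALOGUE_PCT = 70.0

# Counts taken from the evaluator metadata.
words_total = 483
words_dialogue = 357

# Share of all words that fall inside dialogue, in percent.
ratio_pct = 100 * words_dialogue / words_total

# Absolute distance from the requested dialogue share, in percentage points.
deviation_pct = abs(ratio_pct - TARGET_DIALOGUE_PCT)

# Shortfall against the requested length, in words.
word_shortfall = TARGET_WORDS - words_total

print(f"Ratio: {ratio_pct:.2f}%, Deviation: {deviation_pct:.2f}%")
print(f"Words short of target: {word_shortfall}")
```

The first printed line matches the Details column for the dialogue evaluator. The response is also 17 words short of the requested 500.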