NC Bench
Run: 2026-02-20-01KHY8M7EX0G5EZKSWGH161TQG
Run Details
Model: Ministral 3 14B
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 6.0s
Cost: $0.000164
Provider: Mistral
Evaluation summary
Matches word count: 99.92%
Dialogue to Total Word Ratio: 92.91%
Overall: 96.4147%
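The Overall figure appears consistent with an unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so the sketch below is only a consistency check under that equal-weights assumption, using the rounded scores reported in the summary.

```python
# Assumption: Overall is the unweighted mean of the evaluator scores.
# The aggregation rule is not documented on the page; this only checks
# that the reported numbers are consistent with that reading.
from statistics import mean

evaluator_scores = {
    "Matches word count": 99.92,
    "Dialogue to Total Word Ratio": 92.91,
}

overall = mean(evaluator_scores.values())
reported_overall = 96.4147

# The inputs are rounded to two decimals, so allow a small tolerance.
print(f"computed={overall:.4f} reported={reported_overall:.4f}")
print("consistent:", abs(overall - reported_overall) < 0.01)
```

The small gap between the computed and reported values would be explained by the evaluator scores being rounded for display.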
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The backseat of Bill’s rusted-out van smelled like stale coffee and regret, which, honestly, was an improvement over the last job where it had smelled like regret and *also* regret. Molly adjusted her ski mask—black, because subtlety was her middle name—and leaned over the dashboard, where a half-eaten bag of chips had taken up permanent residence.
“You sure this is the right house?” she asked, squinting at the GPS. “Because the last time we ‘case-jobbed’ a place based on a Waze suggestion, we ended up in a *baptist church potluck*.”
Bill adjusted his own mask, which was more of a bandana tied around his forehead like a very confused pirate. “Relax, Mol. I double-checked. Seven-year-old’s birthday. *Loot city.*”
“Loot city?” Molly scoffed. “That’s not a thing. Also, how do you even know it’s a seven-year-old? Could be a seventy-year-old in a onesie.”
“Because the flyer said ‘Dinosaur Party’ and you don’t see many septuagenarians dressing up as T-Rexes unless they’re *really* committed to their fantasy.”
Molly pulled out a crumpled flyer from her pocket and unfolded it with the reverence of a monk handling a sacred text. “‘Jurassic Joy’s Big Dino Bash!’” she read. “Okay, fine. But we’re not just walking in there like it’s a *normal* party. We need a plan.”
Bill grinned, revealing a gold tooth that had seen better decades. “Oh, we’ve got a plan. Step one: We show up late. Step two: We play the confused but charming aunt and uncle who ‘forgot the invite.’ Step three: We *accidentally* knock over the piñata.”
Molly blinked. “That’s it?”
“That’s it. Chaos covers a multitude of sins, Mol. While everyone’s distracted by confetti and broken toys, we hit the present table like it’s Black Friday at Walmart.”
“And what if someone recognizes us?”
Bill shrugged. “Then we improvise. You’re good at that.”
“Improvise? Bill, the last time we improvised, we got chased by a *dog* and had to climb a fence while one of us was wearing a *sparkly leotard*.”
“That was *one time* and the dog was tiny!”
“It had *teeth*.”
Bill waved a hand. “Details. Look, the key here is *timing*. We hit when the parents are distracted, the kids are hyped, and the presents are *right there*, begging to be taken. It’s almost *too* easy.”
Molly stared at him. “You’re telling me we’re breaking into a *children’s party* because it’s *too easy*?”
“Exactly! Low security, high reward. Plus, if we get caught, we can just say we were looking for the bathroom.”
Molly groaned, rubbing her temples. “We are *terrible* people.”
“No, we’re *efficient* people. Now, what’s the move? You wanna case the joint tonight, or we just roll in tomorrow and hope for the best?”
Molly sighed, pulling out a pair of latex gloves from her pocket. “Fine. But if I see *one* more balloon animal, I’m throwing you into the punch bowl.”
Bill clapped his hands together. “Now *that’s* a plan I can get behind.”
created_at: "2026-02-20T19:32:32.361Z" (2/20/2026, 7:32:32 PM)
model: "mistralai/ministral-14b-2512"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 212
moderation_latency: (null)
generation_time: 5773
tokens_prompt: 77
tokens_completion: 816
native_tokens_prompt: 69
native_tokens_completion: 753
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0001644
router: (null)
provider_responses:
  0:
    endpoint_id: "0b53717b-9af5-4de3-80af-ebedd2adf981"
    id: "8b723d8f57f847fb9418d793f214cbd1"
    is_byok: false
    latency: 212
    model_permaslug: "mistralai/ministral-14b-2512"
    provider_name: "Mistral"
    status: 200
api_type: "completions"
id: "gen-1771615952-EyqGpxgiu3qrWD4BCGmd"
upstream_id: "8b723d8f57f847fb9418d793f214cbd1"
total_cost: 0.0001644
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Mistral"
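The timing and cost fields in the metadata can be cross-checked against the Run Details. The sketch below is a rough consistency check, not a description of how the page derives its figures. It assumes that latency and generation_time are in milliseconds and that total_cost is in US dollars; neither unit is stated in the metadata.

```python
# Values copied from the run metadata.
# Assumptions: latency and generation_time are milliseconds; total_cost is USD.
latency_ms = 212
generation_time_ms = 5773
native_tokens_prompt = 69
native_tokens_completion = 753
total_cost_usd = 0.0001644

# Time to first token plus generation time, compared with "Duration: 6.0s".
duration_s = (latency_ms + generation_time_ms) / 1000

# Output throughput during generation, in native tokens per second.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Blended price per million native tokens (prompt and completion combined).
native_tokens_total = native_tokens_prompt + native_tokens_completion
usd_per_million_tokens = total_cost_usd / native_tokens_total * 1_000_000

print(f"duration: {duration_s:.1f}s")
print(f"throughput: {tokens_per_second:.0f} tokens/s")
print(f"blended price: ${usd_per_million_tokens:.2f} per million tokens")
```

Under these assumptions the summed latency and generation time agree with the reported 6.0s duration, which suggests the duration covers the whole request rather than generation alone.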
Evaluation details
Matches word count
Result: 99.92%
Details: n/a
Meta Data: neededClean: false, words: 503
Dialogue to Total Word Ratio
Result: 92.91%
Details: Ratio: 68.79%, Deviation: 1.21%
Meta Data: neededClean: false, wordsTotal: 503, wordsDialogue: 346
Overall: 96.4147%
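The reported ratio and deviation follow directly from the word counts in the meta data: 346 dialogue words out of 503 total, measured against the 70% target in the scenario. The sketch below reproduces that arithmetic. The two counting helpers are hypothetical: they assume a word is a whitespace-separated token and that dialogue is any text between curly double quotes, which may differ from what the evaluator does. How the 1.21% deviation maps to the 92.91% score is not stated on the page, so it is not modelled here.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed definition of a word)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words between curly double quotes (assumed dialogue delimiter)."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in spans)


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Dialogue share of the text, as a percentage."""
    return 100.0 * words_dialogue / words_total


def deviation(ratio_pct: float, target_pct: float = 70.0) -> float:
    """Absolute distance from the target ratio, in percentage points."""
    return abs(target_pct - ratio_pct)


# Counts reported in the evaluation meta data.
ratio = dialogue_ratio(words_dialogue=346, words_total=503)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation(ratio):.2f}%")

# Small illustration of the hypothetical counting helpers.
sample = "Molly blinked. “That’s it?” “That’s it.”"
print(count_dialogue_words(sample), "of", count_words(sample), "words are dialogue")
```

The nested single quotes in the response (for example ‘Dinosaur Party’) sit inside double-quoted spans, so this delimiter choice counts them as dialogue.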