NC Bench
Run: 2026-02-20-01KHY9GM4KW7RF4EJ4FNW72FWA
Run Details
Model: Ministral 3 3B
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 2.1s
Cost: $0.000067
Provider: Mistral
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 6.04%
Overall: 3.0181%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
**Title: *The Great Toy Heist*** Molly adjusted her sunglasses, peering through the window of the gleaming mansion. "Bill, we’re gonna need a distraction. Like, *a lot* of distraction." Bill wiped his brow with a napkin, already calculating. "I’ve got it. We’ll hit the front door at midnight. I’ll knock on the door like a drunk uncle, and you’ll—" "*You’ll what?*" "*You’ll sneak in through the kitchen window. Classic. The party’s in the basement, right? So we’ll—*" "*Bill, we’re not kids. We’re not even *good* at this.*" "*Then we’ll be *really* good at it.*" He grinned, tossing a stolen flashlight onto the counter. "Here. And we’ll need these. *All* of them." Molly snatched a handful of fake party hats from the table. "You’re *insane.*" She tossed one back at him. "But you’re right. We *have* to be loud. The cops will never suspect a thing if we’re making noise." Bill clapped his hands together. "Perfect. So, we’ll—" "*We’ll what?*" "*We’ll pretend to be a couple of drunk parents. Like, *very* drunk. We’ll barge in, steal the presents, and then—*" "*And then what?*" "*And then we’ll disappear into the night like ghosts.*" Molly groaned. "You’re *terrible.*" She grabbed a bag of candy from the counter. "Fine. But if we get caught, I’m blaming you." Bill smirked, already pulling out a fake mustache from his pocket. "Deal. Now, who’s gonna be the *drunk* one?" Molly sighed, but she couldn’t help laughing. "I’ll be the one who trips over the rug." Bill grinned. "Perfect. Now, let’s go. And *Bill*, if we get caught, I’m telling the cops you were the one who screamed." Molly rolled her eyes but couldn’t stop the smile. "You’re *worse.*" Bill winked. "And you’re *my* favorite." He grabbed the flashlight and the bag of candy. "Let’s do this." Molly grabbed her sunglasses and followed him out the door, already plotting their next move. "We’re gonna be *legendary.*" Bill paused at the door. "Only if we don’t get caught." Molly rolled her eyes again. 
"Oh, we’re *definitely* getting caught." Bill grinned. "Then we’ll just make it *fun.*" And with that, the two thieves vanished into the night, ready to pull off the most chaotic heist of their lives.
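The "Dialogue to Total Word Ratio" evaluator needs a way to separate spoken words from narration. The page does not show how NC Bench does this, so the following is only a minimal sketch under stated assumptions: dialogue is whatever sits between straight double quotes, and a word is any whitespace-separated token. The function names are hypothetical.

```python
import re

# Assumption: dialogue is delimited by straight double quotes, as in the response above.
QUOTED = re.compile(r'"([^"]*)"')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of a word)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count the words that fall inside double-quoted spans."""
    return sum(count_words(span) for span in QUOTED.findall(text))


def dialogue_ratio_pct(text: str) -> float:
    """Dialogue words as a percentage of all words in the text."""
    total = count_words(text)
    if total == 0:
        raise ValueError("text contains no words")
    return 100.0 * count_dialogue_words(text) / total


sample = 'Molly groaned. "You’re terrible." She grabbed a bag.'
print(count_words(sample), count_dialogue_words(sample), dialogue_ratio_pct(sample))
```

A different tokenizer or quote convention would give different counts, so this sketch is not expected to reproduce the evaluator's figures exactly.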
created_at: "2026-02-20T19:48:02.844Z" (2/20/2026, 7:48:02 PM)
model: "mistralai/ministral-3b-2512"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 187
moderation_latency: (null)
generation_time: 1853
tokens_prompt: 77
tokens_completion: 614
native_tokens_prompt: 69
native_tokens_completion: 596
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0000665
router: (null)
provider_responses: 0
endpoint_id: "b9f6828f-aef9-498a-91f5-6816fbf72420"
id: "3311ec3e65154826b4f09572e3b5df04"
is_byok: false
latency: 187
model_permaslug: "mistralai/ministral-3b-2512"
provider_name: "Mistral"
status: 200
api_type: "completions"
id: "gen-1771616882-ZlsC5TnOLwFwfcVCSJA4"
upstream_id: "3311ec3e65154826b4f09572e3b5df04"
total_cost: 0.0000665
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Mistral"
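The timing and token fields above can be combined into a rough throughput figure. This is a sketch only: it assumes `latency` and `generation_time` are in milliseconds (the page does not state units), and the helper name is hypothetical.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Completion tokens generated per second of generation time."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from the metadata above; the unit (ms) is an assumption.
native_tokens_completion = 596
generation_time_ms = 1853

rate = tokens_per_second(native_tokens_completion, generation_time_ms)
print(f"{rate:.1f} tokens/s")
```

Using `tokens_completion` (614) instead of the native count would give a slightly higher figure; which count is the better basis depends on how the two tokenizers are defined, which the page does not explain.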
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false, words: 367

Dialogue to Total Word Ratio
Result: 6.04%
Details: Ratio: 57.49%, Deviation: 7.49%
Meta Data: neededClean: false, wordsTotal: 367, wordsDialogue: 211

Overall: 3.0181%
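The ratio and deviation reported for the dialogue evaluator follow directly from the word counts: 211 dialogue words out of 367 total, measured against the 50% target in the scenario. The sketch below reproduces that arithmetic. How a deviation is mapped to the 6.04% score is not shown on the page, and treating the overall score as the plain mean of the two evaluator scores is an assumption that happens to be consistent with the displayed 3.0181%.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float = 50.0) -> float:
    """Absolute distance, in percentage points, from the target ratio."""
    return abs(ratio_pct - target_pct)


def overall_score(scores: list[float]) -> float:
    """Assumed aggregation: unweighted mean of the evaluator scores."""
    return sum(scores) / len(scores)


ratio = dialogue_ratio(211, 367)
deviation = deviation_from_target(ratio)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 57.49%, Deviation: 7.49%"

# Rounded evaluator scores from the page; the mean is about 3.02.
print(f"{overall_score([0.0, 6.04]):.2f}%")
```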