NC Bench
Run: 2026-02-20-01KHYB574YXC33JH3A3PT6EPJR
Run Details
Model: Ministral 3 3B
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 2.2s
Cost: $0.000067
Provider: Mistral
Evaluation summary
Score      Evaluator
0.00%      Matches word count
0.01%      Dialogue to Total Word Ratio
0.0051%    Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The neon glow of the party’s disco ball pulsed like a dying star, casting jagged shadows across the walls of the abandoned warehouse. Molly adjusted her gloves—too tight, too loud—and Bill groaned, rolling his eyes. “You’re gonna trip over your own feet if you keep that *dramatic* sighing,” Bill muttered, already peeling off his jacket. “We’re not here to watch a soap opera.” Molly shot him a look. “I’m not *dramatic*, I’m *prepared*.” She pulled a tiny lockpick from her sleeve and tested it against the door’s rusted hinge. “Besides, what if the presents are *magical*? We need to be ready for *everything*.” Bill groaned again, this time louder. “You’re impossible.” “And you’re *boring*,” Molly shot back. “Fine, let’s do this. But if we get caught, I’m blaming you for not listening to the plan.” The door creaked open with a groan, revealing a hallway lined with flickering fluorescent lights. Bill stepped forward, his flashlight cutting through the gloom. “Okay, so we’re in. Now, what’s the *real* plan?” Molly’s eyes narrowed. “We don’t have a plan. We just *know* we’re gonna get away with it.” Bill rolled his eyes. “You’re *insane*.” “And you’re *terrible* at this,” she countered, already scanning the room. “Look—there’s a supply closet. Perfect. We’ll hide the presents there.” Bill groaned. “You’re *terrible* at this.” Molly ignored him, already sprinting toward the closet. “Fine, fine! But if we’re caught, I’m *not* the one apologizing!” Bill caught up, shoving her aside. “You’re *terrible* at this.” The door to the supply closet creaked open, revealing a mountain of boxes—some labeled *TOY*, others just *BOX*. Bill grabbed a stack and started pilfering. “Okay, so we’re *not* stealing *everything*,” he said, tossing a toy truck into his bag. “Just the *good* ones.” Molly snatched a doll from his hand. “That’s *not* good. That’s *cute*.” Bill groaned. “You’re *terrible* at this.” Molly sighed, already pulling out a pair of scissors. “Fine. 
We’ll just *cut* the presents.” Bill stared at her. “You’re *terrible* at this.” The warehouse fell silent except for the distant hum of the party’s music, now just a faint echo. Molly smirked. “We’re gonna be *legendary*.” Bill groaned. “You’re *terrible* at this.”
created_at: "2026-02-20T20:16:46.246Z" (2/20/2026, 8:16:46 PM)
model: "mistralai/ministral-3b-2512"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 180
moderation_latency: (null)
generation_time: 1910
tokens_prompt: 77
tokens_completion: 653
native_tokens_prompt: 69
native_tokens_completion: 601
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 64
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.000067
router: (null)
provider_responses:
  0:
    endpoint_id: "b9f6828f-aef9-498a-91f5-6816fbf72420"
    id: "e81be153659f450abb7b3981c454cf27"
    is_byok: false
    latency: 180
    model_permaslug: "mistralai/ministral-3b-2512"
    provider_name: "Mistral"
    status: 200
api_type: "completions"
id: "gen-1771618606-P2cUdG1J9CF47QnbRPsG"
upstream_id: "e81be153659f450abb7b3981c454cf27"
total_cost: 0.000067
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Mistral"
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false, words: 365

Dialogue to Total Word Ratio
Result: 0.01%
Details: Ratio: 43.56%, Deviation: 13.56%
Meta Data: neededClean: false, wordsTotal: 365, wordsDialogue: 159

Overall: 0.0051%
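
The ratio and deviation reported for the dialogue evaluator follow directly from the two word counts in its meta data. The sketch below is an illustration only, not NC Bench's evaluator: the function names, the quote-matching regex, and the whitespace word split are assumptions, and the way the evaluator turns a deviation into a score is not shown on this page, so it is left out.

```python
import re

# Assumption: dialogue is any text enclosed in double quotes, straight or curly.
# The real evaluator's quoting and tokenisation rules are not documented here.
DIALOGUE_RE = re.compile(r'[“"]([^”"]*)[”"]')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def dialogue_stats(text: str) -> tuple[int, int]:
    """Return (dialogue word count, total word count) for a passage."""
    spoken = " ".join(DIALOGUE_RE.findall(text))
    return count_words(spoken), count_words(text)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = 0.30) -> tuple[float, float]:
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts taken from the evaluation details: wordsDialogue=159, wordsTotal=365.
ratio, deviation = ratio_and_deviation(159, 365)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 43.56%, Deviation: 13.56%"
```

Running `dialogue_stats` on the model response may not reproduce 159 and 365 exactly, since the evaluator's own word-splitting rules could differ from this sketch.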