NC Bench
Run: 2026-02-20-01KHYA22T6TY5H3P1G18NMHHCY
Run Details
Model: Ministral 3 3B
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 2.2s
Cost: $0.000066
Provider: Mistral
Evaluation summary
| Score | Evaluator |
| --- | --- |
| 0.00% | Matches word count |
| 87.17% | Dialogue to Total Word Ratio |
| 43.5829% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
**Title: *The Great Toy Heist*** The neon glow of the party’s disco ball flickered against Molly’s cheek as she adjusted her sunglasses—*just in case*—while Bill leaned against the fence, arms crossed, eyeing the crowd like a hungry lion. “You’re *sure* this is the right place?” Bill asked, voice dripping with skepticism. “Because if it’s not, we’re gonna have to start over, and I *hate* starting over.” Molly rolled her eyes, tossing a candy wrapper into the bushes. “It’s *obviously* the right place. Look—” She pointed at the glittering cake. “That’s a *cake*. And that’s a *cake* on the table. And—” She gasped dramatically. “—there’s a *cake* in the freezer!” Bill groaned. “Molly, we’re not here to eat. We’re here to *steal*.” “Exactly!” She grinned. “And the best way to steal is to *blend in*.” “You’re *blending* into the background like a human confetti cannon.” “Shut up and help me find the back door.” Bill groaned again, but he followed her toward the alley, where the partygoers’ laughter echoed like a siren call. Molly stopped at the first house, peering through the window. “Perfect. Kids are all inside, parents are *probably* distracted by the cake.” “Or they’re all just drunk,” Bill muttered. “Or they’re all *too drunk* to notice us,” Molly corrected. “Now—” She grabbed a nearby trash can and hurled it at the fence. It bounced off harmlessly. “Not good enough.” Bill sighed. “We need a distraction.” “A *distraction*?” Molly’s eyes widened. “Like… a *fake emergency*?” “Or a *very* dramatic explosion.” “Oh, you’re *so* dramatic.” She grabbed a fire extinguisher from the shed and aimed it at the fence. “Here. *This* is a distraction.” Bill caught it just in time, holding it up like a shield. “Molly, we’re *not* doing this.” “You’re *not* the one who’s gonna get us caught,” she shot back. “Now—*whoosh!*” She sprayed the extinguisher at the fence, sending a cloud of foam into the air. The kids screamed. The parents turned. And just like that, the party was *over*. 
Bill groaned. “We’re *so* screwed.” Molly grinned. “Not yet. Let’s go.” She grabbed his arm and dragged him toward the fence. “We’ve got a *lot* of presents to steal.”
created_at: "2026-02-20T19:57:34.927Z" (2/20/2026, 7:57:34 PM)
model: "mistralai/ministral-3b-2512"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 226
moderation_latency: (null)
generation_time: 1951
tokens_prompt: 77
tokens_completion: 639
native_tokens_prompt: 69
native_tokens_completion: 588
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0000657
router: (null)
provider_responses:
0:
endpoint_id: "b9f6828f-aef9-498a-91f5-6816fbf72420"
id: "1de6af9b60ff48169d84d3f85df9732e"
is_byok: false
latency: 226
model_permaslug: "mistralai/ministral-3b-2512"
provider_name: "Mistral"
status: 200
api_type: "completions"
id: "gen-1771617454-5JbV0JY5uFItd9BMD44A"
upstream_id: "1de6af9b60ff48169d84d3f85df9732e"
total_cost: 0.0000657
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Mistral"
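The timing and token fields in the metadata appear consistent with the Duration and Cost shown under Run Details. The sketch below checks that arithmetic. It assumes `latency` and `generation_time` are in milliseconds and that cost is billed on native tokens at one flat rate. The rate of $0.10 per million tokens is inferred from this run's numbers and is not stated anywhere on the page.

```python
# Values copied from the run metadata.
latency_ms = 226            # assumption: milliseconds
generation_time_ms = 1951   # assumption: milliseconds
native_tokens_prompt = 69
native_tokens_completion = 588
total_cost = 0.0000657

# Assumption: flat price per million native tokens, inferred from this run only.
ASSUMED_USD_PER_MILLION_TOKENS = 0.10

# Duration as latency plus generation time, converted to seconds.
duration_s = (latency_ms + generation_time_ms) / 1000

# Cost estimate from the native token counts at the assumed rate.
native_tokens = native_tokens_prompt + native_tokens_completion
estimated_cost = native_tokens * ASSUMED_USD_PER_MILLION_TOKENS / 1_000_000

# Completion throughput over the generation window.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"duration: {duration_s:.1f}s")
print(f"estimated cost: ${estimated_cost:.7f} (reported: ${total_cost:.7f})")
print(f"throughput: {tokens_per_second:.0f} tokens/s")
```

Under these assumptions the estimate matches the reported `total_cost`, and the summed timings round to the 2.2s duration.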
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 0.00% | Matches word count | n/a | neededClean: false, words: 362 |
| 87.17% | Dialogue to Total Word Ratio | Ratio: 48.34%, Deviation: 1.66% | neededClean: false, wordsTotal: 362, wordsDialogue: 175 |
| 43.5829% | Overall | | |
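The ratio and deviation reported for the dialogue evaluator follow directly from the two word counts. The sketch below reproduces them. It assumes the target ratio is 0.5 (from the "50% dialogue" scenario) and that the overall figure is the unweighted mean of the two evaluator scores. That second assumption is consistent with the reported numbers but is not stated on the page. The page does not say how the deviation maps to the 87.17% score, so that step is not reproduced.

```python
# Word counts copied from the evaluation metadata.
words_total = 362
words_dialogue = 175
target_ratio = 0.5  # assumption: taken from the "50% dialogue" scenario

# Share of words inside dialogue, and its distance from the target.
ratio = words_dialogue / words_total
deviation = abs(target_ratio - ratio)

# Reported evaluator scores, in percent.
scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 87.17,
}

# Assumption: overall is the plain average of the evaluator scores.
overall = sum(scores.values()) / len(scores)

print(f"ratio: {ratio * 100:.2f}%")
print(f"deviation: {deviation * 100:.2f}%")
print(f"overall (from rounded scores): {overall:.3f}%")
```

The averaged value lands within rounding distance of the reported 43.5829%, which suggests the page averages the unrounded evaluator scores.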