NC Bench
Run: 2026-02-20-01KHXQ9EFPZ504A910QPSG1ZS5
Run Details
Model: Mistral Large 3
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 14.8s
Cost: $0.001087
Provider: Mistral
Evaluation summary
| Score | Evaluator |
| --- | --- |
| 35.00% | Matches word count |
| 40.12% | Dialogue to Total Word Ratio |
| 37.5611% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the glittery unicorn horn strapped to her head and squinted at the inflatable bounce house sagging in the backyard. "You’re telling me we’re hitting a *kids’ party*? Like, actual toddlers are gonna be here in twenty minutes." Bill, already halfway into a rainbow-striped onesie, zipped up the front with a flourish. "Exactly. Who suspects the guy in the giant duck costume? Besides, these parents go *all out*—iPads, gift cards, probably a few uncut diamonds if we’re lucky." Molly plucked a juice box from the snack table and stabbed it with the straw. "We’re not lucky. Last time we were lucky, we walked out of a bank with a toaster and a coupon for free dry cleaning." "Details." Bill waved a hand. "Look, the security here is a guy named Dave who thinks ‘vigilance’ means remembering to refill the Cheetos bowl. We waltz in, we waltz out. Easy." Molly took a long sip of apple juice. "Easy for you. You’re the one who gets to be the duck. I’m stuck as the ‘fun aunt’ who ‘just loves kids.’ I don’t love kids. I don’t even like kids. Last week, a kid at the grocery store called me a ‘mean lady’ because I wouldn’t let him lick the freezer glass." Bill patted her shoulder. "That’s why you’re perfect. You’ve got the *resting mom face*. No one questions a woman who looks like she’s three seconds from a Xanax." Molly glared. "I do not have a resting mom face." "Babe, you once made a cop cry just by sighing." She threw the empty juice box at him. "Shut up. What’s the play here? We can’t just grab the presents and bolt. There’s, like, a piñata involved." Bill pulled a folded piece of paper from his onesie pocket. "Phase one: I distract with my *legendary* duck dance. Phase two: you ‘volunteer’ to help with the gift table. Phase three: we meet at the van with the loot." Molly stared. "That’s it? That’s your plan?" "Simplicity is genius." "Simplicity is how people get caught. What if some kid sees me stuffing a Nintendo Switch into my purse?" Bill grinned. 
"Then you tell them it’s a *very* realistic toy. Kids believe anything." Molly groaned. "I hate this. I hate *all* of this. Why couldn’t we rob a nice, quiet retirement home? Old people don’t even *notice* when their stuff goes missing." "Because old people don’t have PS5s, Molly. Priorities." She adjusted her unicorn horn again, wincing as the elastic dug into her scalp. "Fine. But if I get arrested, I’m telling the cops you made me wear this." Bill clapped his hands. "That’s the spirit! Now, let’s go steal from some preschoolers before they realize their presents are *way* nicer than their actual lives." Molly muttered under her breath as she followed him toward the party. "I need a new career. Maybe something in accounting."
created_at
"2026-02-20T14:29:33.313Z"
2/20/2026, 2:29:33 PM
model
"mistralai/mistral-large-2512"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
392
moderation_latency
(null)
generation_time
14787
tokens_prompt
77
tokens_completion
741
native_tokens_prompt
69
native_tokens_completion
702
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0010875
router
(null)
provider_responses
0
endpoint_id
"fa718841-07ea-4b7e-b8a3-6069545c7e6a"
id
"826e86391e584c409c15e587779b4753"
is_byok
false
latency
392
model_permaslug
"mistralai/mistral-large-2512"
provider_name
"Mistral"
status
200
api_type
"completions"
id
"gen-1771597773-54CIQN0jR10QSgIwzChI"
upstream_id
"826e86391e584c409c15e587779b4753"
total_cost
0.0010875
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Mistral"
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 35.00% | Matches word count | n/a | neededClean: false, words: 482 |
| 40.12% | Dialogue to Total Word Ratio | Ratio: 74.27%, Deviation: 4.27% | neededClean: false, wordsTotal: 482, wordsDialogue: 358 |
| 37.5611% | Overall | | |
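
The ratio and deviation reported above can be reproduced from the word counts. The sketch below is only an arithmetic check, not the benchmark's evaluator: the function names are hypothetical, the 70% target is taken from the scenario text, and treating the overall score as a plain average of the two evaluator scores is an assumption that happens to agree with the reported figure to two decimals. How each evaluator maps its measurement to a score (for example, 482 words to 35.00%) is not stated here and is not modeled.

```python
def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the response, as a percentage."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_pct(ratio_pct: float, target_pct: float = 70.0) -> float:
    """Absolute distance from the target ratio, in percentage points."""
    return abs(ratio_pct - target_pct)


# Values copied from the evaluation details of this run.
ratio = dialogue_ratio_pct(words_dialogue=358, words_total=482)
deviation = deviation_pct(ratio)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 74.27%, Deviation: 4.27%"

# Assumption: the overall score is the unweighted mean of the evaluator scores.
# With the rounded inputs shown on the page this gives 37.56, close to the
# reported 37.5611%, which suggests the page averages unrounded scores.
overall = (35.00 + 40.12) / 2
print(f"Overall (from rounded scores): {overall:.2f}%")
```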