NC Bench
Run: 2026-02-20-01KHXPGZEAZYQ3T8SWXYYP1ZH3
Run Details
Model: Mistral Large 3
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 17.2s
Cost: $0.001302
Provider: Mistral
Evaluation summary
Score      Evaluator
0.00%      Matches word count
51.50%     Dialogue to Total Word Ratio
25.7493%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the glittery unicorn horn strapped to her forehead and squinted at the inflatable bounce house sagging under the weight of a dozen sugar-crazed five-year-olds. "You’re sure this is the target? Because I’m pretty sure we just robbed a daycare by accident." Bill, resplendent in a dinosaur onesie that made him look like a rejected *Jurassic Park* extra, scratched his chin. "Positive. The invite said ‘Timmy’s Magical Unicorn Extravaganza.’ That’s code for ‘rich parents, expensive gifts.’" "Code? That’s just a kid’s birthday party." "Exactly. No one suspects the guy in the T-Rex costume. It’s the perfect cover." Molly sighed, watching a toddler hurl a cupcake at a clown. "I don’t know, Bill. This feels wrong. We’re about to steal from a bunch of kids. That’s, like, *supervillain* territory." Bill waved a dismissive hand. "Please. We’re not stealing from *kids*. We’re stealing from *parents*. Big difference. Those people drop two hundred bucks on a cake shaped like a dump truck. They won’t even notice the missing presents." "Yeah, but—" Molly flinched as a rogue Nerf dart whizzed past her ear. "What if one of them *does* notice? What if Timmy’s dad is, like, a black belt in karate or something?" Bill scoffed. "Timmy’s dad is wearing a fanny pack and arguing with the bartender about the difference between ‘craft beer’ and ‘regular beer.’ He’s not a threat. Besides, we’re not hurting anyone. We’re just… relocating some high-end toys." Molly groaned. "You make it sound so noble. ‘Oh, we’re just Robin Hood, but for action figures.’" "Exactly! And think about it—those kids are gonna get *way* better presents next year. This is practically a public service." A harried-looking mom shoved a goody bag into Molly’s hands. "Here, sweetie, make sure you get one before they run out!" Molly blinked at the bag, then at Bill. "Uh. Thanks?" Bill snatched the bag and peeked inside. "Jackpot. Mini slime kits. These things retail for, like, fifteen bucks each." 
Molly rolled her eyes. "We’re not stealing *slime*, Bill." "Why not? It’s *liquid gold*. Literally. Kids lose their minds over this stuff." He stuffed the bag into his onesie pouch. "See? We’re already in the black." Molly pinched the bridge of her nose. "We haven’t even *taken* anything yet. And how are we supposed to carry out, like, a dozen wrapped presents without looking suspicious?" Bill grinned. "Easy. We tell them we’re the ‘present fairies.’" Molly stared at him. "The *what* now?" "You know, like the tooth fairy, but for birthdays. We show up, wave a magic wand—" He pulled a glittery stick from his pocket. "—and *poof*, the presents vanish. Parents eat that stuff up. They’ll think it’s part of the entertainment." Molly opened her mouth, then closed it. "That’s… actually not the worst idea you’ve ever had." Bill clapped his hands. "I *knew* you’d come around. Now, phase one: blend in. Phase two: locate the gift table. Phase three: *abracadabra*, we’re rich." Molly adjusted her unicorn horn again. "I can’t believe I let you talk me into this." Bill slung an arm around her shoulders. "That’s the spirit. Now, let’s go steal some Legos before the piñata explodes." Molly sighed. "I hate my life."
created_at: "2026-02-20T14:16:11.48Z" (2/20/2026, 2:16:11 PM)
model: "mistralai/mistral-large-2512"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 93
moderation_latency: (null)
generation_time: 17055
tokens_prompt: 77
tokens_completion: 881
native_tokens_prompt: 69
native_tokens_completion: 845
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.001302
router: (null)
provider_responses:
  0:
    endpoint_id: "fa718841-07ea-4b7e-b8a3-6069545c7e6a"
    id: "3a75cc417d3e4939a8edc4db8b31085e"
    is_byok: false
    latency: 92
    model_permaslug: "mistralai/mistral-large-2512"
    provider_name: "Mistral"
    status: 200
api_type: "completions"
id: "gen-1771596971-OINgGDIy4Z4q7HqI4sPq"
upstream_id: "3a75cc417d3e4939a8edc4db8b31085e"
total_cost: 0.001302
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Mistral"
Evaluation details

Result: 0.00%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false; words: 538

Result: 51.50%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 66.36%, Deviation: 3.64%
Meta Data: neededClean: false; wordsTotal: 538; wordsDialogue: 357

Overall: 25.7493%
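
The ratio and deviation reported above follow from the two word counts (357 dialogue words out of 538 total, against the 70% target in the prompt). The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. It is not NC Bench's evaluator: the function names, the use of straight double quotes as dialogue delimiters, and whitespace-based word splitting are all assumptions made for illustration, and the mapping from deviation to the 51.50% score is not reproduced because the page does not state it.

```python
import re


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words).

    Assumption: dialogue is whatever sits between straight double quotes,
    and a "word" is any whitespace-separated token.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return dialogue, total


def ratio_and_deviation(dialogue: int, total: int, target_pct: float = 70.0) -> tuple[float, float]:
    """Return (ratio_pct, deviation_pct), both rounded to two decimals."""
    ratio_pct = 100.0 * dialogue / total
    return round(ratio_pct, 2), round(abs(target_pct - ratio_pct), 2)


# Counts taken from the evaluation metadata above.
print(ratio_and_deviation(357, 538))  # (66.36, 3.64)

# Tiny illustrative input for the hypothetical counter.
print(count_dialogue_words('Molly sighed. "I hate my life."'))  # (4, 6)
```

The overall figure of 25.7493% is consistent with an unweighted mean of the two evaluator scores (0.00% and roughly 51.50% before rounding), though the page does not say how it is aggregated.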