NC Bench
Run: 2026-02-20-01KHXPJN2BCN2RPBXQ07YXM4Q5
Run Details
Model: Mistral Medium 3.1
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 18.0s
Cost: $0.001724
Provider: Mistral
Evaluation summary
Score | Evaluator
0.00% | Matches word count
14.24% | Dialogue to Total Word Ratio
7.1222% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The dim glow of a flickering streetlamp cast long shadows across the alley where Molly and Bill huddled over a crumpled party invitation. Molly adjusted her oversized sunglasses—indoors, at night—while Bill fiddled with a party popper like it was a detonator. “Alright, let’s run this one more time,” Molly said, tapping the invitation. “We’re crashing a six-year-old’s birthday party. Not a bank. Not a museum. A *kids’ party*.” Bill grinned. “Low risk, high reward. These rich parents drop hundreds on LEGO sets and iPads like it’s nothing. And security? A clown with a balloon animal addiction.” “A clown,” Molly repeated flatly. “You’re telling me our biggest obstacle is a guy who can’t tie his own shoes without making a giraffe?” “Exactly! And if he *does* get suspicious, we just tell him we’re the magicians. Boom. Instant cover.” Molly rubbed her temples. “We’re not magicians, Bill. I can’t even do a card trick without setting something on fire.” “Details.” Bill waved a hand. “Look, Phase One: We show up fashionably late, blend in with the parents. You wear the sundress, I’ll rock the dad jeans and a ‘World’s Okayest Golfer’ cap. Classic.” “You own dad jeans?” “Borrowed them from my *actual* dad. He won’t notice. He’s got, like, twelve pairs.” Molly sighed. “Phase Two?” “We hit the present table during musical chairs. Total chaos. Kids screaming, parents filming, no one notices us swapping the good stuff with the decoy gifts we brought.” “Decoy gifts?” Bill patted a duffel bag. “A bunch of rocks I spray-painted gold and a whoopee cushion shaped like a tablet. Kid opens it, gets a fart noise, parents laugh, we’re long gone.” Molly stared. “You put *that* much thought into the whoopee cushion?” “Artistry, Mol. It’s all about the details.” Bill leaned in. “Phase Three: The clown’s the getaway driver.” “The *clown*?” “Yeah! He’s got that tiny car. Perfect for a quick escape. No one suspects the guy with the rainbow wig.” Molly pinched the bridge of her nose. 
“Bill, we are *not* fitting in a clown car with a haul of stolen iPads.” “We’ll make it work! It’s like Tetris, but with more felonies.” A distant siren wailed. Molly stiffened. “Was that for us?” Bill peeked around the corner. “Nah, just some guy trying to parallel park a food truck. Relax.” Molly exhaled. “Alright, fine. But if we get caught, I’m telling the cops *you* were the mastermind.” Bill gasped, clutching his chest. “Et tu, Mol? After all we’ve been through?” “After the *last* heist, where you tried to rob a bakery by pretending to be a health inspector and ate all the evidence?” “That was a *flawless* plan until the gluten intolerance lie fell apart!” Molly checked her watch. “We’ve got twenty minutes till showtime. You got the fake RSVP cards?” Bill pulled out two glittery cards with scribbled names. “‘Mr. and Mrs. Jenkins.’” Molly read them. “It says ‘Mr. and Mrs. *Jenkinssss*.’ With four S’s.” “Adds mystique.” “It adds illiteracy.” Molly groaned. “Just… stick to the plan. In, out, no clown-related incidents.” Bill saluted. “No promises on the clown part.” Molly took a deep breath. “I hate my life.” Bill grinned. “But you love the whoopee cushion.” She didn’t deny it.
created_at
"2026-02-20T14:17:06.392Z"
2/20/2026, 2:17:06 PM
model
"mistralai/mistral-medium-3.1"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
260
moderation_latency
(null)
generation_time
17880
tokens_prompt
77
tokens_completion
936
native_tokens_prompt
69
native_tokens_completion
848
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0017236
router
(null)
provider_responses
0
endpoint_id
"13c5002b-8c5b-490c-8a5f-52ec0242804f"
id
"7e2e6603d3f542f9841e62511fda6793"
is_byok
false
latency
259
model_permaslug
"mistralai/mistral-medium-3.1"
provider_name
"Mistral"
status
200
api_type
"completions"
id
"gen-1771597026-tPAHLonzdjSePuGu5cCL"
upstream_id
"7e2e6603d3f542f9841e62511fda6793"
total_cost
0.0017236
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Mistral"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 543
14.24% | Dialogue to Total Word Ratio | Ratio: 76.24%, Deviation: 6.24% | neededClean: false, wordsTotal: 543, wordsDialogue: 414
7.1222% | Overall
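The ratio and deviation reported above can be reproduced from the word counts. The sketch below is illustrative only: it assumes dialogue is any text between curly double quotes, that a word is a whitespace-separated token, and that the deviation is the absolute distance from the 70% target. The function names are hypothetical and the actual NC Bench evaluator may count differently.

```python
import re

# Assumption: dialogue is text enclosed in curly double quotes, as in the
# model response above. The real evaluator's rules are not documented here.
DIALOGUE_SPAN = re.compile(r"“([^”]*)”")


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Sum the word counts of every quoted span in the text."""
    return sum(count_words(span) for span in DIALOGUE_SPAN.findall(text))


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation(ratio_pct: float, target_pct: float = 70.0) -> float:
    """Absolute distance from the target percentage (assumed definition)."""
    return abs(ratio_pct - target_pct)


# Word counts taken from the evaluation metadata: wordsDialogue and wordsTotal.
ratio = dialogue_ratio(414, 543)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation(ratio):.2f}%")
# prints "Ratio: 76.24%, Deviation: 6.24%"
```

The Overall figure of 7.1222% is consistent with a plain average of the two evaluator scores (0.00% and roughly 14.24%), although the page does not state how the overall score is aggregated.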