NC Bench
Run: 2026-02-20-01KHXPDRZCN2GYMDDGRB0HW3HF
Run Details
Model: Mistral Medium 3.1
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 13.4s
Cost: $0.001596
Provider: Mistral
Evaluation summary
Score      Evaluator
99.98%     Matches word count
0.00%      Dialogue to Total Word Ratio
49.9920%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The neon glow of the *Happy Birthday, Timmy!* sign flickered through the miniblinds of the abandoned ice cream truck where Molly and Bill crouched over a blueprint of the McAllister mansion. The blueprint was actually a crayon drawing Molly had swiped from Timmy’s little sister during a “casual” playground reconnaissance mission, but it was the best they had. Molly tapped a chipped nail against the sketch where a lopsided balloon arch marked the gift table. “Alright, genius, talk me through this again. We’re hitting a six-year-old’s birthday party. Not a bank. Not a museum. A *bounce house adjacent* crime scene.” Bill adjusted his fake mustache—a choice Molly had vetoed three times already—for the seventh time in as many minutes. “It’s *perfect*. No security. No alarms. Just a bunch of sugar-crazed kids and parents too busy Instagramming the cupcake tower to notice two caterers ‘accidentally’ walking out with a truckload of LEGOs and iPads.” “Caterers?” Molly arched an eyebrow. “We look like caterers?” “We *will* after I put on this apron.” Bill flourished a stained white apron that read *Kiss the Cook (But Not Really)* in peeling letters. “And you’ll be my assistant. The *sexy* assistant.” Molly snatched the apron and hurled it into a puddle of melted rainbow sherbet on the truck floor. “I’d rather get caught. Also, we’re stealing from *children*.” “Correction: we’re *redistributing* from children who already have too much stuff to children who—okay, fine, to *us*.” Bill rubbed his temples. “Look, the McAllisters insured these gifts for, like, five grand. They’ll get a check, we get a payday, Timmy gets a lifetime of therapy. Everyone wins.” Molly sighed, flipping open a switchblade to pick at her cuticles. “What’s the play?” “Phase one: we show up as ‘Magical Balloon Artists.’” Bill air-quoted with jazz hands. “Phase two: I ‘accidentally’ pop all the balloons—distraction. 
Phase three: you swap the real presents with the fakes we packed.” He patted a duffel bag full of bricks wrapped in *Frozen 2* paper. “Fakes?” Molly’s knife paused mid-pick. “You *wrapped bricks*?” “Not *just* bricks! I put a single Jolly Rancher in each box. For *authenticity*.” Molly stared at him. “You’re a monster.” “A *prepared* monster.” Bill checked his watch. “We’ve got twenty minutes before the piñata massacre. You in?” Molly exhaled, then grabbed a spare clown nose from the dashboard and jammed it onto her face. “Fine. But if I have to sing ‘Baby Shark’ to sell this, I’m turning us both in.” Bill grinned, slinging the duffel over his shoulder. “Deal. But if we get caught, you’re the one who insisted on the clown nose. That’s *not* in the blueprint.” Molly adjusted the nose with a scowl. “It’s *tactical*. No one suspects a clown.” “No one *likes* a clown, either.” They climbed out of the truck, the distant sound of “Happy Birthday” warbling through the suburban air like a siren song for the morally flexible. Molly cracked her knuckles. “Alright, partner. Let’s go ruin childhood.”
created_at: "2026-02-20T14:14:26.551Z" (2/20/2026, 2:14:26 PM)
model: "mistralai/mistral-medium-3.1"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 261
moderation_latency: (null)
generation_time: 13334
tokens_prompt: 77
tokens_completion: 844
native_tokens_prompt: 69
native_tokens_completion: 784
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0015956
router: (null)
provider_responses: 0
endpoint_id: "13c5002b-8c5b-490c-8a5f-52ec0242804f"
id: "557cb7f890094f7bb54c8d5669cca595"
is_byok: false
latency: 261
model_permaslug: "mistralai/mistral-medium-3.1"
provider_name: "Mistral"
status: 200
api_type: "completions"
id: "gen-1771596866-3Wjt5opNK9OLLMZctH0x"
upstream_id: "557cb7f890094f7bb54c8d5669cca595"
total_cost: 0.0015956
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Mistral"
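The reported total_cost can be sanity-checked against the native token counts. The sketch below is illustrative only: the per-million-token rates are assumptions that do not appear on this page (they are used here because they happen to reproduce the reported figure), and estimate_cost is a hypothetical helper, not part of any benchmark or provider API.

```python
# Assumed rates in USD per one million tokens. These values are NOT stated
# on this page; treat them as placeholders for the provider's real pricing.
ASSUMED_PROMPT_RATE = 0.40
ASSUMED_COMPLETION_RATE = 2.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate request cost in USD from token counts and the assumed rates."""
    micro_usd = (
        prompt_tokens * ASSUMED_PROMPT_RATE
        + completion_tokens * ASSUMED_COMPLETION_RATE
    )
    return micro_usd / 1_000_000


# native_tokens_prompt = 69 and native_tokens_completion = 784 on this run.
estimated = estimate_cost(69, 784)
print(f"estimated cost: ${estimated:.7f}")
```

Under these assumed rates the estimate lines up with the total_cost value shown above; with different rates it would not, so the check says nothing about how the provider actually bills.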
Evaluation details
Result: 99.98%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 502

Result: 0.00%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 52.19%, Deviation: 22.19%
Meta Data: neededClean: false, wordsTotal: 502, wordsDialogue: 262

Overall: 49.9920%
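The figures in the evaluation details can be reproduced with a few lines of arithmetic. This is a minimal sketch, not the benchmark's own evaluator: count_dialogue_words assumes dialogue is any text between curly double quotes, and the overall score is assumed to be the plain mean of the two evaluator scores, which is consistent with the numbers shown but is not documented on this page. How a deviation is mapped to a 0-100% score is not shown either, so that step is left out.

```python
import re

TARGET_RATIO = 30.0  # percent, from the scenario "Write 500 words with 30% dialogue"


def count_words(text: str) -> int:
    """Count whitespace-delimited tokens (assumed tokenization)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside curly double quotes (assumed definition of dialogue)."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in spans)


def ratio_percent(words_dialogue: int, words_total: int) -> float:
    """Dialogue share of the total word count, in percent."""
    return 100.0 * words_dialogue / words_total


# Values taken from the Meta Data above.
ratio = ratio_percent(262, 502)
deviation = abs(ratio - TARGET_RATIO)

# Assumed aggregation: arithmetic mean of the two displayed evaluator scores.
overall = (99.98 + 0.00) / 2

print(f"ratio={ratio:.2f}% deviation={deviation:.2f}% overall={overall:.2f}%")

# Small usage example for the hypothetical counter.
sample = "Molly sighed. “What’s the play?” Bill grinned. “Phase one.”"
print(count_dialogue_words(sample), "of", count_words(sample), "words are dialogue")
```

Note that a quote-based counter like this one would also count scare-quoted words in narration as dialogue, which is one reason a measured ratio can differ from what a human reader would count.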