NC Bench
Run: 2026-02-20-01KHXPNAV718R7TWZ64NX3P1WX
Run Details
Model: Mistral Medium 3.1
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 6.6s
Cost: $0.000770
Provider: Mistral
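The reported cost can be cross-checked against the native token counts listed in the run metadata further down (69 prompt tokens, 371 completion tokens). The sketch below is only an illustration: the per-million-token rates are an assumption, not stated anywhere on this page, chosen because they reproduce the reported `usage` value of 0.0007696. The function name `estimate_cost` is hypothetical, and `generation_time` is assumed to be in milliseconds.

```python
# Values copied from the run metadata on this page.
NATIVE_TOKENS_PROMPT = 69
NATIVE_TOKENS_COMPLETION = 371
GENERATION_TIME_MS = 6538  # ASSUMPTION: the unit is milliseconds

# ASSUMPTION: USD per one million tokens; these rates are not given on this page.
ASSUMED_INPUT_USD_PER_M = 0.40
ASSUMED_OUTPUT_USD_PER_M = 2.00


def estimate_cost(prompt_tokens, completion_tokens, input_rate, output_rate):
    """Return an estimated cost in USD from token counts and per-million rates."""
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


cost = estimate_cost(
    NATIVE_TOKENS_PROMPT,
    NATIVE_TOKENS_COMPLETION,
    ASSUMED_INPUT_USD_PER_M,
    ASSUMED_OUTPUT_USD_PER_M,
)
# Completion tokens produced per second of generation time.
tokens_per_second = NATIVE_TOKENS_COMPLETION / (GENERATION_TIME_MS / 1000)

print(f"estimated cost: ${cost:.7f}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumed rates the estimate agrees with the reported cost, which rounds to the $0.000770 shown above.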
Evaluation summary
| Score | Evaluator |
|---|---|
| 0.00% | Matches word count |
| 0.00% | Dialogue to Total Word Ratio |
| 0.0000% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The rainbow-bright party hats bobbed like buoys in a sea of sugar-high chaos as Molly crouched behind a suspiciously sticky hedge, adjusting her fake name tag—*"Aunt Carol (Allergies: Glitter)"*. Beside her, Bill fidgeted with a balloon animal he’d "borrowed" from a wailing toddler, twisting it into something that vaguely resembled a giraffe with a gambling problem. “You realize,” Molly muttered, eyeing the piñata shaped like a unicorn mid-seizure, “that we’re about to commit felony larceny against a room full of children who still believe in the Easter Bunny.” Bill squinted at the gift table, a mountain of wrapping paper and ribbon guarded only by a single mom sipping wine from a juice box. “Technically, we’re *redistributing* joy. Robin Hood had a code, Carol.” “My name is *Molly* when we’re not in character,” she hissed, swatting his hand away from a half-eaten cupcake. “And Robin Hood didn’t have to explain to a six-year-old why Santa’s little helpers just stole her My Little Pony playset.” Bill grinned, tucking the balloon giraffe into his waistband like a holster. “Kid’ll think it was magic. Now, you distract the mom with your ‘nut allergy’ sob story, and I’ll—” “—get us arrested before the cake’s even cut,” Molly finished, watching as a child nearby face-planted into a bowl of punch. “God, we’re the worst.” She adjusted her name tag again. “Fine. But if we go to jail, I’m telling everyone you cried during *Frozen*.”
created_at: "2026-02-20T14:18:34.226Z" (2/20/2026, 2:18:34 PM)
model: "mistralai/mistral-medium-3.1"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 509
moderation_latency: (null)
generation_time: 6538
tokens_prompt: 77
tokens_completion: 397
native_tokens_prompt: 69
native_tokens_completion: 371
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0007696
router: (null)
provider_responses:
  0:
    endpoint_id: "13c5002b-8c5b-490c-8a5f-52ec0242804f"
    id: "d97a8eeb134a4862923b75979d7df1dd"
    is_byok: false
    latency: 509
    model_permaslug: "mistralai/mistral-medium-3.1"
    provider_name: "Mistral"
    status: 200
api_type: "completions"
id: "gen-1771597114-4iM6nHWRRLV6jFuw8GmF"
upstream_id: "d97a8eeb134a4862923b75979d7df1dd"
total_cost: 0.0007696
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Mistral"
Evaluation details
| Result | Evaluator | Details | Meta Data |
|---|---|---|---|
| 0.00% | Matches word count | n/a | neededClean: false; words: 245 |
| 0.00% | Dialogue to Total Word Ratio | Ratio: 44.90%, Deviation: 34.90% | neededClean: false; wordsTotal: 245; wordsDialogue: 110 |
| 0.0000% | Overall | | |
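The ratio and deviation figures above follow directly from the reported counts (110 dialogue words out of 245, against a 10% target). Here is a minimal sketch of that arithmetic, assuming the deviation is the absolute difference in percentage points. The helper names are hypothetical, and the quote-matching rule in `count_dialogue_words` is an assumption, not the evaluator's actual implementation, so it may not reproduce the evaluator's counts on the full response.

```python
import re


def count_words(text):
    """Count whitespace-separated tokens."""
    return len(text.split())


def count_dialogue_words(text):
    """Count words inside curly double quotes.

    ASSUMPTION: only text between “ and ” is treated as dialogue.
    """
    spans = re.findall(r"“([^”]*)”", text)
    return sum(len(span.split()) for span in spans)


def dialogue_ratio(words_dialogue, words_total, target_pct):
    """Return (ratio, deviation), both in percent / percentage points."""
    ratio_pct = 100.0 * words_dialogue / words_total
    deviation_pct = abs(ratio_pct - target_pct)
    return ratio_pct, deviation_pct


# Counts taken from the Meta Data column above; target from the scenario.
ratio, deviation = dialogue_ratio(110, 245, 10.0)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 44.90%, Deviation: 34.90%"
```

On a short sample such as `“Go now,” she said. He nodded.`, `count_dialogue_words` returns 2 and `count_words` returns 6. How the evaluator maps a deviation to a score is not shown on this page, so that step is left out.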