NC Bench
Run: 2026-03-18-01KM083R049D4EK2F06G5JT188
Run Details
Model: Mistral Small 4
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 3.5s
Cost: $0.000209
Provider: Mistral
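As a rough cross-check on the duration, the Meta fields recorded for this run (generation_time 3383 and native_tokens_completion 331) can be turned into a throughput estimate. This is only a sketch: it assumes generation_time is in milliseconds, which the page does not state, and the variable names are illustrative.

```python
# Values copied from the Meta fields of this run.
generation_time_ms = 3383          # assumption: unit is milliseconds
native_tokens_completion = 331     # completion tokens as counted by the provider

# Convert to seconds, then divide tokens by elapsed generation time.
generation_time_s = generation_time_ms / 1000
tokens_per_second = native_tokens_completion / generation_time_s

print(f"{tokens_per_second:.1f} tokens/s")
```

Under that unit assumption, the run works out to a little under 100 completion tokens per second.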
Evaluation summary
Score | Evaluator | Details Link
60.28% | Matches word count | (details)
97.76% | Dialogue to Total Word Ratio | (details)
79.0157% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The neon glow of a birthday sign flickered above the bakery-turned-party-venue. Molly adjusted her cat-ears, squinting through the window at a piñata shaped like a unicorn. "You sure this is the one?" she whispered, tapping the glass. Bill, crouched beside her, shoved a lockpick into her palm. "Positive. Mrs. Henderson’s kid’s seven today. According to their Instagram, presents are already under that rainbow table." Molly groaned, eyeing the inflatable T-Rex guarding the back door. "That thing’s gonna be a problem. You got the whoopee cushions?" "Two dozen." Bill grinned, pulling a fluorescent green backpack from the shadows. "Once we plant those, nobody’s sitting still long enough to notice us." "Smooth." Molly stuffed the pick into her sleeve. "But what about the parent? I saw at least one adult inside." Bill’s grin faltered. "Shit, I forgot about them." He scratched his temple. "Right. Distraction. I’ll pose as a clown." Molly’s eye twitched. "That’s the worst idea you’ve ever had." "Relax, I’ll wear pants." He adjusted his stolen security badge. "You just get to the table." Molly sighed, watching a toddler smash cake onto a nearby face. "Fine. But if I get glitter in my bad ankle boots, I’m throwing you to the T-Rex." Bill winked. "Deal. Now let’s make this the most expensive seventh birthday ever."
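The "Dialogue to Total Word Ratio" evaluator scores a response like the one above by the share of its words that are spoken. The page does not show how the evaluator tokenizes text, so the following is a minimal sketch under two assumptions: dialogue is any text between a pair of double quotes, and words are whitespace-separated. The function name `dialogue_ratio` is hypothetical, and its counts need not match the evaluator's reported figures exactly.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a prose passage."""
    # Assumption: anything between straight or curly double quotes is dialogue.
    spoken = re.findall(r'["“]([^"”]*)["”]', text)
    dialogue_words = sum(len(segment.split()) for segment in spoken)
    # Assumption: a "word" is any whitespace-separated token.
    total_words = len(text.split())
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


sample = 'Molly sighed. "Fine. But hurry," she said.'
dialogue_words, total_words, ratio = dialogue_ratio(sample)
print(dialogue_words, total_words, round(ratio, 4))
```

Different choices here (how hyphenated words, contractions, or punctuation-only tokens are counted) shift the totals slightly, which may explain why the two evaluators on this page report different word counts for the same response.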
created_at: "2026-03-18T10:33:24.754Z" (3/18/2026, 10:33:24 AM)
model: "mistralai/mistral-small-2603"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 476
moderation_latency: (null)
generation_time: 3383
tokens_prompt: 77
tokens_completion: 357
native_tokens_prompt: 81
native_tokens_completion: 331
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 16
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00020859
router: (null)
provider_responses:
  0:
    endpoint_id: "f5ed6283-4e17-457d-98cd-2ec0c318171d"
    id: "2ff0dfef636d4920a80d3ae6984028c0"
    is_byok: false
    latency: 476
    model_permaslug: "mistralai/mistral-small-2603"
    provider_name: "Mistral"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773830004-6TUu2uRFaHHONfN1VsCM"
upstream_id: "2ff0dfef636d4920a80d3ae6984028c0"
total_cost: 0.00020859
cache_discount: 0.00000216
upstream_inference_cost: 0
provider_name: "Mistral"
Evaluation details
Result | Evaluator | Details | Meta Data
60.28% | Matches word count | n/a | neededClean: false, words: 215
97.76% | Dialogue to Total Word Ratio | Ratio: 53.88%, Deviation: 3.88% | neededClean: false, wordsTotal: 219, wordsDialogue: 118
79.0157% | Overall
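The ratio and deviation reported for the dialogue evaluator follow directly from its two word counts and the 50% target in the scenario. The arithmetic is sketched below. The last step, treating the overall score as the unweighted mean of the two evaluator scores, is an assumption: it is consistent with the displayed 79.0157% but is not stated on the page, and the small difference from the result here comes from using the rounded per-evaluator percentages.

```python
# Counts reported by the "Dialogue to Total Word Ratio" evaluator.
words_total = 219
words_dialogue = 118
target_ratio = 0.50  # the scenario asks for 50% dialogue

ratio = words_dialogue / words_total
deviation = abs(ratio - target_ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Assumption: the overall score is the plain average of the evaluator scores.
scores = [60.28, 97.76]  # rounded values as displayed on the page
overall = sum(scores) / len(scores)
print(f"Overall (from rounded scores): {overall:.2f}%")
```

The first line printed reproduces the evaluator's own "Ratio: 53.88%, Deviation: 3.88%" detail string.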