NC Bench
Run: 2026-03-18-01KM08NC9BAEXHN13NSF5P6T7N
Run Details
Model: Mistral Small 4
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 8.9s
Cost: $0.000428
Provider: Mistral
Evaluation summary
Score | Evaluator
60.28% | Matches word count
1.05% | Dialogue to Total Word Ratio
30.6610% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly slammed the car door shut with her elbow, juggling two coffees and a greasy bag of donuts. "Alright, Bill, I got us sugar and caffeine. Now we just need brains and cunning." Bill squinted at the glowing house across the street where a dozen kids in dinosaur costumes were running around screaming. A piñata shaped like a unicorn hung from a tree, looking suspiciously like a vault with candy instead of gold. "Brains? I had those yesterday. Cunning? That’s your department." Molly took a sip, scalding her tongue. "Ow. Damn it. Okay, new plan. We go in there like we’re parents. You, the clueless dad. Me, the frazzled mom who just wants to ‘mingle.’" Bill adjusted his tie—the only one he owned. "I play a clueless dad? I can do clueless. I once tried to parallel park for twenty minutes in front of a cop." "You’ll be fine," Molly said. "Just don’t mention the time you got arrested for ‘borrowing’ a golf cart in college." Bill’s face went pale. "You wouldn’t." "Try me. Now, once we’re inside, you create a distraction—knock over that unicorn piñata." Bill gasped. "That piñata is a child’s dream!" "Exactly. It’s guarded by emotion. Emotion is weak. Look at these cars—BMWs, Teslas, a minivan with a baby on the dash. People here have money. And money means presents. Presents mean stuff we can fence." Bill rubbed his temples. "I can’t believe I’m about to be an agent of chaos at a child’s birthday party." Molly tossed him a toy dinosaur costume from the back seat. "Here. Wear this. Instant credibility." Bill stared at the tiny orange T-rex suit. "I’m six-foot-four. I will look ridiculous." "You’ll look like a man who loves dinosaurs. Trust me, no one questions a man in a T-rex suit." She grabbed her purse, which was suspiciously shaped like a bowling ball. "Come on, Bill. We break in, we fill this bad boy with loot, and we’re out before the cake is even cut." Bill sighed as he squeezed into the costume. 
His knees were at his chest, and his tail kept whacking Molly in the face every time she turned. "This is the worst plan you’ve ever had." Molly adjusted her sunglasses—despite it being 7 PM. "No, remember that time you tried to rob the bakery wearing a chef’s hat? This is subtle." They started across the street. A kid in a shark costume pointed and screamed, "Mom! There’s a T-rex and a lady with sunglasses at night!" Molly waved. "Hi, honey! Fun party!" Bill whispered through his mask, which smelled like old gym socks. "We are so getting caught." Molly gripped her bowling bag purse tighter. "Not if you knock over that piñata like a pro. Then we disappear into the night with enough toys to open our own black-market daycare." Bill tripped over his own tail. "I hate this plan."
created_at
"2026-03-18T10:43:02.584Z"
3/18/2026, 10:43:02 AM
model
"mistralai/mistral-small-2603"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
363
moderation_latency
(null)
generation_time
8732
tokens_prompt
77
tokens_completion
735
native_tokens_prompt
81
native_tokens_completion
693
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00042795
router
(null)
provider_responses
0
endpoint_id
"f5ed6283-4e17-457d-98cd-2ec0c318171d"
id
"c841dd9440b84bcdb049908c331fa451"
is_byok
false
latency
363
model_permaslug
"mistralai/mistral-small-2603"
provider_name
"Mistral"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773830582-9JVEO4tcKRZ51PknEPMa"
upstream_id
"c841dd9440b84bcdb049908c331fa451"
total_cost
0.00042795
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Mistral"
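As a rough cross-check of the metadata above, the sketch below derives completion throughput and the rounded cost shown under Run Details. It assumes that generation_time is in milliseconds and that usage is in US dollars. Neither unit is stated on this page, and the variable names are illustrative only.

```python
# Values copied from the run metadata above.
# Units are assumptions: milliseconds for time, US dollars for usage.
generation_time_ms = 8732
native_tokens_completion = 693
usage_usd = 0.00042795

# Completion tokens produced per second of generation time.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Round the raw usage figure to six decimals, as the Cost field appears to do.
displayed_cost = f"${usage_usd:.6f}"

print(f"throughput: {tokens_per_second:.1f} tokens/s")
print(f"cost as displayed: {displayed_cost}")
```

Under those assumptions the run comes out at roughly 79 completion tokens per second, and the rounded usage figure matches the Cost value listed in the run details.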
Evaluation details
Result | Evaluator | Details | Meta Data
60.28% | Matches word count | n/a | neededClean: false, words: 485
1.05% | Dialogue to Total Word Ratio | Ratio: 60.45%, Deviation: 9.55% | neededClean: false, wordsTotal: 488, wordsDialogue: 295
30.6610% | Overall
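The ratio and deviation figures above can be reproduced from the reported word counts. The sketch below shows that arithmetic, plus a simple way such a ratio might be measured on raw text. The function names, the quote-matching rule (text between straight double quotes counts as dialogue), and the idea that the overall score is the unweighted mean of the two evaluator scores are all assumptions for illustration. The benchmark's actual evaluator and scoring curve are not shown on this page.

```python
import re

TARGET_DIALOGUE_PCT = 70.0  # from the scenario: "70% dialogue"


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a piece of prose.

    Hypothetical counter: any whitespace-separated word inside a pair of
    straight double quotes is treated as dialogue.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation_from_target(ratio_pct: float, target_pct: float = TARGET_DIALOGUE_PCT) -> float:
    """Absolute distance, in percentage points, between a ratio and the target."""
    return abs(target_pct - ratio_pct)


# Reproduce the reported figures from wordsDialogue and wordsTotal.
reported_ratio_pct = 295 / 488 * 100
print(round(reported_ratio_pct, 2))                         # 60.45
print(round(deviation_from_target(reported_ratio_pct), 2))  # 9.55

# Assumption: the overall score is the plain mean of the two evaluator scores.
overall_estimate = (60.28 + 1.05) / 2
print(f"{overall_estimate:.3f}")                            # 30.665
```

The estimate of 30.665 is within a few thousandths of the displayed 30.6610%, which is consistent with averaging unrounded scores, though the page does not confirm the formula. A quick use of the counter on a short sample: `dialogue_ratio('Molly waved. "Hi, honey! Fun party!" Bill sighed.')` returns `(4, 8, 0.5)`.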