NC Bench
Run: 2026-03-18-01KM0A48WX7YZQJ9N8CQ4FNM21
Run Details
Model: Mistral Small 4
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 6.6s
Cost: $0.000425
Provider: Mistral
Evaluation summary
Score      Evaluator
27.17%     Matches word count
0.60%      Dialogue to Total Word Ratio
13.8805%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the straps of her overalls with a sharp tug, her eyes locked on the distant glow of the birthday party lights through the trees. "You sure this is the place? It looks more like a bake sale at a retirement home than a kid’s party." Bill squinted, balancing a toolkit on his knee as he knelt beside her in the damp grass. "Trust me, Molly. I’ve scoped this party like a hawk scopes a field mouse. The parents are distracted by cake, the kids are hyper on sugar, and the presents? Just sitting there like lambs for the slaughter." Molly scoffed, kicking a rock loose with her boot. "Lambs? More like a minefield with glitter mines and police tape made of Play-Doh." Bill grinned, popping open the toolkit with a theatrical flourish. "That’s why I brought this." He held up a tiny remote control. "Motion sensor jammer. Deadens any squeaky toys, laser grids, or those infernal ‘Happy Birthday’ greeting cards that play music." Molly snatched the remote from his hand, turning it over skeptically. "And if it doesn’t work?" "Then we improvise," Bill said, already pulling on a pair of latex gloves. "Worst case, you distract the party with your classic ‘oops, I’m the clown’s long-lost cousin’ routine." Molly tossed the remote back at him. "I draw the line at face paint and a red nose. That’s where I draw the line." Bill zipped the toolkit shut with a satisfied click. "Fine, but if we get caught, I’m blaming you for the cyanide cupcakes you keep threatening to bring to our own parties." Molly waved him off, scanning the house again. "The presents are by the slide? Seriously?" "Yep. Parents think it’s cute. Keeps the little brats contained." Bill nudged her with his elbow. "You ever steal Christmas presents before?" Molly smirked. "Once. Took a Curious George stuffed animal from a five-year-old. Kid cried like I’d stolen his firstborn." She adjusted her cap. "Had to return it with an apology note and a new one. Name was embroidered on it already." 
Bill burst out laughing, then clapped a hand over his mouth when a floodlight flicked on in the backyard. The two froze, watching as a harried dad stumbled outside, juggling a beer and a tray of hot dogs. "Alright," Bill whispered. "Round one: Dad is now armed with condiments. Our move." Molly grabbed the toolkit, already loosening the cap on the jammer. "Condiments are the least of our worries. I’ve seen kids cause more damage with a juice box." Bill crept forward, then paused. "You know, if we’re successful, we should celebrate." Molly raised an eyebrow. "With pus—" "With actual cake," Bill interrupted. "Maybe a victory slice. We earned it." Molly considered it, then nodded. "Fine. But if there’s clown cake, I’m walking." Bill grinned. "Deal. No clown cake. Unless it’s a clown-shaped piñata. Then we take it."
created_at: "2026-03-18T11:08:39.211Z" (3/18/2026, 11:08:39 AM)
model: "mistralai/mistral-small-2603"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 352
moderation_latency: (null)
generation_time: 6490
tokens_prompt: 77
tokens_completion: 746
native_tokens_prompt: 81
native_tokens_completion: 688
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00042495
router: (null)
provider_responses:
  0:
    endpoint_id: "f5ed6283-4e17-457d-98cd-2ec0c318171d"
    id: "06d6e77fa2bf49ff850c941d976f9499"
    is_byok: false
    latency: 352
    model_permaslug: "mistralai/mistral-small-2603"
    provider_name: "Mistral"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773832119-hAEvYVNTEnmNU2U0bn90"
upstream_id: "06d6e77fa2bf49ff850c941d976f9499"
total_cost: 0.00042495
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Mistral"
Evaluation details
Matches word count: 27.17%
  Details: n/a
  Meta Data: neededClean = false, words = 481
Dialogue to Total Word Ratio: 0.60%
  Details: Ratio: 59.88%, Deviation: 10.12%
  Meta Data: neededClean = false, wordsTotal = 486, wordsDialogue = 291
Overall: 13.8805%
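The reported ratio and deviation can be reproduced from the word counts shown above (291 dialogue words out of 486 total, against the 70% target in the scenario). The following is a minimal sketch, not NC Bench's actual evaluator: the function names `dialogue_ratio` and `deviation_from_target` are hypothetical, and the text counter assumes dialogue is whatever sits between pairs of straight double quotes. How the evaluator turns a 10.12% deviation into a 0.60% score is not stated on this page, so that step is left out.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a passage.

    Assumption: dialogue is any text enclosed in straight double quotes.
    Words are whitespace-separated tokens.
    """
    total = len(text.split())
    spoken = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(segment.split()) for segment in spoken)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation_from_target(ratio: float, target: float = 0.70) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Figures reported in the evaluation details on this page.
reported_ratio = 291 / 486
print(f"Ratio: {reported_ratio:.2%}")                             # Ratio: 59.88%
print(f"Deviation: {deviation_from_target(reported_ratio):.2%}")  # Deviation: 10.12%
```

Running `dialogue_ratio` on the raw response may give slightly different counts than the page reports, since the evaluator's tokenization and quote handling are not documented here.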