NC Bench
Run: 2026-03-18-01KM0986EQFW9FA7QRYAW80P77
Run Details
Model: Mistral Small 4 (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 51.5s
Cost: $0.003442
Provider: Mistral
Evaluation summary
Score | Evaluator
0.00% | Matches word count
42.22% | Dialogue to Total Word Ratio
21.1097% | Overall
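The overall figure is numerically consistent with an unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so the sketch below is only a consistency check under that assumption; the `overall` helper is a hypothetical name, not part of NC Bench.

```python
# Assumption: the overall score is the unweighted mean of the evaluator scores.
# The page does not document the aggregation rule; this only checks consistency.
scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 42.22,  # displayed value, rounded to 2 decimals
}


def overall(evaluator_scores):
    """Return the unweighted mean of the evaluator scores, in percent."""
    return sum(evaluator_scores.values()) / len(evaluator_scores)


print(f"{overall(scores):.2f}%")  # prints 21.11%
```

The small gap between 21.11% and the reported 21.1097% would be explained by the ratio score being rounded to 42.22% for display.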
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The rain drummed against the windshield of their stolen getaway car—a vehicle they'd "borrowed" during what the dentist would later call "a moment of weakness." Inside, the dome light flickered like a horror movie effect as Molly sorted sticky notes labeled *Decorations*, *Food*, and *Presents (Sadly Mostly Useless)*. Bill sketched floor plans with the intensity of a man mapping escape routes from his own wedding. "I still say we hit the clown hire place," she muttered, eating a gummy bear that tasted like regret and artificial flavors. Bill didn’t look up. "Molly, we are professionals. We do not steal balloons, noses, or cheap plastic jewelry that’ll break in a week." She scoffed. "Breaking into a kid’s third birthday where the theme is *Dinosaurs*? We are officially the b-team of villainy." Bill tapped his pen against his teeth. "No security systems, no pressure plates, just one distracted magician and a room filled with sugar. I timed it: piñata smash marks our perfect escape window." Molly groaned. "There’s a *clown*." "And?" "And clowns *remember faces*! That’s literally their brand!" He tossed the napkin onto the dashboard. "Then we’ll wear ski masks. Or full balaclavas. Or we pretend to be entertainers too." She tore his napkin in half. "Fine. But if I’m stuck refereeing a screaming match over who gets the blue dinosaur while you haul loot to the car, I’m blaming you when glitter permanently stains my favorite sweater." Bill grinned. "Deal. And I call dibs on the Paw Patrol figurine." Molly nailed him with a gummy bear. "You can have it. I heard somewhere that collectible figurines lose value if you play with them anyway."
created_at: "2026-03-18T10:53:19.202Z" (3/18/2026, 10:53:19 AM)
model: "mistralai/mistral-small-2603"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 467
moderation_latency: (null)
generation_time: 51416
tokens_prompt: 77
tokens_completion: 5539
native_tokens_prompt: 81
native_tokens_completion: 5717
native_tokens_completion_images: (null)
native_tokens_reasoning: 5098
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00344235
router: (null)
provider_responses:
  0:
    endpoint_id: "f5ed6283-4e17-457d-98cd-2ec0c318171d"
    id: "a62fbad5891d4b51a877fb6cc191fa1b"
    is_byok: false
    latency: 454
    model_permaslug: "mistralai/mistral-small-2603"
    provider_name: "Mistral"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773831199-vepAhdjUHJoewRY7kDqp"
upstream_id: "a62fbad5891d4b51a877fb6cc191fa1b"
total_cost: 0.00344235
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Mistral"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 275
42.22% | Dialogue to Total Word Ratio | Ratio: 59.64%, Deviation: 9.64% | neededClean: false; wordsTotal: 275; wordsDialogue: 164
21.1097% | Overall | |
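The ratio and deviation shown for the dialogue evaluator can be recomputed from the reported counts. The sketch below does that, and also shows one naive way such counts might be obtained: splitting on whitespace and treating text inside straight double quotes as dialogue. The two counting helpers are hypothetical; the evaluator's actual tokenization is not described on this page, and the mapping from a 9.64% deviation to a 42.22% score is likewise unstated, so it is not reproduced here.

```python
import re

TARGET_RATIO = 0.50   # from the scenario: "50% dialogue"
WORDS_TOTAL = 275     # wordsTotal reported above
WORDS_DIALOGUE = 164  # wordsDialogue reported above


def dialogue_stats(words_dialogue, words_total, target=TARGET_RATIO):
    """Return (ratio, absolute deviation from target) as fractions."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


def count_words(text):
    """Hypothetical helper: count whitespace-separated words."""
    return len(text.split())


def count_dialogue_words(text):
    """Hypothetical helper: count words inside straight double quotes."""
    return sum(len(span.split()) for span in re.findall(r'"([^"]*)"', text))


ratio, deviation = dialogue_stats(WORDS_DIALOGUE, WORDS_TOTAL)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints: Ratio: 59.64%, Deviation: 9.64%
```

A quote-based counter like this would also count quoted phrases that are not speech, such as "borrowed" in the opening sentence of the response, so it may not match the evaluator's numbers exactly.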