NC Bench
Run: 2026-03-18-01KM08MW5X2K8YXK090VMZY4K7
Run Details
Model:
Mistral Small 4 (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 50% dialogue
Duration:
10.9s
Cost:
$0.000880
Provider:
Mistral
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.03% | Dialogue to Total Word Ratio
0.0153% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly sipped her coffee and unfolded a crumpled map of the community center where little Timmy Johnson was throwing his seventh birthday party. Across the table, Bill twirled a stolen fidget spinner he'd liberated from a kid earlier that day. "Alright, here's the deal," Molly said, tapping the map. "The presents are in the big hall by the piñata. No security except one overworked dad whose idea of surveillance is yelling 'no running' every five minutes." Bill squinted at the map. "And we're just gonna walk in, stuff our bags, and walk out? Seems too easy." Molly rolled her eyes. "Bill, it's a children's party. The most dangerous thing in there is the cake that may or may not have been baked by someone's aunt with a suspended food handler's permit. Speaking of which, I drew up our disguises." She slid a sheet of paper toward him featuring questionable crayon drawings. Bill peered at the paper. "We're dressing as clowns? Molly, last time we dressed as clowns, you sneezed and knocked over three shelves at the hardware store." "That was one time," Molly protested. "And it was ragweed season! Besides, clowns are perfect. Kids love clowns, parents trust clowns, and most importantly, clowns can get away with carrying large bags under those billowy sleeves." Bill sighed and rubbed his temples. "Fine. But when we get caught by some mom who realizes we're not on the approved entertainment list, I'm blaming you." "Deal. You distract while I scoop up the loot. We'll use the fire exit by the juice boxes—least likely to be guarded since everyone's distracted by the magician." Bill nodded slowly. "And if the magician catches on?" Molly grinned. "Then you use your special skill. Remember Vegas?" Bill groaned. "I still have nightmares about that pigeon." They clinked their coffee cups together, the ice in Molly's rattling ominously, a sound Bill had learned over the years meant trouble. But he joined the cheer anyway. 
What were the chances this party would go anything like they planned?
created_at
"2026-03-18T10:42:46.095Z"
3/18/2026, 10:42:46 AM
model
"mistralai/mistral-small-2603"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
383
moderation_latency
(null)
generation_time
10811
tokens_prompt
77
tokens_completion
1611
native_tokens_prompt
81
native_tokens_completion
1446
native_tokens_completion_images
(null)
native_tokens_reasoning
1089
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00087975
router
(null)
provider_responses
0
endpoint_id
"f5ed6283-4e17-457d-98cd-2ec0c318171d"
id
"6333da98975543dcb87c805020906bfc"
is_byok
false
latency
376
model_permaslug
"mistralai/mistral-small-2603"
provider_name
"Mistral"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773830566-pCqH5gLnb8cRugmmT1LR"
upstream_id
"6333da98975543dcb87c805020906bfc"
total_cost
0.00087975
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Mistral"
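The raw counters above can be turned into a couple of derived figures. The following is a minimal sketch, not part of the benchmark: it assumes `generation_time` is in milliseconds (which is consistent with the 10.9s duration shown in the run details) and simply compares `native_tokens_reasoning` with `native_tokens_completion`, without assuming how the provider nests the two. The helper names are hypothetical.

```python
# Values copied from the metadata listing above.
generation_time_ms = 10811        # assumption: unit is milliseconds
native_tokens_completion = 1446
native_tokens_reasoning = 1089


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Average generation throughput over the whole response."""
    return tokens / (elapsed_ms / 1000)


def ratio(part: int, whole: int) -> float:
    """Plain ratio of two counters, guarding against a zero denominator."""
    return part / whole if whole else 0.0


throughput = tokens_per_second(native_tokens_completion, generation_time_ms)
reasoning_ratio = ratio(native_tokens_reasoning, native_tokens_completion)

print(f"throughput: {throughput:.1f} tokens/s")
print(f"reasoning tokens / completion tokens: {reasoning_ratio:.1%}")
```

Under these assumptions, most of the tokens billed for this run were reasoning rather than visible prose, which may help explain a short visible answer despite a sizeable completion count.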
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 338
0.03% | Dialogue to Total Word Ratio | Ratio: 62.72%, Deviation: 12.72% | neededClean: false, wordsTotal: 338, wordsDialogue: 212
0.0153% | Overall
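The reported ratio and deviation follow directly from the two word counts: 212 dialogue words out of 338 total, measured against the 50% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way such counts could be obtained. It is an illustration only: the benchmark's actual tokenizer, quote handling, and scoring curve are not shown in this report, so the whitespace word split, the double-quote regex, and the function names are all assumptions.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (total_words, dialogue_words, ratio) for a prose passage."""
    # Assumption: a "word" is any whitespace-separated token.
    total = len(text.split())
    # Assumption: dialogue is whatever sits between double quotes
    # (straight or curly); single quotes are treated as apostrophes.
    spans = re.findall(r'["“](.*?)["”]', text, flags=re.DOTALL)
    dialogue = sum(len(span.split()) for span in spans)
    ratio = dialogue / total if total else 0.0
    return total, dialogue, ratio


def deviation_from_target(ratio: float, target: float = 0.5) -> float:
    """Absolute distance between the measured ratio and the requested one."""
    return abs(ratio - target)


# Word counts reported in the evaluation details above.
reported_ratio = 212 / 338
reported_deviation = deviation_from_target(reported_ratio)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {reported_deviation:.2%}")
```

Running `dialogue_ratio` on the model response may give slightly different counts than the evaluator did, since the split and quote rules here are guesses. How the 12.72% deviation and the 338-word count map to the 0.03% and 0.00% scores is not stated in the report, so that step is deliberately left out.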