NC Bench
Run: 2026-02-20-01KHXPZG39N70WY9EG9TZ7XXV0
Run Details
Model:
Mistral Small 3.2 24B
Test:
Dialogue tags
Scenario:
Write 500 words with 50% dialogue
Duration:
6.8s
Cost:
$0.000169
Provider:
Mistral
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0002% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the oversized sunglasses perched on her nose and squinted at the brightly colored banner fluttering in the breeze. "You sure this is the right place, Bill? It looks like a rainbow exploded on a cupcake." Bill, balancing a stack of stolen party supplies in his arms, grinned. "Positive. The kid’s parents are loaded. Presents will be top-tier. Maybe even a drone. Or a pony. You never know with these rich folks." Molly snorted. "A pony? You think they’re just gonna leave a pony under the tree?" "Stranger things have happened," Bill said, shifting the pile. "Besides, we’re not here for the pony. We’re here for the *value*. And the cake. That cake looks illegal." Molly peeked over the hedge, eyeing the backyard where a dozen children shrieked and chased each other with inflatable hammers. "How do we even get in? The place is swarming with tiny humans." Bill tapped his chin. "Distraction. Classic move. You go first. Flirt with the dad. Tell him you’re a long-lost cousin from Canada." Molly groaned. "I’m not doing your accent again. Last time, I got kicked out of a bingo hall." "Fine, fine. Then we go with Plan B. You cause a scene. Fake a seizure or something." Molly crossed her arms. "I am *not* faking a seizure." "Okay, okay. How about this—you ‘accidentally’ trip into the gift table. Knock it over. Chaos ensues. We grab the loot." Molly considered it. "That’s… actually not terrible. But what if someone recognizes us?" Bill waved a hand. "Pfft. Who’s gonna recognize us? We’re in disguise." Molly looked down at her outfit—a neon pink tutu over her black pants, a tiara, and a name tag that read "Auntie Molly." "This is the worst disguise ever." "Exactly," Bill said. "No one expects the thief to be the one wearing a tutu. It’s genius." Molly sighed. "Alright, fine. But if this goes south, I’m blaming you." Bill clapped her on the shoulder. "Deal. Now, on three, you go in. 
One… two…" Molly took a deep breath and stepped forward—only to trip over a garden gnome. Bill facepalmed. "Or we could just go with the accident plan."
created_at
"2026-02-20T14:24:07.283Z"
2/20/2026, 2:24:07 PM
model
"mistralai/mistral-small-3.2-24b-instruct-2506"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
323
moderation_latency
(null)
generation_time
6658
tokens_prompt
77
tokens_completion
560
native_tokens_prompt
69
native_tokens_completion
541
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0001692
router
(null)
provider_responses
0
endpoint_id
"23fd6b91-74d8-45fc-ac00-cf141ec7f4a5"
id
"fece46a7c6b54509b963fb6ababfea2e"
is_byok
false
latency
323
model_permaslug
"mistralai/mistral-small-3.2-24b-instruct-2506"
provider_name
"Mistral"
status
200
api_type
"completions"
id
"gen-1771597447-Hq6nVIvK5q0Ndrzg3k2W"
upstream_id
"fece46a7c6b54509b963fb6ababfea2e"
total_cost
0.0001692
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Mistral"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 364
0.00% | Dialogue to Total Word Ratio | Ratio: 65.66%, Deviation: 15.66% | neededClean: false, wordsTotal: 364, wordsDialogue: 239
0.0002%
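The figures in the evaluation details can be reproduced with simple arithmetic. The sketch below is illustrative only: the function names, the straight-double-quote heuristic for detecting dialogue, and the 0.5 and 500 targets (taken from the scenario text) are assumptions, not the benchmark's actual evaluator. How the evaluators map these deviations to the reported 0.00% scores is not shown on this page, so it is not modelled here.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Count total words, words inside straight double quotes, and their ratio.

    Assumption: dialogue is whatever sits between a pair of straight double
    quotes. The real evaluator may tokenize or detect dialogue differently.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return total, dialogue, ratio


def deviation_from_target(ratio: float, target: float = 0.5) -> float:
    """Absolute distance between the observed ratio and the requested one."""
    return abs(ratio - target)


def word_count_shortfall(words: int, target: int = 500) -> float:
    """Relative distance between the observed and requested word counts."""
    return abs(target - words) / target


# Figures reported in this run: wordsTotal = 364, wordsDialogue = 239.
ratio = 239 / 364
print(f"Ratio: {ratio:.2%}")                                 # Ratio: 65.66%
print(f"Deviation: {deviation_from_target(ratio):.2%}")      # Deviation: 15.66%
print(f"Word count off by: {word_count_shortfall(364):.2%}")  # Word count off by: 27.20%
```

Calling `dialogue_ratio` on the model response would give a comparable count only if the evaluator uses the same quoting and tokenization rules, which this page does not confirm.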