NC Bench
Run: 2026-03-18-01KM09VW0V9KDWCN6DJNK4ESDN
Run Details
Model:
Mistral Small 4 (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 30% dialogue
Duration:
19.9s
Cost:
$0.001214
Provider:
Mistral
Evaluation summary
Score | Evaluator
0.00% | Matches word count
23.84% | Dialogue to Total Word Ratio
11.9199% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the makeshift map on the cluttered table, the blueprint dotted with crayon scribbles in what looked like every color a child had ever set to paper. Across the dimly lit room, Bill was sharpening a lockpick with a file, tiny metal shavings piling up like grimy snow along the edge of the table. The safehouse smelled faintly of reheated pizza and outright desperation. They were planning to rob a children’s birthday party. Not just any party—one billed as "Jungle Adventure Extravaganza." The invite had been so naively detailed in its planning that Molly still questioned the sanity of her career path while she traced the layout with her finger. The "Treasure Cave" was marked in what could only be a child’s handwriting, right next to a parent’s note about "no peanuts due to allergies." Molly pressed her forehead against the fogged-up window, the glass cool against her skin. Outside, the neighborhood glittered with the glow of inflatable castles and the unmistakable sound of someone practicing "Hot Cross Buns" on a kazoo. "We’re really doing this, aren’t we?" she asked the empty room. The words sounded as foolish coming out of her mouth as the plan felt in her head. Bill didn’t look up from his lockpick. "Stealing a pile of bath bombs and Lego sets? Yeah, I’d say we’re *very* committed to the bit." Molly shot him a look. "What if there’s a clown?" Bill shrugged, the file still working rhythmically. "Then we improvise. We tell it we’re the hired entertainment—drywall repair *and* balloon animals. Win-win for everyone involved." She groaned. "That’s the worst plan you’ve ever had." Bill beamed. "Thank you." Their getup had been assembled from a thrift store down the street: a neon green vest that made Molly’s skin crawl just to touch it, and a nametag that read "HELEN (HR)" in glittery letters. She’d duct-taped it to Bill’s chest with the enthusiasm of a teacher preparing for a parent-teacher conference. 
"Helen from HR," she muttered, shoving him toward the door. "You’re Dave from IT." Bill glared at the nametag now. "I look like a Dave." "You look like someone who would embezzle petty cash. That’s close enough." He tugged at the fabric. "I feel like a man who got dressed in the dark and then had a stranger tell him what he looks like." Molly adjusted his tie—bright red, because Bill had insisted. "You’ll be fine. You’re charming. Mostly." The plan was simple: blend in as party staff, locate the Treasure Cave, fill two duffel bags with unopened presents, and slip out before the cake was cut. Simple. Bill slung one of the bags over his shoulder with his best impression of a guy who had never committed a crime in his life. "Piece of cake," he said. Molly’s stomach dropped. "Oh no." Bill turned to her, wide-eyed faux innocence. "What?" "The cake thing," she hissed. "You *know* what it sounds like." Bill blinked. "It’s a metaphor?" "—It’s the *actual cake sitting in the middle of the room!* Also, it’s peanut butter frosting, and I’m *severely* allergic." Bill’s face fell into genuine horror. "You’re allergic to peanut butter?" "Not *severely enough* to bail on the heist," she said quickly, "but *severely enough* to warrant a dramatic death scene if we’re still here when they start cutting that cake." Bill looked at the window, where the distant laughter of children wafted through the thin glass like a siren song of doom. "You might want to mention that *before* we break in." Molly exhaled sharply and pulled out her phone. "I need to text the group chat." Her thumbs flew across the screen. "Hey team, just a heads up—there’s a *peanut butter cake* at the target location and I *will* expire if I inhale it. Prepare for either my glorious sacrifice or my vengeful return." Bill grinned. "Nothing like peer pressure to keep us motivated." She tossed her phone back into her bag and zipped it shut. "Let’s go steal some toys."
created_at
"2026-03-18T11:04:03.88Z"
3/18/2026, 11:04:03 AM
model
"mistralai/mistral-small-2603"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
378
moderation_latency
(null)
generation_time
19877
tokens_prompt
77
tokens_completion
2185
native_tokens_prompt
81
native_tokens_completion
2003
native_tokens_completion_images
(null)
native_tokens_reasoning
1145
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00121395
router
(null)
provider_responses
0
endpoint_id
"f5ed6283-4e17-457d-98cd-2ec0c318171d"
id
"3d3fe7ffc454461e9e6826279ac2dfe4"
is_byok
false
latency
372
model_permaslug
"mistralai/mistral-small-2603"
provider_name
"Mistral"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773831843-wGOoTBFiZ1eNhicWhZL8"
upstream_id
"3d3fe7ffc454461e9e6826279ac2dfe4"
total_cost
0.00121395
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Mistral"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 671
23.84% | Dialogue to Total Word Ratio | Ratio: 35.36%, Deviation: 5.36% | neededClean: false; wordsTotal: 676; wordsDialogue: 239
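The reported ratio and deviation can be reproduced from the wordsDialogue and wordsTotal values above. The following is a minimal sketch, not the benchmark's actual evaluator: the function names are hypothetical, the 30% target is taken from the scenario text, and treating the deviation as an absolute difference in percentage points is an assumption. How the 23.84% score is derived from the deviation is not stated on this page, so the sketch does not attempt it.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float = 30.0) -> float:
    """Assumed: deviation is the absolute gap to the target, in percentage points."""
    return abs(ratio_pct - target_pct)


# Values copied from the evaluation details of this run.
ratio = dialogue_ratio(words_dialogue=239, words_total=676)
deviation = deviation_from_target(ratio)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 35.36%, Deviation: 5.36%"
```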
11.9199%
Overall
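The overall figure is consistent with an unweighted mean of the two evaluator scores. The sketch below checks that arithmetic under this assumption; the page does not document the aggregation rule, and the dictionary is only an illustrative container for the scores shown above. Averaging the rounded scores gives 11.92%, so the displayed 11.9199% suggests the dialogue-ratio score is slightly below 23.84% before rounding.

```python
from statistics import fmean

# Per-evaluator scores as displayed on this page (already rounded to 2 decimals).
scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 23.84,
}

# Assumed aggregation: plain unweighted mean over all evaluators.
overall = fmean(list(scores.values()))

print(f"Overall: {overall:.2f}%")
# prints "Overall: 11.92%"
```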