NC Bench
Run: 2026-03-18-01KM0ADQXYG6R2A20FP62PJ22B
Run Details
Model: Mistral Small 4
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 3.3s
Cost: $0.000186
Provider: Mistral
Evaluation summary
Matches word count: 6.09%
Dialogue to Total Word Ratio: 0.00%
Overall: 3.0453%
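The overall figure is consistent with an unweighted mean of the two evaluator scores: (6.09 + 0.00) / 2 = 3.045, which matches the reported 3.0453% if the word-count score is roughly 6.0906% before rounding. The page does not state how NC Bench aggregates scores, so the sketch below is an assumption, and `overall_score` is a hypothetical helper name.

```python
from statistics import mean


def overall_score(scores_pct):
    """Aggregate evaluator scores (in percent) as an unweighted mean.

    Assumption: the benchmark's real aggregation is not documented on this
    page; a plain mean simply happens to reproduce the reported number.
    """
    if not scores_pct:
        raise ValueError("at least one evaluator score is required")
    return mean(scores_pct)


# Rounded evaluator scores as displayed in the summary.
overall = overall_score([6.09, 0.00])
print(f"Overall: {overall:.3f}%")
```

With the rounded inputs this gives about 3.045%; the extra digit in the reported 3.0453% would come from unrounded evaluator scores.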
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The rain drummed against the roof of the stolen ice cream truck Molly and Bill had claimed as their temporary HQ. "I still think this is a stupid idea," Bill muttered, picking at the frostbite blisters on his fingers. Molly ignored him, unfolding a glittery birthday invitation printed at Kinko's. "Harold's turning four. Three-year-old nephew got a gigantic Elmo playset we could hawk for, like, three Benjamins." Bill groaned. "Ew. That plush red menace gives me the heebie-jeebies." "Sentimental value, Bill. Grandma paid retail. Also, rumor is Harold’s dad hides his keys in the *second* cookie jar labeled 'Emergency Snacks.'" Bill squinted at the yacht-blue diagram Molly drew of the McMansion across the street. "We tripping alarms, cameras, dads with .45s from their concealed-carry parent permit classes." Molly tossed him a fountain pen and a children's picture book titled "My First Crime Scene." "Dude, kids party’s gonna be chaos—cupcake avalanches, piñata punches, parents snapping pics of snotty Jonah. We blend right in. Dress like clowns." Bill shuddered. "Clowns. I'd rather face the dad with the .45."
created_at: "2026-03-18T11:13:49.516Z"
model: "mistralai/mistral-small-2603"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 378
moderation_latency: (null)
generation_time: 3213
tokens_prompt: 77
tokens_completion: 296
native_tokens_prompt: 81
native_tokens_completion: 289
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00018555
router: (null)
provider_responses:
  0:
    endpoint_id: "f5ed6283-4e17-457d-98cd-2ec0c318171d"
    id: "8cd168a03d8743ea866070ad25fafd04"
    is_byok: false
    latency: 378
    model_permaslug: "mistralai/mistral-small-2603"
    provider_name: "Mistral"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773832429-yi1bOwyhISnHkX8DTN06"
upstream_id: "8cd168a03d8743ea866070ad25fafd04"
total_cost: 0.00018555
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Mistral"
Evaluation details
Evaluator: Matches word count
Result: 6.09%
Details: n/a
Meta Data: neededClean = false, words = 177

Evaluator: Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 59.89%, Deviation: 30.11%
Meta Data: neededClean = false, wordsTotal = 182, wordsDialogue = 109

Overall: 3.0453%
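The ratio and deviation reported for the dialogue evaluator can be reproduced from its metadata: 109 dialogue words out of 182 total is 59.89%, and the gap to the 90% target in the scenario is 30.11 percentage points. The sketch below shows only that arithmetic. The function names are hypothetical, treating the deviation as an absolute difference is an assumption, and the page does not document how a deviation is mapped to the 0.00% score.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return the share of words that are dialogue, as a percentage."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float) -> float:
    """Absolute distance from the target ratio, in percentage points.

    Assumption: the evaluator reports an unsigned deviation.
    """
    return abs(target_pct - ratio_pct)


# Values taken from the evaluator metadata; the 90% target is from the scenario.
ratio = dialogue_ratio(words_dialogue=109, words_total=182)
deviation = deviation_from_target(ratio, target_pct=90.0)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
```

Note that this evaluator counts 182 total words while the word-count evaluator reports 177, so the two evaluators evidently tokenize the response differently.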