NC Bench
Run: 2026-02-20-01KHXPXBMY806YVCG1XNA0D3RK
Run Details
Model: Mistral Large 3
Test: Language Writing
Scenario: Character dialogue (Spanish) in a story
Duration: 15.6s
Cost: $0.001258
Provider: Mistral
Evaluation summary
Score: 100.00%
Evaluator: Parse dialogue
Model input
System
You are an expert novel author in the cozy mystery genre. Take the following details about characters/locations/... into account:
<character name="Maria">
Maria is a 30-year old college student on a study abroad program. Born in Spain, she is fluent in Spanish and does not yet know any English. She enjoys landscape photography and is always seen with her camera.
</character>
<character name="Hunter">
Hunter is a 35-year old detective who is investigating a series of mysterious disappearances in the small town of Alborn, MN. He is a former marine and has a no-nonsense attitude. Hunter has no knowledge of Spanish, not even basic phrases.
</character>
<location name="Alborn, MN">
Alborne is a small town in the Midwest with a population of 3,000. It is known for its annual apple festival.
</location>
User/Human
Write the following scene: Maria is sitting at a cafe, looking at her camera. Hunter approaches her and asks her about a recent event in the town. She responds troubled and shaken in Spanish, and Hunter looks confused. They talk for a while, until Hunter gets a call and has to leave. Return the dialogue between Maria and Hunter as a voice sheet: Maria: "..." Hunter: "..."
Model response
Messages
Meta
AI
**Voice Sheet: Café Conversation in Alborn, MN**
*(Maria sits at a small wrought-iron table outside the café, her camera resting in her lap. She flips through the photos she’s taken, her brow furrowed. The late afternoon sun casts long shadows across the quiet street. Hunter approaches, his boots scuffing lightly against the pavement. He hesitates before pulling out the chair across from her.)*
**Hunter:** *(clears throat, polite but firm)* Excuse me, miss. Mind if I sit for a minute?
**Maria:** *(startled, looks up, then relaxes slightly when she sees his badge clipped to his belt. She nods, gesturing to the chair. In Spanish, her voice tight with unease)* Sí, por supuesto. ¿En qué puedo ayudarle?
**Hunter:** *(sits, frowning slightly at her words. He leans forward, elbows on the table)* Uh… I’m Detective Hunter. I’m looking into some… unusual activity around town. You’ve been here a few days, right?
**Maria:** *(nods quickly, gripping her camera strap. In Spanish, her words tumble out)* Sí, sí, desde el lunes. Pero algo está mal aquí. Anoche, escuché gritos cerca del bosque. Y hoy, vi a un hombre—no, dos hombres—llevando algo grande hacia un camión. *(shakes her head, eyes wide)* No sé qué era, pero no era normal.
**Hunter:** *(blinks, then rubs his temple. He tries to pick out familiar words, but fails. He gestures vaguely toward her camera)* You… take pictures, right? Did you get any of that on your camera?
**Maria:** *(nods again, flipping the camera around to show him the screen. She points at a blurry shot of the tree line, then another of a dark van. In Spanish, her voice rises with frustration)* ¡Mire! Aquí, y aquí. No sé qué es, pero no es seguro. Y la gente aquí… *(waves a hand)* No hablan conmigo. Como si tuvieran miedo.
**Hunter:** *(sighs, pinching the bridge of his nose. He tries again, slower)* Okay. Let’s try this. You saw… *(mimes driving a van, then points at the photo)* This? And heard… *(mimes a scream)* Last night?
**Maria:** *(nods emphatically, then pauses. She mimics taking a photo, then points at him, then at the camera. In Spanish, hopeful)* ¿Quiere ver las fotos? ¿Puede ayudarme?
**Hunter:** *(hesitates, then nods. He takes the camera, squinting at the images. His phone buzzes in his pocket. He ignores it at first, but it buzzes again—insistent. He exhales sharply, handing the camera back.)* Look, I gotta take this. But… *(points at her, then at the camera, then at himself)* Stay here. I’ll be back in ten. Don’t go anywhere alone, okay?
**Maria:** *(frowns, confused. In Spanish, worried)* ¿Qué? No entiendo. ¿Se va?
**Hunter:** *(stands, already pulling out his phone. He holds up a finger—"one minute"—then steps away, answering the call. His voice is low, urgent.)* Hunter.
*(Maria watches him go, her fingers tightening around her camera. She glances at the photos again, then at the quiet street, her unease growing.)*
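The response above follows a `**Name:** *(stage direction)* spoken text` pattern. As a rough illustration of how such a voice sheet could be split into speaker turns, here is a minimal Python sketch. It is not the benchmark's actual parser: the function name `parse_voice_sheet` and both regular expressions are hypothetical, and the sketch keeps all spoken text in a turn, so its output may differ from the `dialogue` values recorded under Evaluation details.

```python
import re

# Hypothetical patterns, assumed from the layout of the response above.
# A turn starts with "**Name:**" and runs until the next "**Name:**" or end of text.
TURN_RE = re.compile(
    r"\*\*(?P<name>[^*:]+):\*\*(?P<body>.*?)(?=\*\*[^*:]+:\*\*|\Z)",
    re.S,
)
# Stage directions are written as "*( ... )*" and contain no nested parentheses.
STAGE_RE = re.compile(r"\*\([^)]*\)\*")


def parse_voice_sheet(text):
    """Return a list of {"name", "dialogue"} dicts, one per speaker turn."""
    turns = []
    for match in TURN_RE.finditer(text):
        spoken = STAGE_RE.sub(" ", match.group("body"))
        spoken = " ".join(spoken.split())  # collapse leftover whitespace
        if spoken:
            turns.append({"name": match.group("name").strip(), "dialogue": spoken})
    return turns


sample = (
    "**Hunter:** *(clears throat)* Excuse me, miss. "
    "**Maria:** *(nods)* Sí, por supuesto. *(pauses)* ¿En qué puedo ayudarle?"
)
for turn in parse_voice_sheet(sample):
    print(turn["name"], "->", turn["dialogue"])
```

The title line `**Voice Sheet: ...**` is skipped because its colon is not immediately followed by the closing `**`, which is the one property this sketch relies on to tell speaker labels from other bold text.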
created_at
"2026-02-20T14:22:57.193Z"
2/20/2026, 2:22:57 PM
model
"mistralai/mistral-large-2512"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
935
moderation_latency
(null)
generation_time
15517
tokens_prompt
317
tokens_completion
779
native_tokens_prompt
269
native_tokens_completion
749
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.001258
router
(null)
provider_responses
0
endpoint_id
"fa718841-07ea-4b7e-b8a3-6069545c7e6a"
id
"a2d5e547211c4bd6acfc27add78e2f61"
is_byok
false
latency
935
model_permaslug
"mistralai/mistral-large-2512"
provider_name
"Mistral"
status
200
api_type
"completions"
id
"gen-1771597377-KVgUy8z9rTre3Yb2ctja"
upstream_id
"a2d5e547211c4bd6acfc27add78e2f61"
total_cost
0.001258
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Mistral"
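The metadata above is enough for a quick back-of-the-envelope throughput figure. The sketch below assumes, without confirmation from the page, that `generation_time` is in milliseconds and that `native_tokens_completion` counts the tokens produced during that time; the helper name `tokens_per_second` is hypothetical.

```python
def tokens_per_second(completion_tokens, generation_time_ms):
    """Average output rate, assuming generation_time_ms is in milliseconds."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from the metadata above.
native_tokens_prompt = 269
native_tokens_completion = 749
generation_time_ms = 15517
total_cost_usd = 0.001258

rate = tokens_per_second(native_tokens_completion, generation_time_ms)
# Blended cost over prompt and completion tokens; real pricing usually
# differs between input and output, so treat this as a rough average only.
cost_per_1k = total_cost_usd / (native_tokens_prompt + native_tokens_completion) * 1000

print(f"{rate:.1f} tokens/s, ${cost_per_1k:.6f} per 1k native tokens")
```

Under those assumptions the run averaged a little over 48 completion tokens per second.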
Evaluation details
Result: 100.00%
Evaluator: Parse dialogue
Details: n/a
Meta Data:
dialogue
0
name
"Hunter"
dialogue
"** Excuse me, miss. Mind if I sit for a minute?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.75825946817083
no
0.6777658431793769
da
0.6539792387543253
it
0.6057818659658344
et
0.6
sq
0.5815899581589958
tl
0.5756718528995757
hr
0.5664739884393063
ro
0.563953488372093
pt
0.5568685376661743
sv
0.5529061102831594
ca
0.5356037151702787
sl
0.5192307692307693
cs
0.5098039215686274
hu
0.5090016366612111
es
0.5008319467554077
pl
0.494949494949495
sk
0.4800693240901213
fr
0.4746059544658494
nl
0.41972920696324956
fi
0.4117647058823529
de
0.4094488188976378
lt
0.4094488188976378
ms
0.40711462450592883
yo
0.40711462450592883
tr
0.3288590604026846
is
0.3150684931506849
az
0.3071593533487298
lv
0.3071593533487298
vi
0.3071593533487298
be
0.1891891891891892
eu
0.1891891891891892
uk
0.17355371900826447
reliable
true
passes
true
1
name
"Maria"
dialogue
"** Sí, por supuesto. ¿En qué puedo ayudarle?"
detectedLang
"es"
heavyLang
"es"
scores
es
0.8191681735985533
sl
0.5412844036697247
ca
0.5348837209302325
it
0.4809688581314879
fr
0.4263862332695985
pt
0.41972920696324956
nl
0.33774834437086093
da
0.33035714285714285
sv
0.33035714285714285
yo
0.33035714285714285
no
0.31662870159453305
is
0.3087557603686636
sq
0.3023255813953488
tr
0.2682926829268293
hr
0.25742574257425743
lt
0.25
ro
0.25
sk
0.24433249370277077
tl
0.23273657289002556
fi
0.21875
cs
0.21052631578947367
en
0.20424403183023873
eu
0.2
reliable
true
passes
true
2
name
"Hunter"
dialogue
"** Uh… I’m Detective Hunter. I’m looking into some… unusual activity around town. You’ve been here a few days, right?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.8085269338779678
tl
0.5892661555312158
fr
0.4802494802494803
ro
0.4533527696793003
pt
0.44567627494456763
nl
0.43267776096822996
ca
0.41770186335403725
es
0.39172749391727496
de
0.3760399334442596
no
0.3734335839598998
it
0.36115843270868825
sq
0.3403693931398417
da
0.3333333333333333
cs
0.31255728689275897
sv
0.3055555555555556
ms
0.2654260528893242
sk
0.2625368731563422
et
0.2323439099283521
sl
0.22360248447204967
tr
0.20634920634920634
yo
0.2046659597030753
hr
0.19268030139935413
hu
0.17218543046357618
is
0.16294642857142855
lv
0.1506228765571914
fi
0.14772727272727273
pl
0.12280701754385964
lt
0.0985576923076923
az
0.08536585365853659
eu
0.08536585365853659
vi
0.08536585365853659
reliable
true
passes
true
3
name
"Maria"
dialogue
"** No sé qué era, pero no era normal."
detectedLang
"es"
heavyLang
"es"
scores
es
0.8104008667388949
ca
0.7191011235955056
tl
0.6794871794871795
pt
0.6788990825688073
it
0.651394422310757
en
0.644308943089431
ro
0.5772946859903382
tr
0.5668316831683169
ms
0.5345744680851064
lv
0.5308310991957105
hr
0.47761194029850745
no
0.47761194029850745
eu
0.4696969696969697
sk
0.4615384615384615
pl
0.4336569579288026
az
0.4318181818181818
de
0.4318181818181818
is
0.41275167785234906
yo
0.32950191570881227
da
0.2647058823529412
fi
0.2647058823529412
fr
0.2647058823529412
lt
0.2647058823529412
sq
0.2647058823529412
cs
0.25213675213675213
et
0.25213675213675213
nl
0.25213675213675213
sl
0.25213675213675213
sv
0.25213675213675213
reliable
true
passes
true
4
name
"Hunter"
dialogue
"** You… take pictures, right? Did you get any of that on your camera?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.823943661971831
tl
0.7060984570168993
ca
0.5575221238938053
fr
0.5299647473560517
ro
0.43661971830985913
de
0.42938659058487877
nl
0.4065281899109792
yo
0.3957703927492447
et
0.38650306748466257
no
0.375
is
0.32088285229202035
it
0.2831541218637993
sv
0.2779783393501805
cs
0.2766726943942134
fi
0.2727272727272727
ms
0.2727272727272727
da
0.2673992673992674
es
0.19839679358717435
hr
0.18032786885245902
hi
0.17355371900826447
pl
0.17355371900826447
lv
0.17184265010351968
sl
0.16317991631799164
sk
0.15789473684210525
reliable
true
passes
true
5
name
"Maria"
dialogue
"** No hablan conmigo. Como si tuvieran miedo."
detectedLang
"es"
heavyLang
"es"
scores
es
0.800531914893617
pt
0.5555555555555556
ca
0.5222929936305732
sk
0.4906621392190153
it
0.4889267461669506
tl
0.42196531791907516
eu
0.39999999999999997
no
0.3736951983298539
pl
0.3617021276595745
ro
0.3135011441647597
lv
0.30394431554524365
sq
0.29577464788732394
yo
0.2907801418439716
cs
0.2840095465393795
sl
0.27007299270072993
en
0.25742574257425743
hr
0.2518703241895262
fr
0.24812030075187969
et
0.23273657289002556
tr
0.22680412371134018
ms
0.2248062015503876
sv
0.2
lt
0.1935483870967742
reliable
true
passes
true
6
name
"Hunter"
dialogue
"** Last night?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.8222748815165876
tl
0.7933884297520661
et
0.506578947368421
nl
0.506578947368421
da
0.49664429530201337
de
0.49664429530201337
fr
0.49664429530201337
no
0.49664429530201337
yo
0.49664429530201337
reliable
true
passes
true
7
name
"Maria"
dialogue
"** ¿Quiere ver las fotos? ¿Puede ayudarme?"
detectedLang
"es"
heavyLang
"es"
scores
es
0.8084958217270194
nl
0.5080500894454383
ro
0.4821092278719397
it
0.4701348747591522
pt
0.462890625
de
0.44668008048289737
ca
0.4353182751540041
da
0.43064182194616984
fr
0.36342592592592593
lt
0.34210526315789475
no
0.33894230769230765
sk
0.3357487922705314
lv
0.3276283618581907
tl
0.22096317280453256
en
0.20520231213872833
tr
0.20289855072463767
cs
0.19590643274853803
hr
0.19590643274853803
sl
0.19590643274853803
et
0.1863905325443787
sq
0.1863905325443787
az
0.17664670658682635
pl
0.17664670658682635
sv
0.17664670658682635
reliable
true
passes
true
8
name
"Hunter"
dialogue
"** Stay here. I’ll be back in ten. Don’t go anywhere alone, okay?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.806378132118451
tl
0.630755864465682
pl
0.47073474470734744
yo
0.3928571428571429
nl
0.3675595238095238
it
0.3550834597875569
de
0.34615384615384615
lt
0.3275316455696203
sq
0.2951907131011609
sl
0.2659758203799654
ro
0.2504409171075838
hu
0.21586715867158673
sv
0.20112781954887218
cs
0.18893129770992365
sk
0.18893129770992365
az
0.1650294695481336
fr
0.1533864541832669
ms
0.15
reliable
true
passes
true
9
name
"Maria"
dialogue
"** ¿Qué? No entiendo. ¿Se va?"
detectedLang
"es"
heavyLang
"es"
scores
es
0.8248248248248248
ca
0.7286821705426356
pt
0.7269890795631825
it
0.6908127208480564
fr
0.6812386156648452
ro
0.5949074074074074
hr
0.5721271393643031
en
0.527027027027027
fi
0.5244565217391305
no
0.49567723342939485
cs
0.4372990353697749
sl
0.4299674267100977
lv
0.4281045751633987
sq
0.3902439024390244
yo
0.3794326241134752
da
0.3445692883895131
sk
0.34210526315789475
tr
0.3346007604562738
sv
0.32950191570881227
tl
0.3137254901960784
et
0.3055555555555556
nl
0.3055555555555556
pl
0.3
reliable
true
passes
true
10
name
"Hunter"
dialogue
"** Hunter."
detectedLang
"en"
heavyLang
"de"
scores
en
0.7797356828193832
de
0.6376811594202898
da
0.6212121212121212
no
0.609375
fr
0.5901639344262295
sv
0.5833333333333334
tl
0.5726495726495726
it
0.5575221238938053
nl
0.5575221238938053
pt
0.5575221238938053
ca
0.5412844036697247
ms
0.5412844036697247
ro
0.5412844036697247
sl
0.5412844036697247
sq
0.5412844036697247
tr
0.5412844036697247
reliable
false
passes
true
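The page does not state how `passes` or the overall score is derived. One reading consistent with the recorded values is that a line passes when `detectedLang` matches the language expected for its speaker, regardless of `heavyLang` or `reliable` (entry 10 passes with `heavyLang` "de" and `reliable` false). The Python sketch below illustrates that reading only; the `EXPECTED_LANG` mapping and both function names are assumptions, not the evaluator's code.

```python
# Assumption: one expected language per character, inferred from the system prompt.
EXPECTED_LANG = {"Maria": "es", "Hunter": "en"}


def line_passes(entry):
    """Pass when the detected language matches the speaker's expected language."""
    return entry["detectedLang"] == EXPECTED_LANG.get(entry["name"])


def score(entries):
    """Percentage of dialogue lines that pass; 0.0 for an empty list."""
    if not entries:
        return 0.0
    return 100.0 * sum(line_passes(e) for e in entries) / len(entries)


# Three entries abbreviated from the evaluation details above (0, 1 and 10).
entries = [
    {"name": "Hunter", "detectedLang": "en", "heavyLang": "en", "reliable": True},
    {"name": "Maria", "detectedLang": "es", "heavyLang": "es", "reliable": True},
    {"name": "Hunter", "detectedLang": "en", "heavyLang": "de", "reliable": False},
]
print(f"{score(entries):.2f}%")  # prints "100.00%"
```

A stricter variant could also require `reliable` to be true, which would fail the one-word line "Hunter." in this run; the recorded 100.00% suggests the evaluator does not do that.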