NC Bench
Run: 2025-04-14-01JRTT00DKX9B0NP90QST6C0ZE
Run Details
Model:
GPT-4.1 Nano
Test:
Language Writing
Scenario:
Character dialogue (Italian) in a story
Duration:
2.9s
Cost:
$0.000152
Provider:
OpenAI
Evaluation summary
Score:
100.00%
Evaluator:
Parse dialogue
Model input
System
You are an expert novel author in the cozy mystery genre. Take the following details about characters/locations/... into account: <character name="Isabella"> Isabella is a 30-year old college student on a study abroad program. Born in Italy, she is fluent in Italian and does not yet know any English. She enjoys landscape photography and is always seen with her camera. </character> <character name="Hunter"> Hunter is a 35-year old detective who is investigating a series of mysterious disappearances in the small town of Alborn, MN. He is a former marine and has a no-nonsense attitude. Hunter has no knowledge of Italian, not even basic phrases. </character> <location name="Alborn, MN"> Alborne is a small town in the Midwest with a population of 3,000. It is known for its annual apple festival. </location>
User/Human
Write the following scene: Isabella is sitting at a cafe, looking at her camera. Hunter approaches her and asks her about a recent event in the town. She responds troubled and shaken in Italian, and Hunter looks confused. They talk for a while, until Hunter gets a call and has to leave. Return the dialogue between Isabella and Hunter as a voice sheet: Isabella: "..." Hunter: "..."
Model response
AI
Isabella: "Mi dispiace, non posso aiutarla. È troppo recente, troppo... confuso per me." (She looks down, her hands trembling slightly as she gazes at her camera and then at the empty street.)
Hunter: "Pardon? I didn’t catch that. You speak English, right?" (He steps closer, frowning slightly, trying to make sense of her troubled expression.)
Isabella: "No, no, I… I only speak Italian. This town—qualcosa è successo, ma non posso parlarne così semplicemente." (Her voice trembles, eyes darting around as if worried someone might overhear.)
Hunter: "Alright. Look, I’m just asking about recent events. You seem upset—did you see or hear anything unusual?" (He softens his tone, trying to sound reassuring but professional.)
Isabella: "Ho visto, sì... ma non posso dirlo. È troppo difficile, e ci sono cose che non capisco nemmeno io." (Her face is shadowed with worry, clutching her camera like a safety measure.)
Hunter: "Got it. Thanks for your time. If you remember anything else, you can come by the station." (His phone buzzes in his pocket—the expression on his face shifts slightly.) "Excuse me—I've got to take this. I’ll catch you later."
Isabella: "Va bene." (She nods silently, watching him walk away, the weight of her secret lingering in her eyes.)
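The response above follows the requested voice-sheet format (`Name: "..."`), which is what the "Parse dialogue" evaluator consumes. The evaluator's actual parsing rules are not shown on this page, so the following is only a minimal sketch of one way to extract speaker/line pairs. The function name `parse_voice_sheet` and the regular expression are assumptions made for illustration, and this sketch keeps only the first quoted segment after each speaker label, which may differ from what the real evaluator does for turns with several quoted segments.

```python
import re

# Hypothetical pattern (an assumption, not the benchmark's own):
# a speaker name, a colon, then one double-quoted utterance.
TURN_RE = re.compile(r'(\w+):\s*"([^"]*)"')


def parse_voice_sheet(text):
    """Return a list of {"name", "dialogue"} dicts, one per matched turn."""
    return [
        {"name": match.group(1), "dialogue": match.group(2)}
        for match in TURN_RE.finditer(text)
    ]


# First two turns of the model response, copied from this page.
sample = (
    'Isabella: "Mi dispiace, non posso aiutarla. È troppo recente, '
    'troppo... confuso per me." (She looks down, her hands trembling '
    "slightly as she gazes at her camera and then at the empty street.) "
    'Hunter: "Pardon? I didn’t catch that. You speak English, right?" '
    "(He steps closer, frowning slightly, trying to make sense of her "
    "troubled expression.)"
)

turns = parse_voice_sheet(sample)
for turn in turns:
    print(turn["name"], "->", turn["dialogue"])
```

Stage directions in parentheses are skipped because they sit outside the quoted segment, so only the spoken text would reach a language detector.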
id
"gen-1744656401-OgxheZWOuLzGmnEGHhly"
upstream_id
"chatcmpl-BMJ7Jp0sFtWiyr54jO8PF2ZFaSg0M"
total_cost
0.000152
cache_discount
(null)
provider_name
"OpenAI"
created_at
"2025-04-14T18:46:44.451331+00:00"
model
"openai/gpt-4.1-nano-2025-04-14"
app_id
182717
streamed
true
cancelled
false
latency
432
moderation_latency
96
generation_time
1594
tokens_prompt
272
tokens_completion
311
native_tokens_prompt
272
native_tokens_completion
312
native_tokens_reasoning
0
num_media_prompt
(null)
num_media_completion
(null)
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.000152
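The reported cost can be related to the token counts in the metadata above. The sketch below assumes per-token list prices of $0.10 per million input tokens and $0.40 per million output tokens for this model. These prices are not stated on this page and are an assumption, as is the choice of the `native_tokens_*` counts as the billing basis. The helper name `estimate_cost` is hypothetical.

```python
# Assumed prices in USD per one million tokens (not taken from this page).
PRICE_PER_M_INPUT = 0.10
PRICE_PER_M_OUTPUT = 0.40


def estimate_cost(prompt_tokens, completion_tokens):
    """Estimate the USD cost of one request from its token counts."""
    return (
        prompt_tokens * PRICE_PER_M_INPUT
        + completion_tokens * PRICE_PER_M_OUTPUT
    ) / 1_000_000


# native_tokens_prompt and native_tokens_completion from the metadata above.
cost = estimate_cost(272, 312)
print(f"{cost:.6f}")  # prints 0.000152
```

Under these assumed prices the estimate agrees with the `total_cost` and `usage` values shown for this run.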
Evaluation details
Result:
100.00%
Evaluator:
Parse dialogue
Details:
n/a
Meta Data:
dialogue
0
name
"Isabella"
dialogue
"Mi dispiace, non posso aiutarla. È troppo recente, troppo... confuso per me."
detectedLang
"it"
heavyLang
"it"
scores
it
0.8001682793437106
es
0.6329211746522412
pt
0.6178600160901045
ca
0.6061359867330016
fr
0.6061359867330016
en
0.573225516621743
sq
0.49788583509513745
ro
0.4870410367170626
nl
0.46206115515288787
sk
0.4483159117305459
no
0.4128553770086526
tl
0.4077306733167082
de
0.3886743886743887
cs
0.37582128777923784
tr
0.3466299862448418
sl
0.3384401114206128
yo
0.32239657631954355
lt
0.2759146341463415
eu
0.2578125
sv
0.2448330683624801
hr
0.22889610389610393
az
0.17677642980935876
et
0.16666666666666666
hu
0.1517857142857143
ms
0.15026833631484796
is
0.14721723518850988
da
0.1410488245931284
pl
0.1410488245931284
fi
0.13636363636363635
reliable
true
passes
true
1
name
"Hunter"
dialogue
"Pardon? I didn’t catch that. You speak English, right?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.8080614203454894
tl
0.688715953307393
no
0.42938659058487877
sv
0.42938659058487877
da
0.41605839416058393
cs
0.4143484626647145
de
0.355877616747182
hr
0.3453355155482815
lt
0.3254637436762226
it
0.3220338983050847
tr
0.2982456140350877
fr
0.2920353982300885
ro
0.28825622775800713
yo
0.28698752228163993
ca
0.25650557620817843
sq
0.250936329588015
pl
0.23954372623574144
ms
0.15254237288135594
sk
0.15254237288135594
be
0.14893617021276595
hu
0.14346895074946467
az
0.13606911447084233
et
0.13606911447084233
is
0.13606911447084233
lv
0.13606911447084233
uk
0.13606911447084233
vi
0.13606911447084233
reliable
true
passes
true
2
name
"Isabella"
dialogue
"No, no, I… I only speak Italian. This town—qualcosa è successo, ma non posso parlarne così semplicemente."
detectedLang
"it"
heavyLang
"it"
scores
it
0.7762345679012346
en
0.6912265758091993
pt
0.6317927882173692
ca
0.5737801293356849
fr
0.5681953543776057
ro
0.4715743440233236
tl
0.46136701337295694
da
0.43798449612403106
sv
0.42092651757188504
no
0.4204636290967226
sq
0.40329218106995884
es
0.3902439024390244
pl
0.38245315161839866
hr
0.3803418803418803
sk
0.34802158273381295
ms
0.34270172257479603
nl
0.3403093721565059
yo
0.3397085610200365
tr
0.32683379758588665
de
0.28430404738400794
lt
0.25944841675178754
lv
0.25869120654396727
cs
0.23844537815126046
et
0.23764458464773922
hu
0.22954303931987244
sl
0.22954303931987244
az
0.2093784078516903
eu
0.18539325842696627
vi
0.16281755196304848
fi
0.1559953434225844
be
0.0880503144654088
is
0.0799492385786802
uk
0.0799492385786802
reliable
true
passes
true
3
name
"Hunter"
dialogue
"Alright. Look, I’m just asking about recent events. You seem upset—did you see or hear anything unusual?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.7939560439560439
tl
0.5741324921135647
fr
0.424061433447099
sv
0.37788018433179726
et
0.3583650190114069
nl
0.34466019417475724
ca
0.3408203125
pt
0.33035714285714285
da
0.2541436464088398
no
0.25331858407079644
ro
0.25083240843507215
cs
0.24832962138084633
de
0.24832962138084633
es
0.24327354260089684
ms
0.18674698795180722
it
0.1837968561064087
sk
0.1818181818181818
fi
0.1346153846153846
sq
0.12903225806451613
lv
0.12564766839378239
yo
0.09879839786381843
is
0.09517426273458446
pl
0.09274193548387095
hr
0.08536585365853659
lt
0.08536585365853659
sl
0.08536585365853659
reliable
true
passes
true
4
name
"Isabella"
dialogue
"Ho visto, sì... ma non posso dirlo. È troppo difficile, e ci sono cose che non capisco nemmeno io."
detectedLang
"it"
heavyLang
"it"
scores
it
0.8214285714285714
fr
0.5640509725016767
pt
0.5310245310245311
es
0.512743628185907
yo
0.468954248366013
ca
0.45286195286195285
en
0.4519392917369309
tl
0.40693430656934304
ro
0.4036697247706422
sk
0.34343434343434337
no
0.3374108053007136
sq
0.30851063829787234
pl
0.30406852248394006
da
0.2655367231638418
lt
0.25287356321839083
az
0.2111650485436893
ms
0.20245398773006135
et
0.2004920049200492
hr
0.2004920049200492
hu
0.1965389369592089
sl
0.19254658385093168
fi
0.18851435705368288
de
0.18238993710691823
is
0.1761723700887199
lv
0.17197452229299362
cs
0.1402116402116402
eu
0.11804613297150611
vi
0.10714285714285714
tr
0.09722222222222222
nl
0.08835904628330996
sv
0.08835904628330996
reliable
true
passes
true
5
name
"Hunter"
dialogue
"Excuse me—I've got to take this. I’ll catch you later."
detectedLang
"en"
heavyLang
"en"
scores
en
0.8023194517659462
tl
0.6113989637305699
nl
0.438622754491018
no
0.4369369369369369
sl
0.4335347432024169
fr
0.4028662420382166
de
0.3811881188118812
pl
0.36006825938566556
cs
0.3578767123287671
da
0.34554973821989526
sv
0.3206521739130435
pt
0.29775280898876405
it
0.29245283018867924
sq
0.2829827915869981
yo
0.267578125
sk
0.26035502958579887
lv
0.2454728370221328
hr
0.23625254582484725
lt
0.21711899791231734
es
0.21383647798742136
et
0.20212765957446807
ms
0.1830065359477124
hi
0.16851441241685144
fi
0.16666666666666666
reliable
true
passes
true
6
name
"Isabella"
dialogue
"Va bene."
detectedLang
"it"
heavyLang
"it"
scores
it
0.8201438848920863
ca
0.7560975609756098
ro
0.732620320855615
fr
0.6753246753246753
es
0.6666666666666666
reliable
false
passes
true
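The per-line records above can be combined into the overall score. The pass criterion is not documented on this page, so the sketch below assumes a line passes when its `detectedLang` matches the language expected for the speaker (Italian for Isabella, English for Hunter, inferred from the character descriptions in the system prompt). It also assumes `reliable` does not gate the result, since the last record has `reliable` false and `passes` true. The names `EXPECTED_LANG`, `line_passes`, and `score` are hypothetical.

```python
# Assumed mapping from speaker to expected language (inferred, not documented).
EXPECTED_LANG = {"Isabella": "it", "Hunter": "en"}


def line_passes(record):
    """A line passes if its detected language is the speaker's expected one."""
    return record["detectedLang"] == EXPECTED_LANG.get(record["name"])


def score(records):
    """Percentage of dialogue lines that pass; 0.0 for an empty list."""
    if not records:
        return 0.0
    passed = sum(1 for record in records if line_passes(record))
    return 100.0 * passed / len(records)


# Speaker and detected language for the seven records listed above.
records = [
    {"name": "Isabella", "detectedLang": "it"},
    {"name": "Hunter", "detectedLang": "en"},
    {"name": "Isabella", "detectedLang": "it"},
    {"name": "Hunter", "detectedLang": "en"},
    {"name": "Isabella", "detectedLang": "it"},
    {"name": "Hunter", "detectedLang": "en"},
    {"name": "Isabella", "detectedLang": "it"},
]

print(f"{score(records):.2f}%")  # prints 100.00%
```

Under these assumptions all seven lines pass, which is consistent with the 100.00% reported for this run.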