NC Bench
Run: 2025-04-14-01JRTSZZ4AE5969Y6BY7RWRAWT
Run Details
Model:
GPT-4.1
Test:
Language Writing
Scenario:
Character dialogue (Italian) in a story
Duration:
8.3s
Cost:
$0.004176
Provider:
OpenAI
Evaluation summary
Score:
100.00%
Evaluator:
Parse dialogue
Model input
System
You are an expert novel author in the cozy mystery genre. Take the following details about characters/locations/... into account: <character name="Isabella"> Isabella is a 30-year old college student on a study abroad program. Born in Italy, she is fluent in Italian and does not yet know any English. She enjoys landscape photography and is always seen with her camera. </character> <character name="Hunter"> Hunter is a 35-year old detective who is investigating a series of mysterious disappearances in the small town of Alborn, MN. He is a former marine and has a no-nonsense attitude. Hunter has no knowledge of Italian, not even basic phrases. </character> <location name="Alborn, MN"> Alborne is a small town in the Midwest with a population of 3,000. It is known for its annual apple festival. </location>
User/Human
Write the following scene: Isabella is sitting at a cafe, looking at her camera. Hunter approaches her and asks her about a recent event in the town. She responds troubled and shaken in Italian, and Hunter looks confused. They talk for a while, until Hunter gets a call and has to leave. Return the dialogue between Isabella and Hunter as a voice sheet: Isabella: "..." Hunter: "..."
Model response
Messages
Meta
AI
Isabella: (looking at her camera, quietly muttering to herself in Italian) "Come è possibile... Non posso crederci..."
Hunter: (approaching the table, badge clipped to his belt) "Excuse me, miss. I’m Detective Hunter. Mind if I ask you a few questions?"
Isabella: (startled, looking up) "Scusi? Non... non capisco." (She shakes her head, nervously clutching her camera.)
Hunter: (noticing her confusion) "Uh... do you speak English?" (He gestures to himself, then points at the camera.) "You were at the park yesterday, right? Did you see anything unusual?"
Isabella: (voice trembling) "Non parlo inglese... Io... io sono italiana." (She gestures helplessly, her eyes wide with worry.) "C’era... qualcosa di strano... ieri sera. Una persona... urlava." (She mimics a scream, lifting her hands for emphasis.)
Hunter: (frowning, clearly not understanding, but trying to stay patient) "Sorry, I don’t... I don’t understand. Park? Camera?" (He points to her camera, then makes a questioning gesture.)
Isabella: (frustrated, she opens her camera, scrolling through photos) "Guardi! Qui... qui!" (She shows him a blurry photo taken at dusk, a shadowy figure in the background.) "Qualcosa... paura..." (She wraps her arms around herself, shivering.)
Hunter: (peering at the screen) "Is this... last night?" (He points at the photo. "Last night?")
Isabella: (nodding) "Sì! Ieri sera!" (She repeats, "Ieri sera!" emphatically.)
Hunter: "Okay... well, thanks... I guess." (He sighs, scratching his head.) "I’ll, uh, need some help with translation." (His phone rings.) "Excuse me." (He answers the call, voice shifting to business.) "Yeah, this is Hunter. I’m on my way."
Isabella: (looking after him, anxious) "Per favore... qualcuno deve aiutarmi..."
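The "Parse dialogue" evaluator turns a voice sheet like the one above into name/dialogue entries. Its actual implementation is not shown on this page. The following is a minimal sketch under one assumption: each speaker turn yields a single entry holding the last quoted span that sits outside parenthetical stage directions. That rule happens to be consistent with the eleven entries listed under Evaluation details, but it is a guess, and the function name `parse_voice_sheet` is hypothetical.

```python
import re


def parse_voice_sheet(text, speakers):
    """Split a voice sheet into one {"name", "dialogue"} entry per turn.

    Assumption (not confirmed by the page): stage directions in
    parentheses are ignored, and the last remaining quoted span in a
    turn is taken as that turn's dialogue.
    """
    # Match "Name:" labels for the known speakers only.
    label = re.compile(r"\b(" + "|".join(map(re.escape, speakers)) + r"):")
    # With a capturing group, split() returns
    # [preamble, name1, body1, name2, body2, ...].
    parts = label.split(text)
    entries = []
    for name, body in zip(parts[1::2], parts[2::2]):
        # Drop parenthetical stage directions, including quotes inside them.
        spoken = re.sub(r"\([^()]*\)", " ", body)
        quotes = re.findall(r'"([^"]*)"', spoken)
        if quotes:
            entries.append({"name": name, "dialogue": quotes[-1]})
    return entries


# Excerpt of the response above, used as sample input.
sample = (
    'Isabella: (startled, looking up) "Scusi? Non... non capisco." '
    "(She shakes her head, nervously clutching her camera.) "
    'Hunter: (noticing her confusion) "Uh... do you speak English?" '
    "(He gestures to himself, then points at the camera.) "
    '"You were at the park yesterday, right? Did you see anything unusual?" '
    'Hunter: (peering at the screen) "Is this... last night?" '
    '(He points at the photo. "Last night?")'
)

entries = parse_voice_sheet(sample, ["Isabella", "Hunter"])
for entry in entries:
    print(entry["name"], "->", entry["dialogue"])
```

Restricting the label pattern to known speaker names keeps a sentence such as "Yeah, this is Hunter." from being read as a new turn, since it lacks the trailing colon.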
id
"gen-1744656400-JSba2YUDTu83SPnixl8z"
upstream_id
"chatcmpl-BMJ7IgQI25X8TF3JRlEMJIHIQ5Yng"
total_cost
0.004176
cache_discount
(null)
provider_name
"OpenAI"
created_at
"2025-04-14T18:46:48.540554+00:00"
4/14/2025, 6:46:48 PM
model
"openai/gpt-4.1-2025-04-14"
app_id
182717
streamed
true
cancelled
false
latency
376
moderation_latency
122
generation_time
7293
tokens_prompt
272
tokens_completion
453
native_tokens_prompt
272
native_tokens_completion
454
native_tokens_reasoning
0
num_media_prompt
(null)
num_media_completion
(null)
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.004176
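The reported cost can be cross-checked against the token counts above. The sketch below assumes per-token pricing of $2.00 per million prompt tokens and $8.00 per million completion tokens. These rates are not stated anywhere in this run record and are an assumption. It also assumes that `generation_time` is in milliseconds. Under those assumptions the native token counts (272 and 454) reproduce `total_cost`, whereas the normalized `tokens_completion` value of 453 would not.

```python
# Assumed list prices in USD per million tokens (not part of the run record).
PROMPT_USD_PER_MTOK = 2.00
COMPLETION_USD_PER_MTOK = 8.00


def estimate_cost(prompt_tokens, completion_tokens):
    """Return the estimated request cost in USD under the assumed prices."""
    return (
        prompt_tokens * PROMPT_USD_PER_MTOK
        + completion_tokens * COMPLETION_USD_PER_MTOK
    ) / 1_000_000


# native_tokens_prompt and native_tokens_completion from the metadata above.
cost = estimate_cost(272, 454)
print(f"estimated cost: ${cost:.6f}")  # 544 + 3632 = 4176 micro-dollars

# Rough decoding speed, assuming generation_time (7293) is in milliseconds.
tokens_per_second = 454 / (7293 / 1000)
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```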
Evaluation details
Result:
100.00%
Evaluator:
Parse dialogue
Details:
n/a
Meta Data:
dialogue
0
name
"Isabella"
dialogue
"Come è possibile... Non posso crederci..."
detectedLang
"it"
heavyLang
"it"
scores
it
0.8160412067696836
en
0.6905940594059405
fr
0.6376811594202898
pt
0.6183206106870229
ca
0.51171875
ro
0.43946188340807174
yo
0.3556701030927836
eu
0.258160237388724
pl
0.258160237388724
cs
0.23547400611620795
tl
0.23076923076923078
de
0.22839506172839508
hr
0.22839506172839508
sl
0.22839506172839508
reliable
true
passes
true
1
name
"Hunter"
dialogue
"Excuse me, miss. I’m Detective Hunter. Mind if I ask you a few questions?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.796983758700696
tl
0.6379310344827587
fr
0.6099554234769687
it
0.599236641221374
ca
0.5550847457627118
pt
0.5474137931034483
ro
0.5070422535211268
sq
0.49078564500484967
es
0.48173741362290223
no
0.4691607684529828
sv
0.45482866043613707
nl
0.4318181818181818
cs
0.42434210526315785
da
0.4179600886917961
hr
0.37722419928825623
hu
0.37200956937799046
et
0.37050359712230213
sl
0.3667068757539204
de
0.31282722513089006
pl
0.3055555555555556
ms
0.2818057455540356
tr
0.2818057455540356
yo
0.2818057455540356
sk
0.2778541953232462
lt
0.2164179104477612
fi
0.2081447963800905
is
0.2081447963800905
az
0.20212765957446807
lv
0.20212765957446807
vi
0.20212765957446807
be
0.11764705882352942
eu
0.11764705882352942
uk
0.10714285714285714
reliable
true
passes
true
2
name
"Isabella"
dialogue
"Scusi? Non... non capisco."
detectedLang
"it"
heavyLang
"it"
scores
it
0.7755834829443448
fr
0.6069182389937107
pt
0.5901639344262295
es
0.5874587458745875
en
0.5833333333333334
tl
0.5421245421245421
ro
0.423963133640553
eu
0.410377358490566
ca
0.3961352657004831
yo
0.37810945273631835
reliable
true
passes
true
3
name
"Hunter"
dialogue
"You were at the park yesterday, right? Did you see anything unusual?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.8152976966536288
tl
0.6304347826086956
da
0.4348404255319149
no
0.4318181818181818
nl
0.41460055096418735
et
0.3928571428571429
ms
0.3541033434650456
yo
0.29867986798679863
sq
0.2951907131011609
fr
0.26343154246100525
cs
0.1550695825049702
sv
0.1550695825049702
es
0.1533864541832669
pt
0.1533864541832669
de
0.14829659318637273
reliable
true
passes
true
4
name
"Isabella"
dialogue
"C’era... qualcosa di strano... ieri sera. Una persona... urlava."
detectedLang
"it"
heavyLang
"it"
scores
it
0.8011664899257688
ca
0.584717607973422
fr
0.5738636363636364
es
0.5222929936305732
ro
0.503968253968254
de
0.4762569832402235
tl
0.4673295454545454
hr
0.46120689655172414
ms
0.45652173913043476
sk
0.4028662420382166
sl
0.4009584664536741
en
0.3902439024390244
cs
0.37810945273631835
lv
0.3697478991596639
sq
0.36006825938566556
pt
0.3194192377495462
sv
0.29775280898876405
nl
0.28842504743833014
no
0.2857142857142857
da
0.2816091954022989
lt
0.2704280155642024
pl
0.267578125
yo
0.26035502958579887
is
0.25742574257425743
tr
0.251497005988024
az
0.23935091277890466
et
0.23935091277890466
vi
0.20550847457627117
eu
0.19527896995708155
hu
0.14383561643835618
reliable
true
passes
true
5
name
"Hunter"
dialogue
"Sorry, I don’t... I don’t understand. Park? Camera?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.768713204373423
tl
0.6650426309378806
no
0.6164574616457461
da
0.6153846153846154
sv
0.6104815864022662
it
0.5756172839506173
nl
0.5469522240527183
ro
0.5393634840871022
de
0.4907407407407407
ca
0.46601941747572817
pl
0.4258872651356994
cs
0.4073275862068965
sk
0.3692660550458715
hr
0.3357487922705314
is
0.3325242718446602
sq
0.3276283618581907
pt
0.26666666666666666
es
0.2647058823529412
yo
0.2403314917127072
fr
0.21652421652421652
ms
0.207492795389049
be
0.20289855072463767
tr
0.20289855072463767
hu
0.19590643274853803
az
0.1863905325443787
et
0.1863905325443787
lt
0.1863905325443787
lv
0.1863905325443787
uk
0.1863905325443787
vi
0.1863905325443787
reliable
true
passes
true
6
name
"Isabella"
dialogue
"Qualcosa... paura..."
detectedLang
"it"
heavyLang
"it"
scores
it
0.8191027496382056
pt
0.4658119658119659
ca
0.46120689655172414
fr
0.4318181818181818
en
0.4212962962962963
de
0.40191387559808617
reliable
true
passes
true
7
name
"Hunter"
dialogue
"Is this... last night?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.8451053283767038
tl
0.7563352826510721
nl
0.6537396121883656
yo
0.6081504702194357
hu
0.5519713261648745
is
0.423963133640553
et
0.3811881188118812
da
0.3718592964824121
de
0.3718592964824121
fr
0.3718592964824121
no
0.3718592964824121
reliable
true
passes
true
8
name
"Isabella"
dialogue
"Sì! Ieri sera!"
detectedLang
"it"
heavyLang
"it"
scores
it
0.8210023866348448
yo
0.691358024691358
fr
0.6495327102803737
ro
0.625
reliable
true
passes
true
9
name
"Hunter"
dialogue
"Yeah, this is Hunter. I’m on my way."
detectedLang
"en"
heavyLang
"en"
scores
en
0.8185776487663279
tl
0.6782496782496783
nl
0.4703389830508474
yo
0.43820224719101125
fr
0.419953596287703
pl
0.4131455399061033
cs
0.39172749391727496
sk
0.39172749391727496
ca
0.3872549019607844
et
0.3811881188118812
hu
0.3811881188118812
fi
0.375
is
0.2690058479532163
de
0.26035502958579887
hr
0.26035502958579887
hi
0.251497005988024
da
0.24698795180722888
no
0.2378048780487805
sv
0.21875
it
0.20127795527156547
pt
0.20127795527156547
ms
0.19093851132686085
ro
0.19093851132686085
sl
0.19093851132686085
sq
0.19093851132686085
tr
0.19093851132686085
reliable
true
passes
true
10
name
"Isabella"
dialogue
"Per favore... qualcuno deve aiutarmi..."
detectedLang
"it"
heavyLang
"it"
scores
it
0.8026835043409629
ca
0.5726495726495726
pt
0.5543672014260249
fr
0.5274102079395085
en
0.49899799599198397
de
0.3961352657004831
sq
0.38875305623471884
nl
0.3842364532019704
sv
0.3842364532019704
no
0.3670886075949367
sl
0.36224489795918363
lt
0.3206521739130435
az
0.29577464788732394
hr
0.258160237388724
es
0.2378048780487805
sk
0.23547400611620795
ro
0.22360248447204967
da
0.20634920634920634
reliable
true
passes
true
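The page does not document how the per-entry `passes` flags roll up into the 100.00% score. One plausible reading, sketched below, is that an entry passes when its detected language matches the language expected for that character, and the score is the share of passing entries. The expected-language mapping, the use of `reliable` as a precondition, and the helper names are all assumptions. In every entry listed above, `detectedLang` is the key with the highest value in `scores`, which `top_language` mirrors.

```python
# Assumed from the character descriptions in the system prompt:
# Isabella speaks only Italian, Hunter speaks only English.
EXPECTED_LANG = {"Isabella": "it", "Hunter": "en"}


def top_language(scores):
    """Return the language code with the highest score."""
    return max(scores, key=scores.get)


def entry_passes(entry):
    """Assumed rule: detection is reliable and matches the expected language."""
    return entry["reliable"] and entry["detectedLang"] == EXPECTED_LANG.get(entry["name"])


def score(entries):
    """Percentage of entries that pass (0.0 for an empty list)."""
    if not entries:
        return 0.0
    return 100.0 * sum(entry_passes(e) for e in entries) / len(entries)


# The eleven entries above alternate Isabella ("it") and Hunter ("en").
names = ["Isabella", "Hunter"] * 5 + ["Isabella"]
entries = [
    {"name": n, "detectedLang": EXPECTED_LANG[n], "reliable": True}
    for n in names
]
print(f"{score(entries):.2f}%")

# Scores for entry 6 ("Qualcosa... paura..."), rounded to four decimals.
entry6_scores = {"it": 0.8191, "pt": 0.4658, "ca": 0.4612,
                 "fr": 0.4318, "en": 0.4213, "de": 0.4019}
print(top_language(entry6_scores))
```

Under this reading, a single Isabella line detected as English would drop the score to 10 out of 11, about 90.91%.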