NC Bench
Run: 2024-08-09-01J4W292H5AE2PAXZYAR5G15YG
Run Details
Model: GPT-4o, Aug. 6th (temp=1)
Test: Language Writing
Scenario: Character dialogue (Hindi) in a story
Duration: 5.6s
Cost: $0.004993
Provider: OpenAI
Evaluation summary
Score: 100.00%
Evaluator: Parse dialogue
Details Link: (details)
Model input
System
You are an expert novel author in the cozy mystery genre. Take the following details about characters/locations/... into account: <character name="Selvi"> Selvi is a 30-year old college student on a study abroad program. Born in India, she is fluent in Hindi and does not yet know any English. She enjoys landscape photography and is always seen with her camera. </character> <character name="Hunter"> Hunter is a 35-year old detective who is investigating a series of mysterious disappearances in the small town of Alborn, MN. He is a former marine and has a no-nonsense attitude. Hunter has no knowledge of Hindi, not even basic phrases. </character> <location name="Alborn, MN"> Alborne is a small town in the Midwest with a population of 3,000. It is known for its annual apple festival. </location>
User/Human
Write the following scene: Selvi is sitting at a cafe, looking at her camera. Hunter approaches her and asks her about a recent event in the town. She responds troubled and shaken in Hindi, and Hunter looks confused. They talk for a while, until Hunter gets a call and has to leave. Return the dialogue between Selvi and Hunter as a voice sheet: Selvi: "..." Hunter: "..."
Model response
Messages
Meta
AI
**Voice Sheet**

**Scene: Alborn Café**

*Selvi is seated at a small table by the window, her camera resting in front of her. She scrolls through the images, her brow furrowed. Hunter, noticing her troubled demeanor, approaches cautiously.*

---

**Hunter:** "Excuse me, miss. I noticed you were at the scene of the disappearances yesterday. Mind if I ask you a few questions?"

**Selvi:** "क्या हो रहा है? मैं... मैं मदद नहीं कर सकती।" *(Her voice is wavering, eyes wide and worried.)*

**Hunter:** "Uh, I'm sorry, ma'am, I don't understand. English? Do you speak English?"

*Selvi looks at him apologetically, shaking her head, and gestures helplessly with her camera.*

**Selvi:** "तस्वीरें... मैंने कुछ अजीब देखा।" *(She attempts to convey urgency, showing him the photos on her camera screen.)*

**Hunter:** "You're saying you saw something? In the pictures?" *He leans closer, trying to see the camera's tiny display.*

*Selvi nods vigorously, pointing at a particular image. Her hands tremble slightly.*

**Hunter:** "This is all... Greek to me. But let's see... Something odd, then? Something out of place?"

*Frustrated, Selvi points again, desperately hoping he understands.*

**Selvi:** "हाँ, वहां... कुछ तो गड़बड़ है।"

**Hunter:** "Okay, I get the message. Whatever it is, it's unusual. Thank you for trying to show me."

*Hunter's phone suddenly buzzes. He glances at the screen, appearing slightly agitated.*

**Hunter:** "I have to take this. Thank you for your time. If I could reach you somehow, perhaps? Never mind, I'll figure it out."

*He nods to her, apologizing silently, and steps outside to take the call.*

*Selvi watches him go, worry etched across her face as she returns her gaze to the evidence she clutches in her hands.*
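The "Parse dialogue" evaluator needs the spoken lines separated from the stage directions in the response above. The page does not show how it does this; the following is only a minimal sketch, assuming each line of dialogue appears as a bolded speaker name followed by a double-quoted string. The regular expression and the `parse_voice_sheet` helper are hypothetical, not NC Bench's actual code.

```python
import re

# Hypothetical pattern: **Name:** followed by a double-quoted line of dialogue.
# Headings such as **Voice Sheet** or **Scene: ...** do not match because the
# closing ** must come directly after the colon.
LINE_RE = re.compile(r'\*\*(?P<name>[^*:]+):\*\*\s*"(?P<dialogue>[^"]*)"')


def parse_voice_sheet(text: str) -> list[dict]:
    """Return the spoken lines in order, ignoring italic stage directions."""
    return [
        {"name": m["name"].strip(), "dialogue": m["dialogue"]}
        for m in LINE_RE.finditer(text)
    ]


# Shortened excerpt of the response, for illustration only.
sample = (
    '**Voice Sheet** **Scene: Alborn Café** *Selvi is seated.* --- '
    '**Hunter:** "Excuse me, miss." '
    '**Selvi:** "क्या हो रहा है?" *(Her voice is wavering.)* '
    '**Hunter:** "Uh, I\'m sorry, ma\'am, I don\'t understand."'
)

for entry in parse_voice_sheet(sample):
    print(entry["name"], "->", entry["dialogue"])
```

A pattern this strict would miss dialogue written without bold names or with typographic quotes, so a production parser would likely need to be more forgiving.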
id: "gen-xkTJONM5vJuwjvOS1SOhP7olvmI6"
total_cost: 0.0049925
upstream_id: "chatcmpl-9uNWGmIJeJIgCMSvmlo9MuHazP08Y"
created_at: "2024-08-09T17:16:50.149696+00:00"
model: "openai/gpt-4o-2024-08-06"
app_id: 182717
streamed: false
cancelled: false
provider_name: "OpenAI"
latency: 5219
moderation_latency: 183
generation_time: 0
finish_reason: "stop"
tokens_prompt: 273
tokens_completion: 510
native_tokens_prompt: 273
native_tokens_completion: 431
num_media_prompt: (null)
num_media_completion: (null)
origin: "https://ncbench.com/"
usage: 0.0049925
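The reported cost can be sanity-checked against the token counts. The sketch below assumes per-token rates of $2.50 per million prompt tokens and $10.00 per million completion tokens; these rates are not stated anywhere in the run record and are an assumption here, as is the `estimate_cost` helper. Under that assumption, the native token counts (273 and 431), rather than the normalized `tokens_completion` of 510, reproduce the recorded `total_cost`.

```python
from decimal import Decimal

# Assumed rates in USD per 1M tokens; not part of the run record.
PROMPT_PRICE_PER_M = Decimal("2.50")
COMPLETION_PRICE_PER_M = Decimal("10.00")


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the request cost in USD from token counts and the assumed rates."""
    million = Decimal(1_000_000)
    return (
        prompt_tokens * PROMPT_PRICE_PER_M
        + completion_tokens * COMPLETION_PRICE_PER_M
    ) / million


# native_tokens_prompt and native_tokens_completion from the metadata above.
cost = estimate_cost(273, 431)
print(cost)  # compare with the recorded total_cost
```

Decimal arithmetic is used so that the comparison with the recorded value is not affected by binary floating-point rounding.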
Evaluation details
Result: 100.00%
Evaluator: Parse dialogue
Details: n/a
Meta Data:
dialogue
0:
  name: "Hunter"
  dialogue: "Excuse me, miss. I noticed you were at the scene of the disappearances yesterday. Mind if I ask you a few questions?"
  detectedLang: "en"
  heavyLang: "en"
  scores: en 0.7990353697749196, tl 0.5863210148924435, fr 0.5765104460756635, it 0.5176848874598071, no 0.48840381991814463, nl 0.48559670781893005, pt 0.4592645998558039, ca 0.43052391799544415, da 0.41815360744763375, es 0.39418416801292405, sv 0.37290969899665555, sq 0.3654822335025381, yo 0.36224489795918363, sk 0.313815187557182, hr 0.3029739776951673, lv 0.3029739776951673, ms 0.30102516309412863, cs 0.29577464788732394, hu 0.293119698397738, et 0.2917847025495751, pl 0.28977272727272724, ro 0.28842504743833014, sl 0.24623115577889448, is 0.2323439099283521, lt 0.22680412371134018, de 0.16666666666666666, tr 0.1638795986622074, fi 0.15540540540540537, az 0.1506228765571914, vi 0.1506228765571914, be 0.08536585365853659, eu 0.08536585365853659, uk 0.07749077490774908
  reliable: true
  passes: true
1:
  name: "Selvi"
  dialogue: "क्या हो रहा है? मैं... मैं मदद नहीं कर सकती।"
  detectedLang: "hi"
  heavyLang: "hi"
  scores: hi 0.8855507868383404, mr 0.8762376237623762
  reliable: true
  passes: true
2:
  name: "Hunter"
  dialogue: "Uh, I'm sorry, ma'am, I don't understand. English? Do you speak English?"
  detectedLang: "en"
  heavyLang: "en"
  scores: en 0.7686375321336761, no 0.5379876796714579, da 0.5322245322245323, tl 0.4845360824742268, sv 0.48335246842709534, cs 0.47058823529411764, pl 0.4353826850690088, hr 0.4238156209987196, sk 0.3943472409152086, de 0.3706293706293706, pt 0.3626062322946175, sq 0.3580599144079886, ro 0.3440233236151603, it 0.32330827067669177, nl 0.2834394904458599, vi 0.27884615384615385, yo 0.2682926829268293, ca 0.23469387755102045, is 0.23339011925042585, sl 0.21052631578947367, ms 0.13793103448275862, be 0.1346153846153846, tr 0.1346153846153846, hu 0.12959381044487428, az 0.12280701754385964, et 0.12280701754385964, lt 0.12280701754385964, lv 0.12280701754385964, uk 0.12280701754385964
  reliable: true
  passes: true
3:
  name: "Selvi"
  dialogue: "तस्वीरें... मैंने कुछ अजीब देखा।"
  detectedLang: "hi"
  heavyLang: "hi"
  scores: hi 0.8862443122156107, mr 0.8857243319268636
  reliable: true
  passes: true
4:
  name: "Hunter"
  dialogue: "You're saying you saw something? In the pictures?"
  detectedLang: "en"
  heavyLang: "en"
  scores: en 0.8245614035087719, tl 0.6395468589083418, ms 0.48604992657856094, nl 0.4712990936555891, ro 0.45652173913043476, fr 0.43548387096774194, sv 0.4233937397034596, de 0.3848857644991213, yo 0.35779816513761464, ca 0.3269230769230769, no 0.3230174081237911, sl 0.3055555555555556, da 0.3, it 0.29577464788732394, tr 0.2222222222222222, et 0.21875, az 0.1935483870967742, pl 0.1935483870967742, lv 0.19168591224018475, cs 0.1822429906542056, es 0.1822429906542056, pt 0.1822429906542056, sq 0.1822429906542056
  reliable: true
  passes: true
5:
  name: "Hunter"
  dialogue: "This is all... Greek to me. But let's see... Something odd, then? Something out of place?"
  detectedLang: "en"
  heavyLang: "en"
  scores: en 0.8133631952220979, tl 0.638467100506146, nl 0.5224450811843362, no 0.490316004077472, fr 0.42857142857142855, yo 0.42528735632183906, et 0.41383352872215706, cs 0.3556701030927836, es 0.33244325767690247, pt 0.3197278911564626, sq 0.3055555555555556, pl 0.29775280898876405, sl 0.29478138222849076, da 0.26578560939794427, is 0.26578560939794427, ms 0.24471299093655588, de 0.2366412213740458, hu 0.23547400611620795, it 0.2331288343558282, ro 0.2248062015503876, sv 0.21996879875195008, sk 0.20886075949367086, lv 0.19614147909967844, hr 0.1883116883116883, lt 0.17218543046357618, fi 0.13043478260869565, ca 0.12126537785588754
  reliable: true
  passes: true
6:
  name: "Selvi"
  dialogue: "हाँ, वहां... कुछ तो गड़बड़ है।"
  detectedLang: "hi"
  heavyLang: "hi"
  scores: hi 0.8801534036433365, mr 0.8733535967578521
  reliable: true
  passes: true
7:
  name: "Hunter"
  dialogue: "Okay, I get the message. Whatever it is, it's unusual. Thank you for trying to show me."
  detectedLang: "en"
  heavyLang: "en"
  scores: en 0.8150530717272436, tl 0.6450617283950616, no 0.5683183183183182, da 0.5279146141215106, yo 0.4782214156079855, nl 0.4611059044048735, sq 0.43682664054848186, fr 0.4289970208540219, pt 0.41025641025641024, cs 0.3532058492688414, sv 0.3458475540386803, it 0.33449074074074076, lv 0.3203309692671395, et 0.31384248210023863, pl 0.31220095693779903, hr 0.3072289156626506, is 0.3072289156626506, es 0.30303030303030304, ms 0.2936117936117936, hu 0.2776381909547738, ca 0.26658163265306123, sl 0.26658163265306123, vi 0.2656449553001277, sk 0.26187419768934533, de 0.2404227212681638, lt 0.22506738544474394, ro 0.13533834586466165, fi 0.11538461538461538, be 0.10852713178294573, tr 0.10852713178294573, az 0.0987460815047022, uk 0.0987460815047022
  reliable: true
  passes: true
8:
  name: "Hunter"
  dialogue: "I have to take this. Thank you for your time. If I could reach you somehow, perhaps? Never mind, I'll figure it out."
  detectedLang: "en"
  heavyLang: "en"
  scores: en 0.8177083333333334, tl 0.6128318584070797, da 0.5286195286195287, no 0.4791666666666667, pt 0.35720844811753905, lv 0.33078393881453155, hr 0.32627526467757456, fr 0.3210475266731329, sv 0.317738791423002, cs 0.30966469428007887, ro 0.308300395256917, yo 0.3, sq 0.29789368104312935, sk 0.2964824120603015, ca 0.29292929292929293, sl 0.2813141683778234, pl 0.27159209157127995, hu 0.2631578947368421, lt 0.25690021231422505, es 0.24078091106290672, vi 0.2290748898678414, it 0.2081447963800905, ms 0.2054483541430193, nl 0.19724770642201836, et 0.1774383078730905, fi 0.12390488110137672, be 0.09090909090909091, tr 0.09090909090909091, de 0.08972691807542263, az 0.08256880733944955, is 0.08256880733944955, uk 0.08256880733944955
  reliable: true
  passes: true
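The page reports a `passes` flag per dialogue entry and an overall score of 100.00%, but it does not state the rule that links them. One plausible reading, sketched below purely as an assumption, is that an entry passes when the detection is reliable and the detected language matches the language the character is expected to speak (Hindi for Selvi, English for Hunter), and that the score is the share of passing entries. `EXPECTED_LANG`, `entry_passes`, `score`, and `top_margin` are hypothetical names, not part of NC Bench.

```python
# Assumed mapping, derived from the character descriptions in the system prompt.
EXPECTED_LANG = {"Selvi": "hi", "Hunter": "en"}


def entry_passes(entry: dict) -> bool:
    """Assumed rule: reliable detection and detected language == expected language."""
    expected = EXPECTED_LANG.get(entry["name"])
    return (
        expected is not None
        and entry["reliable"]
        and entry["detectedLang"] == expected
    )


def score(entries: list[dict]) -> float:
    """Percentage of entries that pass; 0.0 for an empty list."""
    if not entries:
        return 0.0
    return 100.0 * sum(entry_passes(e) for e in entries) / len(entries)


def top_margin(scores: dict[str, float]) -> float:
    """Gap between the best and second-best language score."""
    ranked = sorted(scores.values(), reverse=True)
    return ranked[0] - ranked[1] if len(ranked) > 1 else ranked[0]


# Two entries copied from the metadata above (scores trimmed to the top two).
entries = [
    {"name": "Hunter", "detectedLang": "en", "reliable": True,
     "scores": {"en": 0.8245614035087719, "tl": 0.6395468589083418}},
    {"name": "Selvi", "detectedLang": "hi", "reliable": True,
     "scores": {"hi": 0.8862443122156107, "mr": 0.8857243319268636}},
]

print(score(entries))
for e in entries:
    print(e["name"], round(top_margin(e["scores"]), 4))
```

The margin helper highlights something visible in the data: for Selvi's lines, Hindi and Marathi score within about 0.01 of each other, since both use the Devanagari script, so a check based only on the top-ranked language has little headroom for short utterances.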