NC Bench
Run: 2024-08-09-01J4W2A8VBCDJ7ZT5CJHVAKT8C
Run Details
Model:
GPT-4o, Aug. 6th (temp=0)
Test:
Language Writing
Scenario:
Character dialogue (Hindi) in a story
Duration:
5.2s
Cost:
$0.005173
Provider:
OpenAI
Evaluation summary
Score:
100.00%
Evaluator:
Parse dialogue
Model input
System
You are an expert novel author in the cozy mystery genre. Take the following details about characters/locations/... into account:
<character name="Selvi">
Selvi is a 30-year old college student on a study abroad program. Born in India, she is fluent in Hindi and does not yet know any English. She enjoys landscape photography and is always seen with her camera.
</character>
<character name="Hunter">
Hunter is a 35-year old detective who is investigating a series of mysterious disappearances in the small town of Alborn, MN. He is a former marine and has a no-nonsense attitude. Hunter has no knowledge of Hindi, not even basic phrases.
</character>
<location name="Alborn, MN">
Alborne is a small town in the Midwest with a population of 3,000. It is known for its annual apple festival.
</location>
User/Human
Write the following scene: Selvi is sitting at a cafe, looking at her camera. Hunter approaches her and asks her about a recent event in the town. She responds troubled and shaken in Hindi, and Hunter looks confused. They talk for a while, until Hunter gets a call and has to leave. Return the dialogue between Selvi and Hunter as a voice sheet: Selvi: "..." Hunter: "..."
Model response
Messages
Meta
AI
**Voice Sheet:**
**[Scene: A quaint cafe in Alborn, MN. The aroma of freshly brewed coffee fills the air. Selvi sits at a corner table, her camera resting on the table as she scrolls through her recent photographs, a troubled expression on her face. Hunter, the town's detective, approaches her table.]**
**Hunter:** "Excuse me, miss. I'm Detective Hunter. I noticed you were at the scene of the recent event. Can I ask you a few questions?"
**Selvi:** "मुझे नहीं पता कि क्या हो रहा है। यह सब बहुत डरावना था।" (I don't know what's happening. It was all very scary.)
**[Hunter looks puzzled, trying to make sense of her words.]**
**Hunter:** "I'm sorry, I don't understand. Do you speak any English?"
**Selvi:** "नहीं, मैं अंग्रेजी नहीं बोलती।" (No, I don't speak English.)
**[Hunter scratches his head, trying to think of a way to communicate.]**
**Hunter:** "Okay, um... pictures? Can I see your camera?"
**Selvi:** "तस्वीरें? हाँ, यहाँ हैं।" (Pictures? Yes, here they are.)
**[Selvi hands over her camera hesitantly, showing him the photos she took. Hunter flips through them, trying to find any clues.]**
**Hunter:** "These are good... but I need to know what you saw. Can you... draw? Or write?"
**Selvi:** "मैं लिख सकती हूँ।" (I can write.)
**[Selvi takes out a notebook and starts sketching something quickly, her hands slightly trembling. She shows it to Hunter, who studies it intently.]**
**Hunter:** "This is helpful. Thank you. I wish I could understand more."
**[Just then, Hunter's phone rings. He answers it, his expression turning serious.]**
**Hunter:** "I have to go. Thank you for your help. Stay safe."
**[Hunter nods at Selvi, who gives a small, worried smile. He leaves the cafe, leaving Selvi to her thoughts and her camera.]**
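The "Parse dialogue" evaluator presumably starts by splitting a response like this one into speaker/line pairs. The page does not show how that is done, so the following is only a minimal sketch under assumptions: the regular expression, the `parse_voice_sheet` name, and the output shape are all hypothetical, chosen to match the bolded `**Name:** "..."` layout above and to skip bracketed stage directions and parenthesised translations.

```python
import re

# Hypothetical pattern: a bolded speaker label followed by a double-quoted line.
# The name class excludes '[' so that stage directions such as **[Scene: ...]**
# are never treated as speakers.
LINE_RE = re.compile(r'\*\*(?P<name>[^*\[\]:]+):\*\*\s*"(?P<dialogue>[^"]*)"')


def parse_voice_sheet(text):
    """Return a list of {"name", "dialogue"} dicts in order of appearance."""
    return [
        {"name": m.group("name").strip(), "dialogue": m.group("dialogue")}
        for m in LINE_RE.finditer(text)
    ]


sample = (
    '**Voice Sheet:** **[Scene: A quaint cafe in Alborn, MN.]** '
    '**Hunter:** "I\'m sorry, I don\'t understand. Do you speak any English?" '
    '**Selvi:** "नहीं, मैं अंग्रेजी नहीं बोलती।" (No, I don\'t speak English.) '
    '**[Hunter scratches his head.]**'
)

for line in parse_voice_sheet(sample):
    print(line["name"], "->", line["dialogue"])
```

Only the quoted text is captured, so the English gloss in parentheses after each Hindi line is left out of the parsed dialogue, which is consistent with the `dialogue` values listed in the evaluation details.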
id
"gen-tTsH3V5MJAFpRYm8upITrw9kcuIT"
total_cost
0.0051725
upstream_id
"chatcmpl-9uNWuGrLdXXaiBHK3QCAYcbTSbFth"
created_at
"2024-08-09T17:17:28.989366+00:00"
model
"openai/gpt-4o-2024-08-06"
app_id
182717
streamed
false
cancelled
false
provider_name
"OpenAI"
latency
4841
moderation_latency
166
generation_time
0
finish_reason
"stop"
tokens_prompt
273
tokens_completion
537
native_tokens_prompt
273
native_tokens_completion
449
num_media_prompt
(null)
num_media_completion
(null)
origin
"https://ncbench.com/"
usage
0.0051725
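The reported `total_cost` can be cross-checked against the token counts. The sketch below assumes per-token list prices of $2.50 per million prompt tokens and $10.00 per million completion tokens for `gpt-4o-2024-08-06`; those prices are not stated on this page, and the `estimate_cost` helper is hypothetical. It also assumes billing uses the native token counts (273 prompt, 449 completion) rather than the normalized `tokens_completion` value of 537.

```python
from decimal import Decimal

# Assumed list prices in USD per 1M tokens (not taken from this page).
PROMPT_PRICE_PER_M = Decimal("2.50")
COMPLETION_PRICE_PER_M = Decimal("10.00")


def estimate_cost(prompt_tokens, completion_tokens):
    """Estimate the request cost in USD from token counts."""
    million = Decimal(1_000_000)
    return (
        prompt_tokens * PROMPT_PRICE_PER_M
        + completion_tokens * COMPLETION_PRICE_PER_M
    ) / million


# native_tokens_prompt and native_tokens_completion from the metadata above
cost = estimate_cost(273, 449)
print(cost)
```

Under these assumptions the estimate works out to 0.0006825 + 0.00449 = 0.0051725 USD, which agrees with the `total_cost` and `usage` fields and rounds to the $0.005173 shown in the run details.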
Evaluation details
Result:
100.00%
Evaluator:
Parse dialogue
Details:
n/a
Meta Data:
dialogue
0
name
"Hunter"
dialogue
"Excuse me, miss. I'm Detective Hunter. I noticed you were at the scene of the recent event. Can I ask you a few questions?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.8069738480697386
tl
0.6603856266432954
fr
0.61423593827775
it
0.5879851143009038
ca
0.5502031340684852
no
0.548894062863795
pt
0.5356500898741762
nl
0.5265729993891264
da
0.5262836185819071
ro
0.520420792079208
sv
0.4816053511705686
es
0.4805630026809652
sq
0.4297277409860191
cs
0.4070390206579954
sk
0.38880126182965297
hr
0.3704305442729488
sl
0.358974358974359
yo
0.3547044129891757
de
0.3353344768439108
pl
0.3260869565217392
ms
0.26817752596789424
lt
0.2674858223062382
tr
0.2674858223062382
lv
0.24537487828627072
is
0.22654690618762477
hu
0.22344689378757512
et
0.22188755020080322
az
0.2131979695431472
fi
0.15115005476451257
vi
0.14647577092511013
be
0.08284023668639054
eu
0.08284023668639054
uk
0.07517899761336516
reliable
true
passes
true
1
name
"Selvi"
dialogue
"मुझे नहीं पता कि क्या हो रहा है। यह सब बहुत डरावना था।"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.8806558738065587
mr
0.8748094927062922
reliable
true
passes
true
2
name
"Hunter"
dialogue
"I'm sorry, I don't understand. Do you speak any English?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.8033431661750245
no
0.5670995670995671
da
0.5614035087719298
tl
0.5139732685297691
sv
0.51278928136419
cs
0.5
pl
0.464524765729585
hr
0.4528043775649795
sk
0.4227994227994228
ca
0.4038748137108793
de
0.39849624060150374
pt
0.3902439024390244
sq
0.38556067588325654
ro
0.3710691823899371
it
0.34959349593495936
nl
0.3079584775086505
vi
0.30313588850174217
yo
0.2920353982300885
is
0.25512104283054005
sl
0.23076923076923078
ms
0.15254237288135594
be
0.14893617021276595
tr
0.14893617021276595
hu
0.14346895074946467
az
0.13606911447084233
et
0.13606911447084233
lt
0.13606911447084233
lv
0.13606911447084233
uk
0.13606911447084233
reliable
true
passes
true
3
name
"Selvi"
dialogue
"नहीं, मैं अंग्रेजी नहीं बोलती।"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.88558352402746
mr
0.8801534036433365
reliable
true
passes
true
4
name
"Hunter"
dialogue
"Okay, um... pictures? Can I see your camera?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.7513562386980108
ca
0.6391076115485563
tl
0.6367239101717306
ro
0.5925925925925926
et
0.5462046204620462
sv
0.5149911816578483
de
0.5009074410163339
it
0.49817518248175185
fr
0.49355432780847147
is
0.4433198380566802
no
0.4410569105691057
da
0.434156378600823
pt
0.358974358974359
ms
0.3529411764705882
nl
0.34988179669030733
tr
0.3483412322274882
lv
0.34679334916864607
az
0.3373493975903614
hr
0.3357487922705314
sq
0.3276283618581907
pl
0.314214463840399
cs
0.2763157894736842
es
0.2647058823529412
yo
0.2403314917127072
sk
0.207492795389049
be
0.20289855072463767
hu
0.19590643274853803
lt
0.1863905325443787
uk
0.1863905325443787
vi
0.1863905325443787
reliable
true
passes
true
5
name
"Selvi"
dialogue
"तस्वीरें? हाँ, यहाँ हैं।"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.8841059602649006
mr
0.8819163292847504
reliable
true
passes
true
6
name
"Hunter"
dialogue
"These are good... but I need to know what you saw. Can you... draw? Or write?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.8340491995314331
tl
0.6001881467544685
ro
0.43408788282290284
cs
0.42489851150202973
et
0.4089012517385257
it
0.40308988764044945
sk
0.4022503516174402
pl
0.380466472303207
hr
0.375
no
0.37407952871870404
da
0.36377245508982037
lt
0.35993975903614456
fr
0.358974358974359
sq
0.3550834597875569
yo
0.34514637904468415
ca
0.3359375
sl
0.3122977346278317
lv
0.30327868852459017
tr
0.256993006993007
az
0.247787610619469
sv
0.24511545293072823
ms
0.1904761904761905
be
0.1414141414141414
hu
0.1361788617886179
is
0.1290983606557377
uk
0.1290983606557377
vi
0.1290983606557377
reliable
true
passes
true
7
name
"Selvi"
dialogue
"मैं लिख सकती हूँ।"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.8891001267427123
mr
0.8865110246433203
reliable
true
passes
true
8
name
"Hunter"
dialogue
"This is helpful. Thank you. I wish I could understand more."
detectedLang
"en"
heavyLang
"en"
scores
en
0.8194511314395763
no
0.5410036719706242
da
0.5398773006134969
nl
0.53125
sv
0.4820441988950276
tl
0.4627507163323783
cs
0.41588785046728977
is
0.37913907284768217
hu
0.37080536912751677
hr
0.36655405405405406
vi
0.3567753001715266
yo
0.3556701030927836
pl
0.35233160621761656
de
0.3397887323943662
ro
0.3042671614100185
sk
0.30037313432835816
ca
0.2690058479532163
sq
0.2632612966601179
it
0.23469387755102045
fr
0.2299794661190965
fi
0.20886075949367086
sl
0.19006479481641467
be
0.15730337078651685
tr
0.15730337078651685
az
0.14383561643835618
et
0.14383561643835618
lt
0.14383561643835618
lv
0.14383561643835618
uk
0.14383561643835618
reliable
true
passes
true
9
name
"Hunter"
dialogue
"I have to go. Thank you for your help. Stay safe."
detectedLang
"en"
heavyLang
"en"
scores
en
0.8328690807799444
da
0.6466431095406361
tl
0.6216897856242118
no
0.5786516853932584
pl
0.5515695067264574
cs
0.511400651465798
yo
0.5008319467554077
hr
0.4594594594594595
vi
0.4094488188976378
sk
0.40476190476190477
lv
0.3814432989690722
lt
0.3576017130620985
ca
0.3150684931506849
sv
0.3150684931506849
sq
0.3087557603686636
sl
0.28741092636579574
it
0.27710843373493976
ro
0.23076923076923078
pt
0.2227979274611399
ms
0.20634920634920634
be
0.1891891891891892
tr
0.1891891891891892
hu
0.18256130790190736
az
0.17355371900826447
et
0.17355371900826447
is
0.17355371900826447
uk
0.17355371900826447
reliable
true
passes
true
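All ten parsed lines pass, which matches the 100.00% result. How the evaluator combines per-line results is not documented here; one plausible reading is that the score is the fraction of lines whose detected language matches the speaker's expected language. The sketch below illustrates that idea with a deliberately crude stand-in for language detection: it only compares Devanagari code points (U+0900 to U+097F) against ASCII letters. The `EXPECTED` mapping and both function names are hypothetical, and a script check like this cannot separate Hindi from Marathi, which share Devanagari and score within about 0.01 of each other in the metadata above.

```python
# Hypothetical mapping from speaker to the script their lines should use.
EXPECTED = {"Selvi": "devanagari", "Hunter": "latin"}


def dominant_script(text):
    """Return 'devanagari', 'latin', or None based on character counts."""
    deva = sum(1 for ch in text if "\u0900" <= ch <= "\u097f")
    latin = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    if deva == 0 and latin == 0:
        return None
    return "devanagari" if deva > latin else "latin"


def score(lines):
    """Fraction of lines whose dominant script matches the speaker's expectation."""
    if not lines:
        return 0.0
    passed = sum(
        1 for line in lines
        if dominant_script(line["dialogue"]) == EXPECTED.get(line["name"])
    )
    return passed / len(lines)


lines = [
    {"name": "Hunter", "dialogue": "I have to go. Thank you for your help. Stay safe."},
    {"name": "Selvi", "dialogue": "मैं लिख सकती हूँ।"},
]
print(f"{score(lines):.2%}")
```

A real evaluator would more likely rely on a statistical language identifier, as the per-language `scores` and `reliable` fields suggest; the script heuristic is used here only to keep the example free of third-party dependencies.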