NC Bench
Run: 2024-08-09-01J4W2AMWHNFPEN8QQXKH8V87C
Run Details
Model:
GPT-4o, Aug. 6th (temp=0)
Test:
Language Writing
Scenario:
Character dialogue (Hindi) in a story
Duration:
8.4s
Cost:
$0.005973
Provider:
OpenAI
Evaluation summary
Score
Evaluator
Details Link
100.00%
Parse dialogue
(details)
Model input
System
You are an expert novel author in the cozy mystery genre. Take the following details about characters/locations/... into account: <character name="Selvi"> Selvi is a 30-year old college student on a study abroad program. Born in India, she is fluent in Hindi and does not yet know any English. She enjoys landscape photography and is always seen with her camera. </character> <character name="Hunter"> Hunter is a 35-year old detective who is investigating a series of mysterious disappearances in the small town of Alborn, MN. He is a former marine and has a no-nonsense attitude. Hunter has no knowledge of Hindi, not even basic phrases. </character> <location name="Alborn, MN"> Alborne is a small town in the Midwest with a population of 3,000. It is known for its annual apple festival. </location>
User/Human
Write the following scene: Selvi is sitting at a cafe, looking at her camera. Hunter approaches her and asks her about a recent event in the town. She responds troubled and shaken in Hindi, and Hunter looks confused. They talk for a while, until Hunter gets a call and has to leave. Return the dialogue between Selvi and Hunter as a voice sheet: Selvi: "..." Hunter: "..."
Model response
AI
**Voice Sheet:**
**[Scene: A quaint cafe in Alborn, MN. The aroma of freshly brewed coffee fills the air. Selvi is seated at a small table by the window, intently reviewing photos on her camera. Hunter, wearing a worn leather jacket, approaches her table with a determined stride.]**
**Hunter:** "Excuse me, miss. I'm Detective Hunter. I noticed you were at the scene of the recent event. Mind if I ask you a few questions?"
**Selvi:** "मुझे नहीं पता कि क्या हो रहा है। यह सब बहुत डरावना था।" (I don't know what's happening. It was all very frightening.)
**[Hunter furrows his brow, clearly not understanding the language.]**
**Hunter:** "I'm sorry, I don't understand. Do you speak English?"
**Selvi:** "मैं अंग्रेजी नहीं बोलती।" (I don't speak English.)
**[Hunter scratches his head, trying to think of a way to communicate.]**
**Hunter:** "Okay, um... event? You were there, right? Camera?"
**[Selvi nods, understanding the word 'camera,' and gestures to the photos on her camera screen.]**
**Selvi:** "हाँ, तस्वीरें।" (Yes, pictures.)
**[Hunter leans in, looking at the images, trying to piece together the story from the visuals.]**
**Hunter:** "These... these are from the event? Can you show me?"
**[Selvi scrolls through the photos, pointing to a few that seem significant. Her hands tremble slightly as she does so.]**
**Selvi:** "यहाँ, यहाँ।" (Here, here.)
**[Hunter studies the images, nodding slowly, though still puzzled by the language barrier.]**
**Hunter:** "Thank you. This helps. I wish I could understand more."
**[Selvi offers a small, sympathetic smile, sensing his frustration.]**
**Selvi:** "मुझे खेद है।" (I'm sorry.)
**[Just then, Hunter's phone rings. He glances at the screen, recognizing the number, and answers it.]**
**Hunter:** "Hunter here. Yeah, I'll be right there."
**[He looks back at Selvi, giving her a nod of appreciation.]**
**Hunter:** "Thank you for your time. I'll see if I can find someone to help translate. Stay safe."
**[Selvi nods, watching as Hunter hurries out of the cafe, leaving her to ponder the mysterious events captured in her photos.]**
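The "Parse dialogue" evaluator's implementation is not shown on this page. As a rough illustration only, the sketch below shows one way the `**Name:** "..."` lines of a voice sheet like the one above could be turned into name/dialogue pairs. The names `LINE_RE` and `parse_voice_sheet` are hypothetical, and the pattern assumes that speaker labels are bold and that dialogue sits in straight double quotes.

```python
import re

# Assumed format: **Name:** "dialogue". Stage directions such as **[...]**
# and headings such as **Voice Sheet:** are skipped because they are not
# followed by a double-quoted string.
LINE_RE = re.compile(r'\*\*(?P<name>[^*\[\]:]+):\*\*\s*"(?P<dialogue>[^"]*)"')


def parse_voice_sheet(text: str) -> list[dict[str, str]]:
    """Return speaker/dialogue pairs in the order they appear in `text`."""
    return [
        {"name": m.group("name").strip(), "dialogue": m.group("dialogue")}
        for m in LINE_RE.finditer(text)
    ]
```

Parenthesised translations after the closing quote are ignored by this pattern, which is consistent with the `dialogue` values listed under "Evaluation details" containing only the quoted text.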
id
"gen-u52mJAgRrTUX22jtot95sNMORH6I"
total_cost
0.0059725
upstream_id
"chatcmpl-9uNX6qtNjgbt3jT2WfuwrsPmfIdQm"
created_at
"2024-08-09T17:17:44.513575+00:00"
model
"openai/gpt-4o-2024-08-06"
app_id
182717
streamed
false
cancelled
false
provider_name
"OpenAI"
latency
7759
moderation_latency
458
generation_time
0
finish_reason
"stop"
tokens_prompt
273
tokens_completion
613
native_tokens_prompt
273
native_tokens_completion
529
num_media_prompt
(null)
num_media_completion
(null)
origin
"https://ncbench.com/"
usage
0.0059725
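The record does not say how `total_cost` is derived. The sketch below is a worked check under an assumption: per-token prices of $2.50 per million prompt tokens and $10.00 per million completion tokens (these prices are not stated anywhere in this run record), applied to the native token counts above. The function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# ASSUMED prices in USD per token; they do not appear in the run record.
PROMPT_PRICE = Decimal("2.50") / Decimal(1_000_000)
COMPLETION_PRICE = Decimal("10.00") / Decimal(1_000_000)


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Cost in USD as prompt and completion tokens times their assumed prices."""
    return prompt_tokens * PROMPT_PRICE + completion_tokens * COMPLETION_PRICE


# native_tokens_prompt = 273, native_tokens_completion = 529
cost = estimate_cost(273, 529)

# Round half up to six decimals, the precision shown in "Run Details".
displayed = cost.quantize(Decimal("0.000001"), rounding=ROUND_HALF_UP)
```

Under these assumed prices, the native counts reproduce the recorded `total_cost` of 0.0059725 and the displayed cost of $0.005973. The non-native `tokens_completion` value of 613 would not.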
Evaluation details
Result
Evaluator
Details
Meta Data
100.00%
Parse dialogue
n/a
dialogue
0
name
"Hunter"
dialogue
"Excuse me, miss. I'm Detective Hunter. I noticed you were at the scene of the recent event. Mind if I ask you a few questions?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.8065764023210832
tl
0.6532293021239706
fr
0.6066863323500492
it
0.5802728226652676
no
0.5410212277682157
pt
0.5277449822904369
ca
0.521244763614602
nl
0.5186522262334536
da
0.518362432269717
ro
0.5124923826934796
sv
0.5061728395061729
es
0.47264337508239945
sq
0.42196531791907516
cs
0.3993993993993994
sk
0.38128383604021654
hr
0.3630573248407643
sl
0.35170178282009723
yo
0.3474714518760196
de
0.32829554995801846
pl
0.3191489361702128
hu
0.27992799279927993
et
0.27862939585211904
ms
0.26199261992619927
lt
0.2613111726685134
lv
0.23954372623574144
is
0.22103213242453748
tr
0.2047713717693837
fi
0.14712153518123666
az
0.14255091103965703
vi
0.14255091103965703
be
0.08045977011494253
eu
0.08045977011494253
uk
0.07300115874855156
reliable
true
passes
true
1
name
"Selvi"
dialogue
"मुझे नहीं पता कि क्या हो रहा है। यह सब बहुत डरावना था।"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.8806558738065587
mr
0.8748094927062922
reliable
true
passes
true
2
name
"Hunter"
dialogue
"I'm sorry, I don't understand. Do you speak English?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.7994652406417112
no
0.5828698553948832
da
0.5772266065388951
tl
0.5300751879699248
sv
0.5288944723618091
cs
0.5161290322580645
pl
0.4806094182825485
hr
0.4688385269121813
sk
0.438622754491018
de
0.4140625
pt
0.40570522979397783
sq
0.4009584664536741
ro
0.3862520458265139
it
0.3644067796610169
nl
0.3218806509945751
vi
0.3169398907103825
yo
0.3055555555555556
ca
0.2690058479532163
is
0.267578125
sl
0.24242424242424243
ms
0.1610738255033557
be
0.15730337078651685
tr
0.15730337078651685
hu
0.1515837104072398
az
0.14383561643835618
et
0.14383561643835618
lt
0.14383561643835618
lv
0.14383561643835618
uk
0.14383561643835618
reliable
true
passes
true
3
name
"Selvi"
dialogue
"मैं अंग्रेजी नहीं बोलती।"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.88558352402746
mr
0.8801534036433365
reliable
true
passes
true
4
name
"Hunter"
dialogue
"Okay, um... event? You were there, right? Camera?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.7890295358649789
tl
0.6703296703296703
de
0.6363636363636364
fr
0.46808510638297873
ro
0.4495412844036697
pt
0.4485294117647059
it
0.4444444444444444
ca
0.3736951983298539
es
0.3697478991596639
nl
0.35205183585313177
is
0.34210526315789475
sv
0.3348115299334811
da
0.3227990970654628
no
0.3227990970654628
cs
0.20634920634920634
sq
0.20634920634920634
yo
0.19786096256684493
ms
0.1935483870967742
reliable
true
passes
true
5
name
"Selvi"
dialogue
"हाँ, तस्वीरें।"
detectedLang
"mr"
heavyLang
"hi"
scores
mr
0.890061565523307
hi
0.8861566484517304
reliable
true
passes
true
6
name
"Hunter"
dialogue
"These... these are from the event? Can you show me?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.842587292501431
tl
0.6617466174661747
nl
0.5469522240527183
sq
0.5424292845257903
fr
0.507168458781362
it
0.4878957169459963
no
0.4811320754716981
pt
0.4811320754716981
ro
0.4607843137254902
es
0.394273127753304
ms
0.3707093821510297
ca
0.358974358974359
da
0.3483412322274882
sv
0.3483412322274882
yo
0.2801047120418848
et
0.25675675675675674
sl
0.24242424242424243
cs
0.22096317280453256
az
0.21875
tr
0.21875
fi
0.21428571428571427
de
0.2005813953488372
reliable
true
passes
true
7
name
"Selvi"
dialogue
"यहाँ, यहाँ।"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.8786407766990292
mr
0.855072463768116
reliable
false
passes
true
8
name
"Hunter"
dialogue
"Thank you. This helps. I wish I could understand more."
detectedLang
"en"
heavyLang
"en"
scores
en
0.8248248248248248
no
0.5580808080808082
da
0.5569620253164557
sv
0.4992846924177396
tl
0.47994056463595847
cs
0.43273905996758516
nl
0.4318181818181818
hr
0.38271604938271603
vi
0.37275985663082434
pl
0.36823104693140796
de
0.3554327808471455
ro
0.31906614785992216
sk
0.3150684931506849
ca
0.2827868852459016
is
0.2813141683778234
sq
0.2768595041322314
it
0.24731182795698922
fr
0.24242424242424243
fi
0.22048997772828507
sl
0.2009132420091324
yo
0.19908466819221968
be
0.16666666666666666
tr
0.16666666666666666
hu
0.1606714628297362
az
0.15254237288135594
et
0.15254237288135594
lt
0.15254237288135594
lv
0.15254237288135594
uk
0.15254237288135594
reliable
true
passes
true
9
name
"Selvi"
dialogue
"मुझे खेद है।"
detectedLang
"mr"
heavyLang
"hi"
scores
mr
0.8890860692102928
hi
0.88898756660746
reliable
true
passes
true
10
name
"Hunter"
dialogue
"Hunter here. Yeah, I'll be right there."
detectedLang
"en"
heavyLang
"en"
scores
en
0.8003194888178914
de
0.576271186440678
tl
0.5173745173745175
sq
0.48665297741273095
yo
0.423963133640553
da
0.3842364532019704
no
0.37810945273631835
fr
0.3686868686868687
hu
0.31880108991825606
lt
0.3093922651933701
sv
0.21875
it
0.20127795527156547
nl
0.20127795527156547
pt
0.20127795527156547
ca
0.19093851132686085
ms
0.19093851132686085
ro
0.19093851132686085
sl
0.19093851132686085
tr
0.19093851132686085
reliable
true
passes
true
11
name
"Hunter"
dialogue
"Thank you for your time. I'll see if I can find someone to help translate. Stay safe."
detectedLang
"en"
heavyLang
"en"
scores
en
0.8177083333333334
tl
0.5512820512820513
da
0.5493562231759657
no
0.5434782608695653
hr
0.4760479041916168
cs
0.41536748329621387
pl
0.38668224299065423
pt
0.38452520515826494
sv
0.3613138686131387
vi
0.36053593179049936
et
0.3574051407588739
ca
0.3558282208588957
sk
0.3371212121212121
ro
0.3286445012787723
it
0.3251928020565552
lv
0.32432432432432434
lt
0.308300395256917
yo
0.29906542056074764
tr
0.2857142857142857
sq
0.2818057455540356
az
0.27486187845303867
sl
0.25531914893617025
es
0.22794117647058826
ms
0.20694864048338368
fr
0.13223140495867766
be
0.11764705882352942
de
0.11317567567567569
eu
0.11317567567567569
hu
0.11317567567567569
is
0.10714285714285714
nl
0.10714285714285714
uk
0.10714285714285714
reliable
true
passes
true
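Several Selvi lines above have `detectedLang` "mr" but still pass, with `heavyLang` "hi" and the two scores only a few thousandths apart. The page does not explain how `heavyLang` or `passes` is computed. The sketch below is not the benchmark's rule. It is one hypothetical tolerance-based check that would accept such near-ties, and the function name `passes_expected` and the `margin` value are assumptions.

```python
def passes_expected(scores: dict[str, float], expected: str, margin: float = 0.02) -> bool:
    """Hypothetical rule: accept if `expected` is the top-scoring language
    or scores within `margin` of the top-scoring one."""
    if expected not in scores:
        return False
    top = max(scores.values())
    return top - scores[expected] <= margin


# Scores copied from entry 5 ("हाँ, तस्वीरें।"), where Marathi narrowly outranks Hindi.
entry_5 = {"mr": 0.890061565523307, "hi": 0.8861566484517304}
hindi_ok = passes_expected(entry_5, "hi")
```

A tolerance like this suits short utterances, where closely related languages that share a script are hard to separate. The margin would need tuning against real detector output.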