NC Bench
Run: 2024-08-09-01J4W2C7F67GEJVWRV56VVXMNQ
Run Details
Model:
GPT-4o, Aug. 6th (temp=0)
Test:
Language Writing
Scenario:
Character dialogue (Hindi) in a story
Duration:
6.3s
Cost:
$0.005792
Provider:
OpenAI
Evaluation summary
Score:
100.00%
Evaluator:
Parse dialogue
Model input
System
You are an expert novel author in the cozy mystery genre. Take the following details about characters/locations/... into account:
<character name="Selvi">
Selvi is a 30-year old college student on a study abroad program. Born in India, she is fluent in Hindi and does not yet know any English. She enjoys landscape photography and is always seen with her camera.
</character>
<character name="Hunter">
Hunter is a 35-year old detective who is investigating a series of mysterious disappearances in the small town of Alborn, MN. He is a former marine and has a no-nonsense attitude. Hunter has no knowledge of Hindi, not even basic phrases.
</character>
<location name="Alborn, MN">
Alborne is a small town in the Midwest with a population of 3,000. It is known for its annual apple festival.
</location>
User/Human
Write the following scene: Selvi is sitting at a cafe, looking at her camera. Hunter approaches her and asks her about a recent event in the town. She responds troubled and shaken in Hindi, and Hunter looks confused. They talk for a while, until Hunter gets a call and has to leave. Return the dialogue between Selvi and Hunter as a voice sheet: Selvi: "..." Hunter: "..."
Model response
Messages
Meta
AI
**Voice Sheet:**
**[Scene: A quaint cafe in Alborn, MN. The aroma of freshly brewed coffee fills the air. Selvi sits at a small table by the window, her camera resting in her hands. She appears deep in thought, her eyes scanning through the photos she has taken. Hunter, the town's detective, approaches her table with a determined stride.]**
**Hunter:** "Excuse me, miss. I'm Detective Hunter. I noticed you were at the scene of the recent event. Can I ask you a few questions?"
**Selvi:** "मुझे नहीं पता कि क्या हो रहा है। यह सब बहुत डरावना था।" (I don't know what's happening. It was all very scary.)
**[Hunter furrows his brow, clearly not understanding the language.]**
**Hunter:** "I'm sorry, I don't understand. Do you speak English?"
**Selvi:** "अंग्रेजी नहीं।" (No English.)
**[Hunter sighs, trying to think of a way to communicate.]**
**Hunter:** "Okay, um... event? You were there?" *[He gestures with his hands, trying to mimic the scene of the event.]*
**Selvi:** *[Nods, looking more anxious.]* "हाँ, मैं वहाँ थी। बहुत अजीब था।" (Yes, I was there. It was very strange.)
**[Hunter scratches his head, clearly struggling to piece together her words.]**
**Hunter:** "Strange? Okay, strange. Can you show me?" *[He points to her camera, hoping she might have captured something useful.]*
**Selvi:** *[Understanding his gesture, she nods and turns the camera towards him, flipping through the photos.]* "यहाँ, देखो।" (Here, look.)
**[Hunter leans in, examining the images on the camera's screen. His expression shifts from confusion to intrigue.]**
**Hunter:** "These are... interesting. Thank you. I think this might help."
**[Selvi offers a small, tentative smile, sensing his appreciation despite the language barrier.]**
**[Suddenly, Hunter's phone rings. He glances at the screen, his expression turning serious.]**
**Hunter:** "I have to take this. Thank you for your help. I'll be in touch."
**[Selvi nods, watching as Hunter steps away to answer the call, leaving her with a mix of relief and lingering unease.]**
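The "Parse dialogue" evaluator evidently turns a response like the one above into a list of speaker/line pairs. The sketch below shows one way such a step could work; it is not the benchmark's actual parser. The function name `parse_voice_sheet` and the regular expression are hypothetical, and the pattern assumes the markdown layout seen in this particular response: a bolded speaker label, an optional italic stage direction, then the first double-quoted span.

```python
import re

# Hypothetical pattern (an assumption, not taken from NC Bench):
#   **Name:**  optional *[stage direction]*  "spoken line"
LINE_RE = re.compile(r'\*\*(\w+):\*\*\s*(?:\*\[.*?\]\*\s*)?"(.*?)"')


def parse_voice_sheet(text: str) -> list[dict]:
    """Return one {"name", "dialogue"} entry per quoted line, in order."""
    return [{"name": name, "dialogue": line} for name, line in LINE_RE.findall(text)]


if __name__ == "__main__":
    sample = (
        '**[Hunter sighs.]** '
        '**Hunter:** "Do you speak English?" '
        '**Selvi:** *[Nods.]* "हाँ, मैं वहाँ थी।" (Yes, I was there.)'
    )
    for entry in parse_voice_sheet(sample):
        print(entry)
```

Bracketed scene descriptions and the parenthesised English glosses are skipped because neither follows a `**Name:**` label directly with a quoted span.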
id
"gen-fYgNGDYHh2Z6Lh25SY7AqC3RnIEW"
total_cost
0.0057925
upstream_id
"chatcmpl-9uNXwnKwJUkwoZiCjTgf4lOQcd2Pf"
created_at
"2024-08-09T17:18:34.247493+00:00"
model
"openai/gpt-4o-2024-08-06"
app_id
182717
streamed
false
cancelled
false
provider_name
"OpenAI"
latency
5795
moderation_latency
304
generation_time
0
finish_reason
"stop"
tokens_prompt
273
tokens_completion
583
native_tokens_prompt
273
native_tokens_completion
511
num_media_prompt
(null)
num_media_completion
(null)
origin
"https://ncbench.com/"
usage
0.0057925
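The reported cost can be cross-checked against the token counts. The sketch below assumes per-token prices of $2.50 per million prompt tokens and $10.00 per million completion tokens for this model snapshot; those prices are not stated on this page and are an assumption. Under that assumption, the native token counts (273 prompt, 511 completion) reproduce the reported `total_cost`, which suggests that billing used the native counts rather than `tokens_completion` (583).

```python
# Assumed prices in USD per million tokens (not given in the run metadata).
PROMPT_USD_PER_M = 2.50
COMPLETION_USD_PER_M = 10.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the request cost in USD from token counts."""
    return (
        prompt_tokens * PROMPT_USD_PER_M
        + completion_tokens * COMPLETION_USD_PER_M
    ) / 1_000_000


if __name__ == "__main__":
    # native_tokens_prompt and native_tokens_completion from the metadata above
    print(estimate_cost(273, 511))
```

Worked by hand: 273 × 2.50 = 682.5 and 511 × 10.00 = 5110, so the total is 5792.5 / 1,000,000 = 0.0057925 USD.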
Evaluation details
Result:
100.00%
Evaluator:
Parse dialogue
Details:
n/a
Meta Data:
dialogue
0
name
"Hunter"
dialogue
"Excuse me, miss. I'm Detective Hunter. I noticed you were at the scene of the recent event. Can I ask you a few questions?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.8069738480697386
tl
0.6603856266432954
fr
0.61423593827775
it
0.5879851143009038
ca
0.5502031340684852
no
0.548894062863795
pt
0.5356500898741762
nl
0.5265729993891264
da
0.5262836185819071
ro
0.520420792079208
sv
0.4816053511705686
es
0.4805630026809652
sq
0.4297277409860191
cs
0.4070390206579954
sk
0.38880126182965297
hr
0.3704305442729488
sl
0.358974358974359
yo
0.3547044129891757
de
0.3353344768439108
pl
0.3260869565217392
ms
0.26817752596789424
lt
0.2674858223062382
tr
0.2674858223062382
lv
0.24537487828627072
is
0.22654690618762477
hu
0.22344689378757512
et
0.22188755020080322
az
0.2131979695431472
fi
0.15115005476451257
vi
0.14647577092511013
be
0.08284023668639054
eu
0.08284023668639054
uk
0.07517899761336516
reliable
true
passes
true
1
name
"Selvi"
dialogue
"मुझे नहीं पता कि क्या हो रहा है। यह सब बहुत डरावना था।"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.8806558738065587
mr
0.8748094927062922
reliable
true
passes
true
2
name
"Hunter"
dialogue
"I'm sorry, I don't understand. Do you speak English?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.7994652406417112
no
0.5828698553948832
da
0.5772266065388951
tl
0.5300751879699248
sv
0.5288944723618091
cs
0.5161290322580645
pl
0.4806094182825485
hr
0.4688385269121813
sk
0.438622754491018
de
0.4140625
pt
0.40570522979397783
sq
0.4009584664536741
ro
0.3862520458265139
it
0.3644067796610169
nl
0.3218806509945751
vi
0.3169398907103825
yo
0.3055555555555556
ca
0.2690058479532163
is
0.267578125
sl
0.24242424242424243
ms
0.1610738255033557
be
0.15730337078651685
tr
0.15730337078651685
hu
0.1515837104072398
az
0.14383561643835618
et
0.14383561643835618
lt
0.14383561643835618
lv
0.14383561643835618
uk
0.14383561643835618
reliable
true
passes
true
3
name
"Selvi"
dialogue
"अंग्रेजी नहीं।"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.8860182370820668
mr
0.8804780876494024
reliable
true
passes
true
4
name
"Hunter"
dialogue
"Okay, um... event? You were there?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.7975708502024291
de
0.6545768566493955
tl
0.62756052141527
pt
0.5495495495495496
is
0.43820224719101125
fr
0.3630573248407643
nl
0.30313588850174217
ro
0.30313588850174217
it
0.2907801418439716
cs
0.2805755395683453
sq
0.2805755395683453
ca
0.2779783393501805
es
0.2779783393501805
da
0.25650557620817843
no
0.25650557620817843
sv
0.25650557620817843
reliable
true
passes
true
5
name
"Selvi"
dialogue
"हाँ, मैं वहाँ थी। बहुत अजीब था।"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.8824283882000855
mr
0.8792270531400966
reliable
true
passes
true
6
name
"Hunter"
dialogue
"Strange? Okay, strange. Can you show me?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.7809762202753442
tl
0.7048903878583473
nl
0.651394422310757
no
0.6162280701754387
fr
0.5793269230769231
sq
0.5418848167539267
da
0.5283018867924528
sl
0.5014245014245015
de
0.4942196531791907
pt
0.4852941176470589
it
0.48224852071005914
cs
0.4791666666666667
et
0.46808510638297873
ms
0.45652173913043476
az
0.4372990353697749
tr
0.4372990353697749
es
0.36823104693140796
hr
0.3371212121212121
sk
0.3346007604562738
pl
0.308300395256917
ca
0.3055555555555556
fi
0.3
ro
0.3
sv
0.29149797570850206
lv
0.2857142857142857
is
0.2768595041322314
lt
0.2768595041322314
reliable
true
passes
true
7
name
"Selvi"
dialogue
"यहाँ, देखो।"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.8845265588914549
mr
0.8729351969504447
reliable
true
passes
true
8
name
"Hunter"
dialogue
"These are... interesting. Thank you. I think this might help."
detectedLang
"en"
heavyLang
"en"
scores
en
0.8337895971842002
tl
0.6181491464510332
ro
0.5595854922279793
it
0.5478723404255319
no
0.546908315565032
sq
0.4713930348258706
da
0.46338383838383834
sv
0.4606598984771573
ca
0.45928753180661575
hr
0.4530244530244531
nl
0.44661458333333337
de
0.42333785617367703
pt
0.41460055096418735
lt
0.3963068181818182
pl
0.38936781609195403
vi
0.38936781609195403
cs
0.3795620437956204
es
0.35213414634146345
lv
0.34715821812596004
is
0.3338557993730407
hu
0.32
fr
0.2784380305602716
yo
0.2747440273037543
ms
0.27101200686106347
eu
0.26343154246100525
sk
0.256993006993007
tr
0.2504409171075838
sl
0.24645390070921988
et
0.23835125448028674
az
0.23423423423423423
fi
0.15
be
0.1414141414141414
uk
0.1290983606557377
reliable
true
passes
true
9
name
"Hunter"
dialogue
"I have to take this. Thank you for your help. I'll be in touch."
detectedLang
"en"
heavyLang
"en"
scores
en
0.83150800336984
tl
0.6320147194112236
da
0.5785036880927292
yo
0.51338199513382
no
0.5073891625615764
sl
0.46879150066401065
pl
0.46308724832214765
cs
0.43977591036414565
lt
0.4108983799705449
it
0.3957703927492447
hr
0.3893129770992366
sv
0.3798449612403101
ro
0.3670886075949367
vi
0.34210526315789475
sk
0.33774834437086093
lv
0.3162393162393162
hu
0.3150684931506849
nl
0.2805755395683453
ms
0.2766726943942134
az
0.2687385740402194
de
0.2687385740402194
ca
0.25650557620817843
sq
0.250936329588015
fr
0.21875
pt
0.17695473251028807
be
0.14893617021276595
tr
0.14893617021276595
et
0.13606911447084233
is
0.13606911447084233
uk
0.13606911447084233
reliable
true
passes
true
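All ten entries above pass, which matches the 100.00% result. The page does not say how `passes` is decided, so the sketch below is only a guess at the rule: it assumes a line passes when its `detectedLang` equals the language expected for that speaker (English for Hunter, Hindi for Selvi) and that the score is the share of passing lines. The names `EXPECTED_LANG` and `score_dialogue` are hypothetical.

```python
# Assumed expectation per speaker, inferred from the scenario description.
EXPECTED_LANG = {"Hunter": "en", "Selvi": "hi"}


def score_dialogue(entries: list[dict], expected: dict = EXPECTED_LANG) -> float:
    """Return the percentage of lines whose detected language matches the speaker."""
    if not entries:
        return 0.0
    passed = sum(
        1 for e in entries if e["detectedLang"] == expected.get(e["name"])
    )
    return 100.0 * passed / len(entries)


# (name, detectedLang) pairs for entries 0 to 9 as listed in the meta data.
RUN_ENTRIES = [
    {"name": "Hunter", "detectedLang": "en"},
    {"name": "Selvi", "detectedLang": "hi"},
    {"name": "Hunter", "detectedLang": "en"},
    {"name": "Selvi", "detectedLang": "hi"},
    {"name": "Hunter", "detectedLang": "en"},
    {"name": "Selvi", "detectedLang": "hi"},
    {"name": "Hunter", "detectedLang": "en"},
    {"name": "Selvi", "detectedLang": "hi"},
    {"name": "Hunter", "detectedLang": "en"},
    {"name": "Hunter", "detectedLang": "en"},
]

if __name__ == "__main__":
    print(f"{score_dialogue(RUN_ENTRIES):.2f}%")
```

Note that the Hindi lines score only slightly higher for `hi` than for `mr` (for example 0.8807 against 0.8748), so a rule based on the top-ranked language alone has little margin on short Devanagari text.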