NC Bench
Run: 2024-08-09-01J4W2DK970CC64QAZN6NMJFKD
Run Details
Model:
GPT-4o, May 13th (temp=1)
Test:
Language Writing
Scenario:
Character dialogue (Hindi) in a story
Duration:
6.0s
Cost:
$0.008550
Provider:
OpenAI
Evaluation summary
Score:
100.00%
Evaluator:
Parse dialogue
Model input
System
You are an expert novel author in the cozy mystery genre. Take the following details about characters/locations/... into account:

<character name="Selvi">
Selvi is a 30-year old college student on a study abroad program. Born in India, she is fluent in Hindi and does not yet know any English. She enjoys landscape photography and is always seen with her camera.
</character>

<character name="Hunter">
Hunter is a 35-year old detective who is investigating a series of mysterious disappearances in the small town of Alborn, MN. He is a former marine and has a no-nonsense attitude. Hunter has no knowledge of Hindi, not even basic phrases.
</character>

<location name="Alborn, MN">
Alborne is a small town in the Midwest with a population of 3,000. It is known for its annual apple festival.
</location>
User/Human
Write the following scene: Selvi is sitting at a cafe, looking at her camera. Hunter approaches her and asks her about a recent event in the town. She responds troubled and shaken in Hindi, and Hunter looks confused. They talk for a while, until Hunter gets a call and has to leave.

Return the dialogue between Selvi and Hunter as a voice sheet:
Selvi: "..."
Hunter: "..."
Model response
AI
**Voice Sheet Dialogue:**

**[Setting: A quaint cafe in Alborn, MN, late afternoon. The cafe is bustling with the scent of freshly brewed coffee and baked goods. Selvi is sitting at a small table by the window, her camera in hand, reviewing the photos she’s taken. She looks up at the sound of footsteps approaching her table.]**

Hunter: "Excuse me, miss. Do you mind if I ask you a few questions?" [Hunter’s voice is firm but not unkind.]

Selvi: "क्या आप मुझसे बात कर रहे हैं?" [Selvi looks up at Hunter, her eyes wide with concern, her voice trembling.]

Hunter: "I'm sorry, I don't understand. Are you alright? Were you perhaps at the town square last night?"

Selvi: "हाँ, मैं वहाँ थी। कुछ बहुत बुरा हुआ था।" [She speaks hurriedly, her hands shaking slightly as she recalls the event.]

Hunter: "I'm not sure what you’re saying, but you seem upset. Were you frightened by something?"

Selvi: "वो लोग... अचानक गायब हो गए। मैंने देखा, लेकिन कुछ समझ नहीं आया।" [Selvi clutches her camera tighter, her voice filled with fear.]

Hunter: "I wish I could understand you. Do you have a photo of what you saw? Maybe that can help."

Selvi: [Nods and nervously fiddles with her camera, turning the screen toward Hunter to show a blurred image of the town square.] "यह तस्वीर... यह स्पष्ट नहीं है, पर यही मैंने देखा।"

Hunter: "I can't make out much from this. But thank you, this might still be helpful." [Hunter’s brow furrows as he examines the photo, clearly confused by their language barrier.]

**[Hunter’s phone rings, breaking the tense silence. He glances at the screen, recognizing the number immediately.]**

Hunter: "I’m sorry, I have to take this. Thank you for your time."

Selvi: [Nodding, still shaken, she watches as Hunter steps away to answer his call.] "मैंने कोशिश की..."

**[Hunter walks away, placing the phone to his ear, leaving Selvi to stare down at her camera, worried yet hopeful that she might have provided some clue.]**

**[End scene.]**
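The "Parse dialogue" evaluator appears to split a response like the one above into (name, dialogue) pairs. The page does not show how it does so; the following is only a minimal sketch, assuming single-word speaker names, straight double quotes around each spoken line, and optional bracketed stage directions between the name and the quote. `LINE_RE` and `parse_voice_sheet` are hypothetical names, not part of NC Bench.

```python
import re

# Hypothetical pattern: a capitalized single-word speaker name, a colon,
# an optional [stage direction], then one straight-quoted line of dialogue.
LINE_RE = re.compile(
    r'\b(?P<name>[A-Z]\w*):\s*(?:\[[^\]]*\]\s*)?"(?P<dialogue>[^"]*)"'
)

def parse_voice_sheet(text):
    """Return a list of {"name", "dialogue"} dicts in order of appearance."""
    return [
        {"name": m["name"], "dialogue": m["dialogue"]}
        for m in LINE_RE.finditer(text)
    ]

# Short excerpt taken from the response above.
excerpt = (
    '**Voice Sheet Dialogue:** **[Setting: A quaint cafe in Alborn, MN.]** '
    'Hunter: "Excuse me, miss. Do you mind if I ask you a few questions?" '
    '[Hunter’s voice is firm but not unkind.] '
    'Selvi: [Nodding, still shaken.] "मैंने कोशिश की..."'
)
parsed = parse_voice_sheet(excerpt)
for line in parsed:
    print(line["name"], "->", line["dialogue"])
```

Headers such as "Voice Sheet Dialogue:" and "[Setting: ...]" are skipped because no quoted text follows their colon directly. A response that used curly quotes around the dialogue would need a broader pattern.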
id
"gen-4xPUFOvg4g4iBCGvZhtzQIK1SsS2"
total_cost
0.00855
upstream_id
"chatcmpl-9uNYfkPGiFWCzazTWAZdpKwj7wpJt"
created_at
"2024-08-09T17:19:18.791952+00:00"
model
"openai/gpt-4o-2024-05-13"
app_id
182717
streamed
false
cancelled
false
provider_name
"OpenAI"
latency
5634
moderation_latency
175
generation_time
0
finish_reason
"stop"
tokens_prompt
273
tokens_completion
624
native_tokens_prompt
273
native_tokens_completion
479
num_media_prompt
(null)
num_media_completion
(null)
origin
"https://ncbench.com/"
usage
0.00855
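The reported total_cost can be cross-checked against the token counts above. The sketch below assumes per-token list prices of $5.00 per million prompt tokens and $15.00 per million completion tokens for gpt-4o-2024-05-13; these prices are not stated on this page. Under that assumption, the figure is consistent with the native token counts (273 and 479), not with the normalized tokens_completion of 624.

```python
# Assumed list prices in USD per 1M tokens; not given on this page.
PROMPT_PRICE_PER_M = 5.00
COMPLETION_PRICE_PER_M = 15.00

def estimate_cost(native_prompt_tokens, native_completion_tokens):
    """Estimate the request cost in USD from native token counts."""
    total = (
        native_prompt_tokens * PROMPT_PRICE_PER_M
        + native_completion_tokens * COMPLETION_PRICE_PER_M
    )
    return total / 1_000_000

# native_tokens_prompt = 273, native_tokens_completion = 479
cost = estimate_cost(273, 479)
print(round(cost, 6))  # 0.00855
```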
Evaluation details
Result:
100.00%
Evaluator:
Parse dialogue
Details:
n/a
Meta Data:
dialogue
0
name
"Hunter"
dialogue
"Excuse me, miss. Do you mind if I ask you a few questions?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.7949699289229087
pt
0.5980707395498392
fr
0.5972073039742213
it
0.5941558441558442
sq
0.5567375886524822
cs
0.5443499392466585
tl
0.5341614906832298
ca
0.5259165613147914
pl
0.4993324432576769
hr
0.4973190348525469
sk
0.4710860366713681
hu
0.4533527696793003
es
0.45255474452554745
et
0.4517543859649123
no
0.44933920704845814
sv
0.44526627218934917
sl
0.4427934621099554
yo
0.43095599393019723
nl
0.4038155802861685
vi
0.39418416801292405
ro
0.37913907284768217
da
0.36006825938566556
ms
0.2816091954022989
tr
0.2816091954022989
lt
0.27884615384615385
fi
0.2690058479532163
is
0.2690058479532163
az
0.2618110236220473
lv
0.2618110236220473
be
0.15730337078651685
eu
0.15730337078651685
de
0.1515837104072398
uk
0.14383561643835618
reliable
true
passes
true
1
name
"Selvi"
dialogue
"क्या आप मुझसे बात कर रहे हैं?"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.8807089297886844
mr
0.8771067415730337
reliable
true
passes
true
2
name
"Hunter"
dialogue
"I'm sorry, I don't understand. Are you alright? Were you perhaps at the town square last night?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.8087795144662453
tl
0.6261378413524057
da
0.5715350223546944
no
0.5708955223880597
sv
0.4261477045908184
it
0.4156504065040651
ro
0.4138634046890927
nl
0.37089715536105033
yo
0.3179122182680902
cs
0.31710213776722085
de
0.31710213776722085
sq
0.26937738246505716
pl
0.26187419768934533
ms
0.23941798941798942
sk
0.21875
fr
0.20907840440165062
et
0.1958041958041958
hr
0.19467787114845939
ca
0.1935483870967742
is
0.19241573033707865
pt
0.14814814814814814
fi
0.14688427299703263
be
0.10852713178294573
tr
0.10852713178294573
hu
0.1043613707165109
az
0.0987460815047022
lt
0.0987460815047022
lv
0.0987460815047022
uk
0.0987460815047022
vi
0.0987460815047022
reliable
true
passes
true
3
name
"Selvi"
dialogue
"हाँ, मैं वहाँ थी। कुछ बहुत बुरा हुआ था।"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.8876210235131397
mr
0.8806024981631153
reliable
true
passes
true
4
name
"Hunter"
dialogue
"I'm not sure what you’re saying, but you seem upset. Were you frightened by something?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.8106712564543889
tl
0.4897959183673469
cs
0.2753623188405797
tr
0.2445054945054945
sk
0.19825072886297376
pl
0.17664670658682635
fi
0.16030534351145037
no
0.16030534351145037
fr
0.13793103448275862
da
0.13522012578616352
es
0.12420382165605094
ms
0.12420382165605094
pt
0.12420382165605094
reliable
true
passes
true
5
name
"Selvi"
dialogue
"वो लोग... अचानक गायब हो गए। मैंने देखा, लेकिन कुछ समझ नहीं आया।"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.8819130092501476
mr
0.8707175177763413
reliable
true
passes
true
6
name
"Hunter"
dialogue
"I wish I could understand you. Do you have a photo of what you saw? Maybe that can help."
detectedLang
"en"
heavyLang
"en"
scores
en
0.8316768194934274
da
0.5565878378378378
cs
0.5414847161572053
no
0.5358090185676392
tl
0.5291479820627802
pl
0.4626407369498465
sk
0.44620253164556967
fr
0.44385593220338987
sv
0.4397011739594451
yo
0.43058568329718006
nl
0.4160177975528365
hr
0.4140625
ca
0.4006849315068493
de
0.3951612903225807
sq
0.38089622641509435
ro
0.36594202898550726
is
0.3644067796610169
pt
0.36053593179049936
vi
0.31729518855656696
it
0.3137254901960784
tr
0.29906542056074764
hu
0.2981283422459893
az
0.2857142857142857
sl
0.2857142857142857
ms
0.2369186046511628
lt
0.2164179104477612
lv
0.20212765957446807
es
0.2009132420091324
et
0.1935483870967742
hi
0.12790697674418602
be
0.11764705882352942
eu
0.11764705882352942
fi
0.10714285714285714
uk
0.10714285714285714
reliable
true
passes
true
7
name
"Selvi"
dialogue
"यह तस्वीर... यह स्पष्ट नहीं है, पर यही मैंने देखा।"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.8858447488584474
mr
0.8819517313746065
reliable
true
passes
true
8
name
"Hunter"
dialogue
"I can't make out much from this. But thank you, this might still be helpful."
detectedLang
"en"
heavyLang
"en"
scores
en
0.8251748251748252
tl
0.5709123757904245
no
0.39258312020460356
da
0.38788659793814434
sv
0.37335092348284965
yo
0.3632707774798928
lt
0.33751743375174337
sq
0.3055555555555556
is
0.30453879941434847
vi
0.30453879941434847
hr
0.302496328928047
hu
0.27921092564491656
cs
0.27811550151975684
it
0.2770167427701674
fr
0.2543171114599686
nl
0.24603174603174602
de
0.22889610389610393
ca
0.22512234910277326
lv
0.21487603305785125
pl
0.2096505823627288
fi
0.17247386759581881
ro
0.1592920353982301
sk
0.1316270566727605
be
0.12844036697247707
tr
0.12844036697247707
az
0.1171003717472119
et
0.1171003717472119
uk
0.1171003717472119
reliable
true
passes
true
9
name
"Hunter"
dialogue
"I’m sorry, I have to take this. Thank you for your time."
detectedLang
"en"
heavyLang
"en"
scores
en
0.826549491211841
tl
0.7242647058823529
da
0.6246246246246246
no
0.5649651972157773
hr
0.4747899159663866
cs
0.455732946298984
pt
0.4186046511627907
pl
0.410377358490566
yo
0.37395659432387307
it
0.3644067796610169
vi
0.3567753001715266
sk
0.35233160621761656
sl
0.3466898954703833
lv
0.33035714285714285
lt
0.3081180811808118
ca
0.2690058479532163
sv
0.2690058479532163
sq
0.2632612966601179
ro
0.1935483870967742
ms
0.17218543046357618
be
0.15730337078651685
tr
0.15730337078651685
hu
0.1515837104072398
az
0.14383561643835618
et
0.14383561643835618
is
0.14383561643835618
uk
0.14383561643835618
reliable
true
passes
true
10
name
"Selvi"
dialogue
"मैंने कोशिश की..."
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.9001996007984032
mr
0.8970133882595263
reliable
true
passes
true
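The entries above suggest how the 100.00% result could arise: each parsed line has a detected language, and all 11 lines pass. The evaluator's actual logic is not shown on this page, so the following is only a sketch under two assumptions: that a line passes when its detected language matches the language expected for the speaker (Hindi for Selvi, English for Hunter, per the scenario), and that the result is the share of passing lines. `EXPECTED_LANG`, `top_language`, `line_passes`, and `score` are hypothetical names.

```python
# Assumed mapping from speaker to expected language, taken from the scenario.
EXPECTED_LANG = {"Selvi": "hi", "Hunter": "en"}

def top_language(scores):
    """Return the language code with the highest score."""
    return max(scores, key=scores.get)

def line_passes(entry):
    """Assumed rule: the detected language must match the speaker's language."""
    return entry["detectedLang"] == EXPECTED_LANG[entry["name"]]

def score(entries):
    """Percentage of dialogue lines that pass."""
    return 100.0 * sum(line_passes(e) for e in entries) / len(entries)

# Two entries copied from the details above (score lists truncated).
entries = [
    {"name": "Hunter", "detectedLang": "en",
     "scores": {"en": 0.7949699289229087, "pt": 0.5980707395498392}},
    {"name": "Selvi", "detectedLang": "hi",
     "scores": {"hi": 0.8807089297886844, "mr": 0.8771067415730337}},
]
result = score(entries)
print(result)  # 100.0
```

In every entry listed, detectedLang is also the highest-scoring language, which `top_language` reproduces. For the Hindi lines the margin over Marathi (mr) is small, around 0.003 to 0.011.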