NC Bench
Run: 2024-08-09-01J4W2G41BYSF1QJ0EZ4VW0DJ3
Run Details
Model:
Llama 3.1 70B
Test:
Language Writing
Scenario:
Character dialogue (Hindi) in a story
Duration:
15.8s
Cost:
$0.000710
Provider:
Together
Evaluation summary
Score:
60.00%
Evaluator:
Parse dialogue
Model input
System
You are an expert novel author in the cozy mystery genre. Take the following details about characters/locations/... into account:
<character name="Selvi">
Selvi is a 30-year old college student on a study abroad program. Born in India, she is fluent in Hindi and does not yet know any English. She enjoys landscape photography and is always seen with her camera.
</character>
<character name="Hunter">
Hunter is a 35-year old detective who is investigating a series of mysterious disappearances in the small town of Alborn, MN. He is a former marine and has a no-nonsense attitude. Hunter has no knowledge of Hindi, not even basic phrases.
</character>
<location name="Alborn, MN">
Alborne is a small town in the Midwest with a population of 3,000. It is known for its annual apple festival.
</location>
User/Human
Write the following scene: Selvi is sitting at a cafe, looking at her camera. Hunter approaches her and asks her about a recent event in the town. She responds troubled and shaken in Hindi, and Hunter looks confused. They talk for a while, until Hunter gets a call and has to leave. Return the dialogue between Selvi and Hunter as a voice sheet: Selvi: "..." Hunter: "..."
Model response
Messages
Meta
AI
Here is the scene:
Selvi sat at a small café on Main Street, sipping a cup of coffee and staring at her camera's LCD screen. She was reviewing the photos she had taken earlier that day, trying to decide which ones to edit and share on social media. The warm glow of the setting sun had cast a beautiful light on the apple orchards just outside of town, and she had managed to capture some stunning shots.
Just as she was deliberating over a particular photo, a tall, imposing figure appeared at her table. Selvi looked up to see a man with a stern expression and a detective's badge clipped to his belt.
"Mind if I join you?" he asked gruffly.
Selvi shook her head, and the man sat down across from her. "Hi, I'm Detective Hunter. I'm investigating a series of disappearances in the area. I understand you're a student here on a study abroad program?"
Selvi's eyes widened slightly, and she nodded. Hunter leaned forward, his eyes locked intently on hers. "I've heard there was some kind of disturbance at the apple festival last night. Did you happen to see or hear anything unusual?"
Selvi's expression turned troubled, and she began to speak in rapid Hindi. "Aaj raat mein kuch ajeeb tha... Maine ek darawani awaaz suni thi, lekin maine kisi ko nahin dekha..."
Hunter's face scrunched up in confusion. "I'm sorry, miss...?"
Selvi tried again, still speaking in Hindi, hoping the detective would somehow magically understand. "Kuch log bhagne lage the, aur phir achanak se sab kuch shaant ho gaya..."
Hunter's expression didn't change, and Selvi realized he had no idea what she was saying. She felt a surge of frustration and fear. Why wasn't he understanding her?
Hunter pulled out a small notebook and flipped it open. "Look, miss... I'm afraid I don't speak... uh... what is that you're speaking?"
Selvi's eyes dropped, and she felt a lump form in her throat. Just then, Hunter's phone buzzed, and he excused himself to take the call. "Sorry, I need to take this. Excuse me."
As he stood up and walked away, Selvi felt a mix of relief and anxiety wash over her. She knew she needed to find a way to communicate with the detective, but how?
Voice Sheet:
Selvi: "Aaj raat mein kuch ajeeb tha... Maine ek darawani awaaz suni thi, lekin maine kisi ko nahin dekha..."
Hunter: "I'm sorry, miss...?"
Selvi: "Kuch log bhagne lage the, aur phir achanak se sab kuch shaant ho gaya..."
Hunter: "Look, miss... I'm afraid I don't speak... uh... what is that you're speaking?"
id
"gen-dhCIoKxU8EM3pUygdkwTfXbJeFEo"
total_cost
0.000710424
upstream_id
"8b096aee37c79127-ORD"
created_at
"2024-08-09T17:20:51.495438+00:00"
8/9/2024, 5:20:51 PM
model
"meta-llama/llama-3.1-70b-instruct"
app_id
182717
streamed
false
cancelled
false
provider_name
"Together"
latency
15625
moderation_latency
(null)
generation_time
0
finish_reason
"stop"
tokens_prompt
324
tokens_completion
620
native_tokens_prompt
278
native_tokens_completion
619
num_media_prompt
(null)
num_media_completion
(null)
origin
"https://ncbench.com/"
usage
0.000710424
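As a sanity check on the cost figure, the sketch below relates total_cost to the native token counts reported in the metadata above. The page does not state a price; the per-million rate computed here is only an inference, and it assumes prompt and completion tokens were billed at one flat rate.

```python
# Values copied from the run metadata above.
native_tokens_prompt = 278
native_tokens_completion = 619
total_cost = 0.000710424  # USD

total_native_tokens = native_tokens_prompt + native_tokens_completion

# ASSUMPTION: a single flat rate applies to prompt and completion tokens.
# The rate below is implied by the numbers, not taken from any price list.
implied_rate_per_million = total_cost / total_native_tokens * 1_000_000

print(total_native_tokens)                 # 897
print(round(implied_rate_per_million, 3))  # 0.792
```

Under that assumption the reported cost corresponds to roughly $0.792 per million native tokens. If the provider bills input and output at different rates, this figure is only a blended average.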
Evaluation details
Result:
60.00%
Evaluator:
Parse dialogue
Details:
n/a
Meta Data:
dialogue
0
name
"Hunter"
dialogue
"I've heard there was some kind of disturbance at the apple festival last night. Did you happen to see or hear anything unusual?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.8064932272629541
tl
0.6354799513973268
de
0.527806925498426
no
0.5156081808396125
nl
0.504950495049505
fr
0.48717948717948717
da
0.46107784431137727
yo
0.39065673662830064
es
0.38482570061517435
sq
0.3626062322946175
et
0.34497816593886466
cs
0.33969185619955977
hr
0.3298585256887565
sv
0.3273542600896861
it
0.32228915662650603
pt
0.32075471698113206
ms
0.31402439024390244
sl
0.29189614476789927
sk
0.2555831265508685
ca
0.24114671163575044
lt
0.20353982300884957
ro
0.18330308529945552
pl
0.18032786885245902
lv
0.17050691244239632
hu
0.15174363807728558
az
0.11591355599214145
is
0.08998988877654197
fi
0.06542056074766354
reliable
true
passes
true
1
name
"Selvi"
dialogue
"Aaj raat mein kuch ajeeb tha... Maine ek darawani awaaz suni thi, lekin maine kisi ko nahin dekha..."
detectedLang
"tl"
heavyLang
"vi"
scores
tl
0.5109546165884193
eu
0.4369369369369369
de
0.37311935807422264
fi
0.2913832199546485
lv
0.25239234449760767
fr
0.24969987995198079
yo
0.2487980769230769
pl
0.2378048780487805
ms
0.20785804816223066
en
0.20685279187817257
et
0.20685279187817257
vi
0.19871794871794873
ro
0.19041450777202074
sl
0.1677762982689747
sq
0.15768194070080863
tr
0.1519674355495251
nl
0.13793103448275862
lt
0.13194444444444445
is
0.12342215988779803
reliable
false
passes
false
2
name
"Hunter"
dialogue
"I'm sorry, miss...?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.7530864197530864
tl
0.6666666666666666
it
0.5
pt
0.5
reliable
true
passes
true
3
name
"Selvi"
dialogue
"Kuch log bhagne lage the, aur phir achanak se sab kuch shaant ho gaya..."
detectedLang
"fr"
heavyLang
"id"
scores
fr
0.494949494949495
ms
0.4936708860759494
no
0.47299077733860345
tl
0.42857142857142855
it
0.4117647058823529
da
0.38176197836166925
cs
0.37694704049844235
nl
0.37402190923317685
ro
0.36
pt
0.35379644588045234
hr
0.35170178282009723
ca
0.35064935064935066
yo
0.3485342019543974
sq
0.3220338983050847
en
0.3150684931506849
sv
0.2907801418439716
sl
0.24812030075187969
es
0.24528301886792453
sk
0.2172211350293542
fi
0.20318725099601595
eu
0.19839679358717435
hu
0.19839679358717435
de
0.16317991631799164
tr
0.15789473684210525
reliable
false
passes
false
4
name
"Hunter"
dialogue
"Look, miss... I'm afraid I don't speak... uh... what is that you're speaking?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.7985675917636527
tl
0.5677233429394812
nl
0.3670886075949367
no
0.33920704845814975
is
0.3343195266272189
hu
0.3293591654247392
sv
0.3171471927162367
yo
0.3150684931506849
lt
0.25
da
0.23599320882852293
hr
0.23599320882852293
ca
0.23469387755102045
sq
0.22945205479452058
pl
0.21875
it
0.20353982300884957
cs
0.1891891891891892
ro
0.16666666666666666
ms
0.16201117318435754
de
0.15094339622641512
sk
0.13793103448275862
fr
0.1362763915547025
pt
0.1362763915547025
be
0.1346153846153846
tr
0.1346153846153846
az
0.12280701754385964
et
0.12280701754385964
lv
0.12280701754385964
uk
0.12280701754385964
vi
0.12280701754385964
reliable
true
passes
true
5
name
"Hunter"
dialogue
"Sorry, I need to take this. Excuse me."
detectedLang
"en"
heavyLang
"en"
scores
en
0.7753818508535489
tl
0.6382054992764111
no
0.5798319327731093
et
0.5429616087751371
it
0.5412844036697247
sl
0.5344506517690875
sq
0.5300751879699248
pl
0.5107632093933464
hr
0.504950495049505
da
0.4929006085192698
cs
0.48559670781893005
yo
0.47257383966244726
sk
0.44933920704845814
pt
0.4292237442922375
lv
0.42528735632183906
lt
0.4004796163069544
ca
0.3556701030927836
sv
0.3556701030927836
es
0.28977272727272724
fr
0.2668621700879765
ro
0.2647058823529412
ms
0.251497005988024
nl
0.251497005988024
fi
0.23076923076923078
be
0.21875
tr
0.21875
hu
0.21135646687697163
az
0.20127795527156547
is
0.20127795527156547
uk
0.20127795527156547
vi
0.20127795527156547
reliable
true
passes
true
6
name
"Selvi"
dialogue
"Aaj raat mein kuch ajeeb tha... Maine ek darawani awaaz suni thi, lekin maine kisi ko nahin dekha..."
detectedLang
"tl"
heavyLang
"vi"
scores
tl
0.5109546165884193
eu
0.4369369369369369
de
0.37311935807422264
fi
0.2913832199546485
lv
0.25239234449760767
fr
0.24969987995198079
yo
0.2487980769230769
pl
0.2378048780487805
ms
0.20785804816223066
en
0.20685279187817257
et
0.20685279187817257
vi
0.19871794871794873
ro
0.19041450777202074
sl
0.1677762982689747
sq
0.15768194070080863
tr
0.1519674355495251
nl
0.13793103448275862
lt
0.13194444444444445
is
0.12342215988779803
reliable
false
passes
false
7
name
"Hunter"
dialogue
"I'm sorry, miss...?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.7530864197530864
tl
0.6666666666666666
it
0.5
pt
0.5
reliable
true
passes
true
8
name
"Selvi"
dialogue
"Kuch log bhagne lage the, aur phir achanak se sab kuch shaant ho gaya..."
detectedLang
"fr"
heavyLang
"id"
scores
fr
0.494949494949495
ms
0.4936708860759494
no
0.47299077733860345
tl
0.42857142857142855
it
0.4117647058823529
da
0.38176197836166925
cs
0.37694704049844235
nl
0.37402190923317685
ro
0.36
pt
0.35379644588045234
hr
0.35170178282009723
ca
0.35064935064935066
yo
0.3485342019543974
sq
0.3220338983050847
en
0.3150684931506849
sv
0.2907801418439716
sl
0.24812030075187969
es
0.24528301886792453
sk
0.2172211350293542
fi
0.20318725099601595
eu
0.19839679358717435
hu
0.19839679358717435
de
0.16317991631799164
tr
0.15789473684210525
reliable
false
passes
false
9
name
"Hunter"
dialogue
"Look, miss... I'm afraid I don't speak... uh... what is that you're speaking?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.7985675917636527
tl
0.5677233429394812
nl
0.3670886075949367
no
0.33920704845814975
is
0.3343195266272189
hu
0.3293591654247392
sv
0.3171471927162367
yo
0.3150684931506849
lt
0.25
da
0.23599320882852293
hr
0.23599320882852293
ca
0.23469387755102045
sq
0.22945205479452058
pl
0.21875
it
0.20353982300884957
cs
0.1891891891891892
ro
0.16666666666666666
ms
0.16201117318435754
de
0.15094339622641512
sk
0.13793103448275862
fr
0.1362763915547025
pt
0.1362763915547025
be
0.1346153846153846
tr
0.1346153846153846
az
0.12280701754385964
et
0.12280701754385964
lv
0.12280701754385964
uk
0.12280701754385964
vi
0.12280701754385964
reliable
true
passes
true
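The 60.00% result is consistent with the fraction of parsed dialogue entries whose passes flag is true. The sketch below recomputes that figure from the ten entries above and adds a per-speaker breakdown. That the evaluator aggregates in exactly this way is an assumption; the names `entries` and `pass_rate` are illustrative and not part of NC Bench.

```python
from collections import defaultdict

# (speaker, passes) for entries 0-9, copied from the evaluation details above.
entries = [
    ("Hunter", True), ("Selvi", False), ("Hunter", True), ("Selvi", False),
    ("Hunter", True), ("Hunter", True), ("Selvi", False), ("Hunter", True),
    ("Selvi", False), ("Hunter", True),
]

def pass_rate(results):
    """Fraction of True values in a non-empty list of booleans."""
    return sum(results) / len(results)

# ASSUMPTION: the overall score is the plain fraction of passing entries.
overall = pass_rate([passed for _, passed in entries])

# Group the flags by speaker to see where the failures come from.
by_speaker = defaultdict(list)
for name, passed in entries:
    by_speaker[name].append(passed)
per_speaker = {name: pass_rate(flags) for name, flags in by_speaker.items()}

print(f"{overall:.2%}")  # 60.00%
for name, rate in per_speaker.items():
    print(f"{name}: {rate:.2%}")
```

The breakdown shows that all six Hunter lines pass and all four Selvi lines fail. The Selvi lines are romanized Hindi, which the detector reports as "tl" or "fr" with reliable set to false.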