NC Bench
Run: 2024-08-09-01J4W1A75NDSQA16AD3YD17A0Y
Run Details
Model: Llama 3.1 70B
Test: Language Writing
Scenario: Character dialogue (Spanish) in a story
Duration: 23.5s
Cost: $0.000775
Provider: Fireworks
Evaluation summary
Score: 75.00%
Evaluator: Parse dialogue
Model input
System
You are an expert novel author in the cozy mystery genre. Take the following details about characters/locations/... into account:
<character name="Maria">
Maria is a 30-year old college student on a study abroad program. Born in Spain, she is fluent in Spanish and does not yet know any English. She enjoys landscape photography and is always seen with her camera.
</character>
<character name="Hunter">
Hunter is a 35-year old detective who is investigating a series of mysterious disappearances in the small town of Alborn, MN. He is a former marine and has a no-nonsense attitude. Hunter has no knowledge of Spanish, not even basic phrases.
</character>
<location name="Alborn, MN">
Alborne is a small town in the Midwest with a population of 3,000. It is known for its annual apple festival.
</location>
User/Human
Write the following scene: Maria is sitting at a cafe, looking at her camera. Hunter approaches her and asks her about a recent event in the town. She responds troubled and shaken in Spanish, and Hunter looks confused. They talk for a while, until Hunter gets a call and has to leave.
Return the dialogue between Maria and Hunter as a voice sheet:
Maria: "..."
Hunter: "..."
Model response
Messages
Meta
AI
Here is the scene:
Maria sat at a small table by the window at the Cozy Cup Cafe, her eyes fixed on the camera in front of her. She was scrolling through the photos she had taken that day, her brow furrowed in concentration.
Suddenly, a tall, imposing figure loomed over her. She looked up to see a man with a stern expression and a badge clipped to his belt.
"Mind if I join you?" he asked gruffly, already pulling out the chair across from her.
Maria shook her head, and the man sat down. "I'm Detective Hunter. I'm investigating some... unusual occurrences in town. And you are...?"
Maria's eyes widened as she tried to process the question. She hesitated, then launched into a rapid-fire response in Spanish. "¡Ay, caramba! Lo que pasó ayer fue terrible. Estaba tomando fotos en el parque cuando vi a una persona desaparecer en el bosque. ¡Era como si se hubiera esfumado en el aire! Me quedé tan asustada que no pude hacer nada..."
Hunter's expression changed from curious to confused. He had no idea what Maria was saying, and his face showed it. He shook his head, trying to process the sounds coming out of her mouth. "I'm sorry, miss... uh...?"
Maria's face fell as she realized the detective didn't understand her. She tried again, slower and more calmly this time. "Disculpe, señor... Lo que pasó fue muy extraño. Vi a alguien desaparecer en el bosque. ¡Me tiene muy preocupada!"
Hunter's face remained a mask of confusion. Just then, his phone rang. He stood up, apologetic. "I'm sorry, miss... I really need to take this. Excuse me."
As he walked away, Maria watched him, still troubled and shaken. She sighed, feeling frustrated that she hadn't been able to communicate with the detective.
Here is the voice sheet:
Maria: "¡Ay, caramba! Lo que pasó ayer fue terrible. Estaba tomando fotos en el parque cuando vi a una persona desaparecer en el bosque. ¡Era como si se hubiera esfumado en el aire! Me quedé tan asustada que no pude hacer nada...", "Disculpe, señor... Lo que pasó fue muy extraño. Vi a alguien desaparecer en el bosque. ¡Me tiene muy preocupada!"
Hunter: "Mind if I join you?", "I'm Detective Hunter. I'm investigating some... unusual occurrences in town. And you are...?", "I'm sorry, miss... uh...?", "I'm sorry, miss... I really need to take this. Excuse me."
id: "gen-ZJMpeQDp4WhGNGJFJtYkP3Vzy0Zr"
total_cost: 0.0007749
upstream_id: "5a7d7253-ccde-497a-8402-bab19d5c704a"
created_at: "2024-08-09T17:00:17.144777+00:00"
model: "meta-llama/llama-3.1-70b-instruct"
app_id: 182717
streamed: false
cancelled: false
provider_name: "Fireworks"
latency: 23262
moderation_latency: (null)
generation_time: 0
finish_reason: "stop"
tokens_prompt: 319
tokens_completion: 588
native_tokens_prompt: 271
native_tokens_completion: 590
num_media_prompt: (null)
num_media_completion: (null)
origin: "https://ncbench.com/"
usage: 0.0007749
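The reported cost can be cross-checked against the token counts. The sketch below assumes a single flat rate of $0.90 per million native tokens for prompt and completion alike; that rate is inferred from the figures above and is not stated anywhere in the run metadata. It also assumes `latency` is in milliseconds, which fits the 23.5s duration shown in the run details. All constant and function names are illustrative.

```python
# Figures copied from the run metadata above.
NATIVE_TOKENS_PROMPT = 271
NATIVE_TOKENS_COMPLETION = 590
TOTAL_COST_USD = 0.0007749
LATENCY_MS = 23262  # ASSUMPTION: milliseconds (the run duration is shown as 23.5s)

# ASSUMPTION: a flat rate inferred from the figures, not published in this report.
ASSUMED_USD_PER_MILLION_TOKENS = 0.90


def estimate_cost(prompt_tokens: int, completion_tokens: int, usd_per_million: float) -> float:
    """Cost of a request billed at one flat rate for prompt and completion tokens."""
    return (prompt_tokens + completion_tokens) * usd_per_million / 1_000_000


estimated_cost = estimate_cost(
    NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION, ASSUMED_USD_PER_MILLION_TOKENS
)

# End-to-end rate: includes any queueing and network time, not just generation.
tokens_per_second = NATIVE_TOKENS_COMPLETION / (LATENCY_MS / 1000)

print(f"estimated cost: ${estimated_cost:.7f} (reported: ${TOTAL_COST_USD:.7f})")
print(f"end-to-end throughput: {tokens_per_second:.1f} completion tokens/s")
```

Under these assumptions the estimate agrees with `total_cost`, which suggests billing is based on the native token counts (271 and 590) rather than on `tokens_prompt` and `tokens_completion`.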
Evaluation details
Result: 75.00%
Evaluator: Parse dialogue
Details: n/a
Meta Data:
dialogue
0
name: "Maria"
dialogue: "I'm Detective Hunter. I'm investigating some... unusual occurrences in town. And you are...?"
detectedLang: "en"
heavyLang: "en"
scores:
  en: 0.8094908209213716
  tl: 0.6423927178153446
  ro: 0.6232876712328766
  fr: 0.5788667687595712
  it: 0.555735056542811
  pt: 0.553933495539335
  ca: 0.49308755760368667
  nl: 0.492151431209603
  de: 0.48837209302325585
  es: 0.47267497603068076
  sl: 0.466537342386033
  sv: 0.44779116465863456
  da: 0.40021810250817885
  ms: 0.39890710382513656
  no: 0.39560439560439564
  cs: 0.3373493975903614
  hr: 0.33008526187576126
  sq: 0.32926829268292684
  et: 0.298469387755102
  lv: 0.28478543563068925
  az: 0.26861702127659576
  yo: 0.2537313432835821
  is: 0.2005813953488372
  sk: 0.2005813953488372
  lt: 0.1863905325443787
  pl: 0.13249211356466878
  fi: 0.09688013136288999
  tr: 0.09688013136288999
reliable: true
passes: false
1
name: "Hunter"
dialogue: "I'm sorry, miss... I really need to take this. Excuse me."
detectedLang: "en"
heavyLang: "en"
scores:
  en: 0.7891566265060241
  tl: 0.6704331450094161
  no: 0.5358090185676392
  it: 0.5138888888888888
  sq: 0.5014245014245015
  sl: 0.5
  pl: 0.48377581120943947
  hr: 0.4760479041916168
  da: 0.4631901840490798
  cs: 0.460708782742681
  et: 0.4590417310664606
  pt: 0.4372990353697749
  sk: 0.4363929146537842
  lv: 0.41471571906354515
  lt: 0.396551724137931
  yo: 0.3902439024390244
  ca: 0.375
  es: 0.34579439252336447
  ro: 0.32038834951456313
  ms: 0.29006085192697767
  nl: 0.29006085192697767
  sv: 0.2827868852459016
  az: 0.2647058823529412
  fr: 0.20634920634920634
  fi: 0.1764705882352941
  be: 0.16666666666666666
  tr: 0.16666666666666666
  hu: 0.1606714628297362
  de: 0.15254237288135594
  is: 0.15254237288135594
  uk: 0.15254237288135594
  vi: 0.15254237288135594
reliable: true
passes: true
2
name: "Maria"
dialogue: "Disculpe, señor... Lo que pasó fue muy extraño. Vi a alguien desaparecer en el bosque. ¡Me tiene muy preocupada!"
detectedLang: "es"
heavyLang: "es"
scores:
  es: 0.7966963151207116
  pt: 0.624765478424015
  ca: 0.6245893946503989
  fr: 0.5249406175771971
  it: 0.49141767323585506
  ro: 0.4513031550068587
  nl: 0.40828402366863903
  no: 0.3907083015993907
  sv: 0.35064935064935066
  sl: 0.3360995850622407
  en: 0.3333333333333333
  ms: 0.3197278911564626
  da: 0.31153184165232356
  sk: 0.31153184165232356
  yo: 0.300087489063867
  fi: 0.27140255009107467
  hr: 0.2647058823529412
  hu: 0.2599444958371878
  tr: 0.24812030075187969
  vi: 0.24599434495758718
  et: 0.24528301886792453
  eu: 0.22854387656702024
  sq: 0.21568627450980393
  is: 0.20713577799801783
  cs: 0.2007992007992008
  pl: 0.1935483870967742
  de: 0.15254237288135594
  tl: 0.14984059511158343
  lt: 0.09297052154195011
  az: 0.08045977011494253
  lv: 0.08045977011494253
reliable: true
passes: true
3
name: "Hunter"
dialogue: "I'm sorry, miss... I really need to take this. Excuse me."
detectedLang: "en"
heavyLang: "en"
scores:
  en: 0.7891566265060241
  tl: 0.6704331450094161
  no: 0.5358090185676392
  it: 0.5138888888888888
  sq: 0.5014245014245015
  sl: 0.5
  pl: 0.48377581120943947
  hr: 0.4760479041916168
  da: 0.4631901840490798
  cs: 0.460708782742681
  et: 0.4590417310664606
  pt: 0.4372990353697749
  sk: 0.4363929146537842
  lv: 0.41471571906354515
  lt: 0.396551724137931
  yo: 0.3902439024390244
  ca: 0.375
  es: 0.34579439252336447
  ro: 0.32038834951456313
  ms: 0.29006085192697767
  nl: 0.29006085192697767
  sv: 0.2827868852459016
  az: 0.2647058823529412
  fr: 0.20634920634920634
  fi: 0.1764705882352941
  be: 0.16666666666666666
  tr: 0.16666666666666666
  hu: 0.1606714628297362
  de: 0.15254237288135594
  is: 0.15254237288135594
  uk: 0.15254237288135594
  vi: 0.15254237288135594
reliable: true
passes: true
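The 75.00% result equals the fraction of parsed dialogue entries whose `passes` flag is true (3 of 4). The report does not say how `passes` is computed. One reading that fits all four entries is that the detection must be `reliable` and `detectedLang` must match the language expected of the speaker: Spanish for Maria, English for Hunter. The sketch below encodes that reading; `EXPECTED_LANG`, `entry_passes`, and `score` are hypothetical names, not the evaluator's actual code.

```python
# ASSUMPTION: expected language per speaker, taken from the character
# descriptions in the system prompt (Maria speaks Spanish, Hunter English).
EXPECTED_LANG = {"Maria": "es", "Hunter": "en"}

# The four parsed entries, reduced to the fields used here (values from the report).
entries = [
    {"name": "Maria", "detectedLang": "en", "reliable": True, "passes": False},
    {"name": "Hunter", "detectedLang": "en", "reliable": True, "passes": True},
    {"name": "Maria", "detectedLang": "es", "reliable": True, "passes": True},
    {"name": "Hunter", "detectedLang": "en", "reliable": True, "passes": True},
]


def entry_passes(entry: dict) -> bool:
    """Hypothetical rule: reliable detection that matches the speaker's language."""
    return entry["reliable"] and entry["detectedLang"] == EXPECTED_LANG[entry["name"]]


def score(parsed: list) -> float:
    """Fraction of entries that pass, between 0.0 and 1.0."""
    return sum(entry_passes(e) for e in parsed) / len(parsed)


print(f"{score(entries):.2%}")
```

Entry 0 is the one failure: the English line "I'm Detective Hunter. ..." is attributed to Maria, although the voice sheet in the model response lists it under Hunter. The lost 25% may therefore reflect how the voice sheet was parsed rather than a wrong language choice by the model.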