NC Bench
Run: 2024-11-04-01JBW8RPXVDPQJVC03Y34A69HA
Run Details
Model:
Claude 3.5 Haiku
Test:
Language Writing
Scenario:
Character dialogue (Italian) in a story
Duration:
4.8s
Cost:
$0.001290
Evaluation summary
Score:
85.71%
Evaluator:
Parse dialogue
Model input
System
You are an expert novel author in the cozy mystery genre. Take the following details about characters/locations/... into account:
<character name="Isabella">
Isabella is a 30-year old college student on a study abroad program. Born in Italy, she is fluent in Italian and does not yet know any English. She enjoys landscape photography and is always seen with her camera.
</character>
<character name="Hunter">
Hunter is a 35-year old detective who is investigating a series of mysterious disappearances in the small town of Alborn, MN. He is a former marine and has a no-nonsense attitude. Hunter has no knowledge of Italian, not even basic phrases.
</character>
<location name="Alborn, MN">
Alborne is a small town in the Midwest with a population of 3,000. It is known for its annual apple festival.
</location>
User/Human
Write the following scene: Isabella is sitting at a cafe, looking at her camera. Hunter approaches her and asks her about a recent event in the town. She responds troubled and shaken in Italian, and Hunter looks confused. They talk for a while, until Hunter gets a call and has to leave.
Return the dialogue between Isabella and Hunter as a voice sheet:
Isabella: "..."
Hunter: "..."
Model response
Messages
Meta
AI
Here's the voice sheet for the scene, taking into account the characters' backgrounds:
Isabella: [Speaking in Italian, looking distressed] "Era terribile... non riesco a credere cosa è successo." (It was terrible... I can't believe what happened.)
Hunter: [Leaning in, looking confused] "I'm sorry, what? I don't understand Italian."
Isabella: [Looks up, realizes Hunter doesn't understand, tries to mime something about her camera] "Foto... strange... night..."
Hunter: [Frustrated] "Can you speak English? I'm trying to understand what you saw."
Isabella: [Shakes head, clearly doesn't know English, looks anxious and points at her camera]
Hunter: [Sighs, tries speaking slowly] "What. Did. You. See?"
[Isabella looks increasingly frustrated, unable to communicate]
[Suddenly, Hunter's phone rings]
Hunter: "Excuse me." [Answers phone] "Detective Williams."
[Brief pause]
Hunter: [To Isabella] "I'm sorry, I have to go. We'll figure this out later."
[Hunter leaves, Isabella looks bewildered and worried]
total_cost
0.0012896
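The "Parse dialogue" evaluator has to turn the voice sheet in the model response into per-speaker lines before it can check each line's language. Its actual parsing rules are not shown on this page. The sketch below is one assumption that may help when reading the results: it cuts the text at each `Name:` marker and keeps the last double-quoted string in each segment, skipping segments that contain only stage directions. The function name `parse_voice_sheet` and its `speakers` parameter are hypothetical.

```python
import re

# Matches a straight-double-quoted span and captures its contents.
QUOTED = re.compile(r'"([^"]*)"')


def parse_voice_sheet(text, speakers=("Isabella", "Hunter")):
    """Hypothetical parser: one entry per 'Name:' segment that has a quote."""
    speaker_re = re.compile(
        r"\b(" + "|".join(map(re.escape, speakers)) + r"):"
    )
    marks = list(speaker_re.finditer(text))
    entries = []
    for i, mark in enumerate(marks):
        # A segment runs from this speaker marker to the next one.
        end = marks[i + 1].start() if i + 1 < len(marks) else len(text)
        quotes = QUOTED.findall(text[mark.end():end])
        if quotes:
            # Assumption: keep only the last quoted string of the segment.
            entries.append({"name": mark.group(1), "dialogue": quotes[-1]})
    return entries


sample = (
    'Isabella: [Speaking in Italian] "Era terribile..." (It was terrible...) '
    'Isabella: [Shakes head, points at her camera] '
    'Hunter: "Excuse me." [Answers phone] "Detective Williams."'
)
for entry in parse_voice_sheet(sample):
    print(entry)
```

Under this assumption, a segment such as `"Excuse me." [Answers phone] "Detective Williams."` contributes only `"Detective Williams."`, and a segment with no quoted speech contributes nothing. Other parsing rules could produce different entries.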
Evaluation details
Result:
85.71%
Evaluator:
Parse dialogue
Details:
n/a
Meta Data:
dialogue
0
name
"Isabella"
dialogue
"Era terribile... non riesco a credere cosa è successo."
detectedLang
"it"
heavyLang
"it"
scores
it
0.8119358074222668
en
0.6553308823529411
ca
0.6355685131195336
pt
0.6283448959365708
ro
0.6253746253746254
es
0.604012671594509
fr
0.5950323974082073
da
0.5142487046632124
de
0.506578947368421
nl
0.47257383966244726
sv
0.4204018547140649
yo
0.40664556962025317
eu
0.3951612903225807
no
0.3811881188118812
hu
0.3686868686868687
hr
0.3315508021390375
pl
0.3042671614100185
sl
0.3042671614100185
tr
0.2870722433460076
sq
0.27466150870406186
ms
0.2647058823529412
az
0.2618110236220473
cs
0.26035502958579887
sk
0.25742574257425743
et
0.251497005988024
fi
0.2454728370221328
lt
0.17943107221006563
is
0.16666666666666666
tl
0.1610738255033557
lv
0.15730337078651685
vi
0.15730337078651685
reliable
true
passes
true
1
name
"Hunter"
dialogue
"I'm sorry, what? I don't understand Italian."
detectedLang
"en"
heavyLang
"en"
scores
en
0.7890295358649789
no
0.6273291925465838
da
0.6245306633291614
sv
0.576271186440678
tl
0.5230524642289348
ro
0.5121951219512195
it
0.5008319467554077
sq
0.48364888123924266
de
0.460431654676259
cs
0.45255474452554745
nl
0.44547134935304994
pl
0.40476190476190477
hr
0.39759036144578314
sk
0.3492407809110629
pt
0.3464052287581699
yo
0.3435448577680525
ms
0.3333333333333333
ca
0.3150684931506849
is
0.3135011441647597
fi
0.3087557603686636
tr
0.3006993006993007
lt
0.29577464788732394
az
0.2890995260663507
be
0.1891891891891892
sl
0.1891891891891892
hu
0.18256130790190736
es
0.17355371900826447
et
0.17355371900826447
fr
0.17355371900826447
lv
0.17355371900826447
uk
0.17355371900826447
vi
0.17355371900826447
reliable
true
passes
true
2
name
"Isabella"
dialogue
"Foto... strange... night..."
detectedLang
"en"
heavyLang
"en"
scores
en
0.7706422018348624
no
0.7346072186836519
da
0.7311827956989247
tl
0.7276688453159041
de
0.7113163972286375
nl
0.6468926553672316
fr
0.6418338108882521
sl
0.5689655172413793
sk
0.5659722222222222
hr
0.5421245421245421
it
0.5387453874538745
cs
0.5318352059925093
pl
0.5300751879699248
ro
0.5247148288973384
lv
0.5155038759689923
et
0.48559670781893005
yo
0.3718592964824121
sv
0.3654822335025381
is
0.34895833333333337
lt
0.34895833333333337
ms
0.33510638297872336
az
0.3206521739130435
ca
0.3206521739130435
es
0.3206521739130435
pt
0.3206521739130435
sq
0.3206521739130435
tr
0.3206521739130435
reliable
true
passes
false
3
name
"Hunter"
dialogue
"Can you speak English? I'm trying to understand what you saw."
detectedLang
"en"
heavyLang
"en"
scores
en
0.8226950354609929
no
0.5654692931633836
da
0.5598591549295775
tl
0.5376078914919853
cs
0.438622754491018
sv
0.43009118541033436
de
0.4140625
sk
0.37080536912751677
pl
0.36224489795918363
hr
0.3339253996447602
nl
0.3218806509945751
ms
0.31444241316270566
ro
0.28023032629558536
yo
0.267578125
lv
0.2454728370221328
sl
0.24395161290322584
lt
0.21711899791231734
az
0.17035398230088497
ca
0.17035398230088497
tr
0.17035398230088497
is
0.1648106904231626
reliable
true
passes
true
4
name
"Hunter"
dialogue
"What. Did. You. See?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.8475609756097561
et
0.6296296296296297
cs
0.43820224719101125
tl
0.43820224719101125
reliable
true
passes
true
5
name
"Hunter"
dialogue
"Detective Williams."
detectedLang
"en"
heavyLang
"en"
scores
en
0.7857142857142857
tl
0.7232472324723247
yo
0.6359223300970874
it
0.6332518337408314
ro
0.6268656716417911
fr
0.5626822157434402
lt
0.5575221238938053
nl
0.5508982035928144
ca
0.5454545454545454
es
0.5426829268292682
pt
0.53125
de
0.5253164556962026
da
0.5098039215686274
no
0.5033112582781457
sl
0.4809688581314879
sq
0.38271604938271603
cs
0.32432432432432434
hr
0.3087557603686636
sv
0.3087557603686636
reliable
true
passes
true
6
name
"Hunter"
dialogue
"I'm sorry, I have to go. We'll figure this out later."
detectedLang
"en"
heavyLang
"en"
scores
en
0.7905027932960894
tl
0.6196754563894523
da
0.5959051724137931
pl
0.5426829268292682
no
0.5247148288973384
yo
0.44526627218934917
it
0.43095599393019723
hr
0.40476190476190477
pt
0.3902439024390244
ro
0.3872549019607844
cs
0.3862520458265139
de
0.375
sv
0.37395659432387307
fr
0.3676222596964587
ca
0.3644067796610169
sq
0.36006825938566556
nl
0.3578767123287671
sk
0.35233160621761656
lv
0.33035714285714285
lt
0.3081180811808118
sl
0.24395161290322584
hi
0.16851441241685144
be
0.15730337078651685
tr
0.15730337078651685
es
0.15540540540540537
hu
0.1515837104072398
az
0.14383561643835618
et
0.14383561643835618
is
0.14383561643835618
uk
0.14383561643835618
vi
0.14383561643835618
reliable
true
passes
true
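The 85.71% result matches 6 of the 7 parsed lines passing. A minimal sketch of that arithmetic follows, under two assumptions that this page does not state: `detectedLang` is the highest-scoring language in `scores`, and a line passes when its detected language equals the language expected for the speaker (Italian for Isabella, English for Hunter, as inferred from the system prompt). The names `EXPECTED_LANG`, `detect`, and `evaluate` are hypothetical.

```python
# Assumed expected language per character (inferred from the system prompt).
EXPECTED_LANG = {"Isabella": "it", "Hunter": "en"}


def detect(scores):
    """Assumption: the detected language is the top-scoring one."""
    return max(scores, key=scores.get)


def evaluate(entries):
    """Return per-line pass flags and the fraction of lines that pass."""
    passes = [e["detectedLang"] == EXPECTED_LANG[e["name"]] for e in entries]
    return passes, sum(passes) / len(passes)


# Speaker and detectedLang for entries 0 to 6, copied from the listing above.
entries = [
    {"name": "Isabella", "detectedLang": "it"},
    {"name": "Hunter", "detectedLang": "en"},
    {"name": "Isabella", "detectedLang": "en"},  # "Foto... strange... night..."
    {"name": "Hunter", "detectedLang": "en"},
    {"name": "Hunter", "detectedLang": "en"},
    {"name": "Hunter", "detectedLang": "en"},
    {"name": "Hunter", "detectedLang": "en"},
]

# Top two scores of entry 0, copied from the listing above.
top = detect({"it": 0.8119358074222668, "en": 0.6553308823529411})
passes, score = evaluate(entries)
print(top, passes, f"{score:.2%}")
```

The only failing line is entry 2, where Isabella's mostly English fragment is detected as `en`. That gives 6/7, which rounds to the reported 85.71%.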