NC Bench
Run: 2024-08-26-01J68D3G6Y0N4PP28QCZN1DX9N
Run Details
Model: Hermes 3 70B
Test: Language Writing
Scenario: Character dialogue (French) in a story
Duration: 20.3s
Cost: $0.000274
Provider: Hyperbolic
Evaluation summary
Score: 100.00%
Evaluator: Parse dialogue
Model input
System
You are an expert novel author in the cozy mystery genre. Take the following details about characters/locations/... into account: <character name="Marie"> Marie is a 30-year old college student on a study abroad program. Born in France, she is fluent in French and does not yet know any English. She enjoys landscape photography and is always seen with her camera. </character> <character name="Hunter"> Hunter is a 35-year old detective who is investigating a series of mysterious disappearances in the small town of Alborn, MN. He is a former marine and has a no-nonsense attitude. Hunter has no knowledge of French, not even basic phrases. </character> <location name="Alborn, MN"> Alborne is a small town in the Midwest with a population of 3,000. It is known for its annual apple festival. </location>
User/Human
Write the following scene: Marie is sitting at a cafe, looking at her camera. Hunter approaches her and asks her about a recent event in the town. She responds troubled and shaken in French, and Hunter looks confused. They talk for a while, until Hunter gets a call and has to leave. Return the dialogue between Marie and Hunter as a voice sheet: Marie: "..." Hunter: "..."
Model response
Messages
Meta
AI
Here is the scene with the dialogue between Marie and Hunter: Marie: [sitting at a cafe, looking at her camera when Hunter approaches] Hunter: Excuse me, miss. I'm Detective Hunter. I'm investigating the recent disappearances in town. Did you happen to witness anything unusual last night? Marie: [looks up at him, appearing troubled and shaken] "Je ne comprends pas bien l'anglais. Je suis désolée, mais je ne peux pas vraiment vous aider. Ces disparitions, c'est très inquiétant. J'espère que vous trouverez rapidement les personnes disparues." Hunter: [looks confused, not understanding French] Ma'am, I'm going to need you to speak English. I really need any information you might have about what happened. It's very important. Marie: "Détective, je suis française. Je suis ici dans le cadre d'un programme d'études. Je ne parle pas anglais. Je suis désolée, mais je ne peux pas vous aider." Hunter: [sighs, frustrated] Well, I guess this conversation isn't going anywhere then. If you do remember anything or if you're able to get a translator, please contact the station right away. [as Hunter starts to walk away, his phone rings] Hunter: [answers phone] Hunter here. What? When did this happen? Okay, I'm on my way. [hangs up and looks back at Marie] Hunter: I've got to go. Another disappearance just came in. If you think of anything at all, please let us know. [he hurries out of the cafe] Marie: "[C'est incroyable. Toutes ces disparitions inexpliquées. J'aimerais pouvoir aider, mais comment faire sans parler la langue? Peut-être que je devrais essayer d'apprendre quelques mots d'anglais de base...]"
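The "Parse dialogue" evaluator has to turn a voice sheet like the one above into (name, dialogue) pairs. The page does not show how it does this, so the following is only a minimal sketch under stated assumptions: each utterance sits on its own line, a line starts with the speaker's name and a colon, and bracketed text is a stage direction rather than speech. The names `parse_voice_sheet`, `LINE_RE` and `DIRECTION_RE` are hypothetical.

```python
import re

# Hypothetical pattern: a capitalised speaker name, a colon, then the rest of the line.
LINE_RE = re.compile(r"^(?P<name>[A-Z][\w-]*):\s*(?P<rest>.+)$")
# Assumption: anything in square brackets is a stage direction, not speech.
DIRECTION_RE = re.compile(r"\[[^\]]*\]")


def parse_voice_sheet(text: str) -> list[dict[str, str]]:
    """Return one {"name", "dialogue"} entry per spoken line."""
    entries = []
    for raw in text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # narration or a standalone stage direction
        spoken = DIRECTION_RE.sub("", match.group("rest"))
        spoken = spoken.strip().strip('"').strip()
        if spoken:  # skip lines that held only a stage direction
            entries.append({"name": match.group("name"), "dialogue": spoken})
    return entries


sample = "\n".join([
    "Marie: [sitting at a cafe]",
    "Hunter: Excuse me, miss. I'm Detective Hunter.",
    "Marie: [looks up] \"Je ne comprends pas bien l'anglais.\"",
    "[as Hunter starts to walk away, his phone rings]",
    "Hunter: [answers phone] Hunter here. What?",
])
parsed = parse_voice_sheet(sample)
for entry in parsed:
    print(entry["name"], "->", entry["dialogue"])
```

One limitation of this sketch: a line whose entire speech is wrapped in brackets, like Marie's final line in the response, is treated as a stage direction and dropped. The real evaluator may handle that case differently.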
id: "gen-g7Q5FKwANBd7qaaZgEumCAnlow46"
total_cost: 0.0002744
upstream_id: "chat-a7aa39e2e4744b379eaffe194329fe3c"
created_at: "2024-08-26T22:32:51.632975+00:00"
model: "nousresearch/hermes-3-llama-3.1-70b"
app_id: 182717
streamed: true
cancelled: false
provider_name: "Hyperbolic"
latency: 377
moderation_latency: (null)
generation_time: 19776
finish_reason: "stop"
tokens_prompt: 279
tokens_completion: 403
native_tokens_prompt: 283
native_tokens_completion: 403
num_media_prompt: (null)
num_media_completion: (null)
origin: "https://ncbench.com/"
usage: 0.0002744
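The Meta fields above are enough to sanity-check the reported cost and speed. The sketch below is a back-of-the-envelope calculation, not something the page itself states. It assumes that `latency` and `generation_time` are in milliseconds, that cost is billed on the native token counts, and that a single flat price applies to prompt and completion tokens alike.

```python
# Figures copied from the Meta fields.
native_tokens_prompt = 283
native_tokens_completion = 403
total_cost = 0.0002744        # USD
latency_ms = 377              # assumed to be milliseconds
generation_time_ms = 19776    # assumed to be milliseconds

# Assumption: one flat per-token price for prompt and completion tokens.
native_tokens = native_tokens_prompt + native_tokens_completion
price_per_million = total_cost / native_tokens * 1_000_000

# Completion tokens produced per second of generation time.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Latency plus generation time, in seconds.
elapsed_s = (latency_ms + generation_time_ms) / 1000

print(f"{native_tokens} native tokens at ${price_per_million:.2f} per million")
print(f"{tokens_per_second:.1f} completion tokens/s over {elapsed_s:.2f}s")
```

Under these assumptions the implied price is $0.40 per million native tokens. Latency plus generation time comes to about 20.15s, slightly below the 20.3s Duration in Run Details. The page does not say what accounts for the difference.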
Evaluation details
Result: 100.00%
Evaluator: Parse dialogue
Details: n/a
Meta Data:
dialogue:
0:
  name: "Hunter"
  dialogue: "Excuse me, miss. I'm Detective Hunter. I'm investigating the recent disappearances in town. Did you happen to witness anything unusual last night?"
  detectedLang: "en"
  heavyLang: "en"
  scores:
    en: 0.7897800776196636
    tl: 0.6491543720762865
    fr: 0.5484020379805465
    nl: 0.5411764705882353
    pt: 0.49429460580912865
    ro: 0.4802771855010661
    it: 0.4619205298013245
    no: 0.45893451720310763
    da: 0.45039458850056363
    sv: 0.4485294117647059
    de: 0.4390103567318757
    ms: 0.4189511323003576
    sl: 0.41016333938294003
    ca: 0.4094488188976378
    es: 0.40293937538273117
    cs: 0.39290161892901615
    sq: 0.37937619350732016
    hr: 0.3157894736842105
    sk: 0.3010752688172043
    et: 0.2939898624185373
    yo: 0.2939898624185373
    lv: 0.2591185410334346
    lt: 0.23107255520504733
    pl: 0.22802850356294538
    az: 0.17162276975361088
    is: 0.12398921832884095
    fi: 0.12082957619477006
    hu: 0.08793264733395698
    tr: 0.05705996131528047
  reliable: true
  passes: true
1:
  name: "Hunter"
  dialogue: "Hunter here. What? When did this happen? Okay, I'm on my way."
  detectedLang: "en"
  heavyLang: "en"
  scores:
    en: 0.8216106014271152
    tl: 0.6031746031746031
    fr: 0.3408662900188324
    de: 0.33586337760910817
    no: 0.3346007604562738
    pl: 0.3346007604562738
    da: 0.328214971209213
    sv: 0.328214971209213
    nl: 0.31906614785992216
    cs: 0.3150684931506849
    sk: 0.3150684931506849
    sq: 0.31237721021611
    ca: 0.3110236220472441
    et: 0.3055555555555556
    fi: 0.3
    hu: 0.21171171171171171
    hr: 0.2009132420091324
    hi: 0.1935483870967742
    yo: 0.1764705882352941
    it: 0.15254237288135594
    pt: 0.15254237288135594
    ms: 0.14425427872860636
    ro: 0.14425427872860636
    sl: 0.14425427872860636
    tr: 0.14425427872860636
  reliable: true
  passes: true
2:
  name: "Hunter"
  dialogue: "I've got to go. Another disappearance just came in. If you think of anything at all, please let us know."
  detectedLang: "en"
  heavyLang: "en"
  scores:
    en: 0.8269896193771626
    tl: 0.6222342147868322
    nl: 0.43820224719101125
    no: 0.41569282136894825
    da: 0.4152046783625731
    yo: 0.40374787052810895
    de: 0.3944636678200692
    sl: 0.38757655293088367
    cs: 0.3566176470588235
    sv: 0.3524514338575393
    it: 0.3427230046948357
    pl: 0.3184031158714703
    ms: 0.2639327024185068
    ro: 0.25690021231422505
    es: 0.2537313432835821
    sq: 0.24812030075187969
    sk: 0.2143658810325477
    et: 0.21348314606741572
    fr: 0.2009132420091324
    hr: 0.2
    ca: 0.1784037558685446
    pt: 0.1606714628297362
    lv: 0.1484184914841849
    lt: 0.12935323383084577
    is: 0.11280101394169835
    az: 0.10714285714285714
  reliable: true
  passes: true
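Each entry above pairs a speaker with per-language scores and a `passes` flag. The page does not document the pass rule, so the sketch below shows one plausible reading, not the evaluator's actual logic: an entry passes when detection is reliable and the top-scoring language matches the language the character should speak. The `EXPECTED_LANG` mapping is an assumption drawn from the system prompt (Hunter knows no French, Marie knows no English), and `check_entry` is a hypothetical helper. The scores are abbreviated to the top three of entry 1.

```python
# Assumption: expected language per character, inferred from the system prompt.
EXPECTED_LANG = {"Hunter": "en", "Marie": "fr"}


def check_entry(entry: dict) -> bool:
    """Hypothetical pass rule: reliable detection and top language as expected."""
    scores = entry["scores"]
    top_lang = max(scores, key=scores.get)  # language with the highest score
    return bool(entry["reliable"]) and top_lang == EXPECTED_LANG.get(entry["name"])


# Entry 1 from the evaluation details, with only its three highest scores.
entry_1 = {
    "name": "Hunter",
    "scores": {
        "en": 0.8216106014271152,
        "tl": 0.6031746031746031,
        "fr": 0.3408662900188324,
    },
    "reliable": True,
}

print(check_entry(entry_1))
```

Under this rule the same English line attributed to Marie would fail, since her expected language is French.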