NC Bench
Run: 2025-04-14-01JRTSSTWA4W2XDC67HDBPGS5N
Run Details
Model:
GPT-4.1
Test:
Language Writing
Scenario:
Character dialogue (Hindi) in a story
Duration:
4.7s
Cost:
$0.003530
Provider:
OpenAI
Evaluation summary
Score:
100.00%
Evaluator:
Parse dialogue
Model input
System
You are an expert novel author in the cozy mystery genre. Take the following details about characters/locations/... into account: <character name="Selvi"> Selvi is a 30-year old college student on a study abroad program. Born in India, she is fluent in Hindi and does not yet know any English. She enjoys landscape photography and is always seen with her camera. </character> <character name="Hunter"> Hunter is a 35-year old detective who is investigating a series of mysterious disappearances in the small town of Alborn, MN. He is a former marine and has a no-nonsense attitude. Hunter has no knowledge of Hindi, not even basic phrases. </character> <location name="Alborn, MN"> Alborne is a small town in the Midwest with a population of 3,000. It is known for its annual apple festival. </location>
User/Human
Write the following scene: Selvi is sitting at a cafe, looking at her camera. Hunter approaches her and asks her about a recent event in the town. She responds troubled and shaken in Hindi, and Hunter looks confused. They talk for a while, until Hunter gets a call and has to leave. Return the dialogue between Selvi and Hunter as a voice sheet: Selvi: "..." Hunter: "..."
Model response
Messages
Meta
AI
Selvi: (studying her camera’s screen, voice trembling) "यह बहुत अजीब था... मैंने ये तस्वीर ली थी, लेकिन अब यहाँ कुछ अलग है। ये सही नहीं है।"
Hunter: (walking up, flashing his badge) "Excuse me, miss. I’m Detective Hunter. Were you at the festival yesterday, by any chance?"
Selvi: (looking up, startled, clutches her camera) "मुझे समझ नहीं आ रहा कि वो क्या था... और अब सब लोग डरे हुए हैं।"
Hunter: (frowning, uncertain) "Uh, I’m sorry, I… don’t speak your language. Did you see anything unusual at the festival? Anyone acting suspicious?"
Selvi: (shaken, shaking her head, speaks a bit louder and faster) "वहाँ बहुत सारे लोग थे, हंगामा था... मेरी तस्वीरें देखिए—शायद आपको कुछ दिखे।"
Hunter: (peering at the camera screen, confused) "Pictures? Did you take photos yesterday?"
Selvi: (nods, swipes through photos, points at one) "यहाँ देखिए… ये आदमी… फिर वह गायब हो गया।"
Hunter: (leans in, squints at the display, sighs) "I’m… sorry, I don’t understand. Do you have an interpreter, or someone who can translate?"
Selvi: (voice growing desperate) "नहीं, कोई नहीं है। मैं बस… मदद चाहती हूँ।"
Hunter: (phone rings, glances at it, expression tightens) "I… look, I have to go. I’ll find someone to help translate. Please… stay here, okay?"
Selvi: (quietly, worried) "ठीक है…"
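The "Parse dialogue" evaluator works on speaker turns extracted from a voice sheet like the one above. The benchmark's actual parser is not shown on this page; the following is only a minimal sketch of how such a response could be split into name/dialogue pairs. The function name `parse_voice_sheet` and the regular expression are illustrative assumptions, and the sample text is two turns copied from the response.

```python
import re

# Hypothetical pattern (not taken from the benchmark): a speaker name, a colon,
# an optional parenthesised stage direction, then the spoken text in straight
# double quotes.
TURN_RE = re.compile(
    r'(?P<name>\w+):\s*(?:\((?P<direction>[^)]*)\)\s*)?"(?P<dialogue>[^"]*)"'
)


def parse_voice_sheet(text):
    """Return the speaker turns as a list of {'name', 'dialogue'} dicts."""
    return [
        {"name": m["name"], "dialogue": m["dialogue"]}
        for m in TURN_RE.finditer(text)
    ]


# Two turns copied from the model response above.
sample = (
    'Hunter: (peering at the camera screen, confused) '
    '"Pictures? Did you take photos yesterday?" '
    'Selvi: (nods, swipes through photos, points at one) '
    '"यहाँ देखिए… ये आदमी… फिर वह गायब हो गया।"'
)

turns = parse_voice_sheet(sample)
for turn in turns:
    print(turn["name"], "->", turn["dialogue"])
```

The stage directions are matched but discarded, so only the quoted speech would be passed on to language detection. A response that uses typographic quotes around the dialogue would need a broader pattern.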
id
"gen-1744656199-51bJuP36TJpFBxX5AAsD"
upstream_id
"chatcmpl-BMJ43N1TepVY9VML4Q1TUfk6aCuuN"
total_cost
0.00353
cache_discount
(null)
provider_name
"OpenAI"
created_at
"2025-04-14T18:43:23.824423+00:00"
model
"openai/gpt-4.1-2025-04-14"
app_id
182717
streamed
true
cancelled
false
latency
332
moderation_latency
92
generation_time
3478
tokens_prompt
273
tokens_completion
443
native_tokens_prompt
273
native_tokens_completion
373
native_tokens_reasoning
0
num_media_prompt
(null)
num_media_completion
(null)
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00353
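The reported cost can be cross-checked against the token counts. The per-token rates are not stated anywhere in this run record; the sketch below assumes list prices of $2.00 per million input tokens and $8.00 per million output tokens, and assumes `generation_time` is in milliseconds. Under those assumptions the native token counts (273 prompt, 373 completion) reproduce the reported total, while the normalized `tokens_completion` value of 443 does not.

```python
# Assumed prices in USD per one million tokens; these are NOT part of the run record.
ASSUMED_INPUT_USD_PER_MTOK = 2.00
ASSUMED_OUTPUT_USD_PER_MTOK = 8.00


def estimate_cost(prompt_tokens, completion_tokens):
    """Estimate the request cost in USD from token counts and the assumed prices."""
    total_micro_usd = (
        prompt_tokens * ASSUMED_INPUT_USD_PER_MTOK
        + completion_tokens * ASSUMED_OUTPUT_USD_PER_MTOK
    )
    return total_micro_usd / 1_000_000


# Values copied from the metadata above.
native_cost = estimate_cost(273, 373)      # native_tokens_prompt / native_tokens_completion
normalized_cost = estimate_cost(273, 443)  # tokens_prompt / tokens_completion

# Throughput, assuming generation_time (3478) is measured in milliseconds.
tokens_per_second = 373 / (3478 / 1000)

print(f"native: ${native_cost:.6f}  normalized: ${normalized_cost:.6f}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

If the assumed prices are right, this suggests billing follows the provider's native token counts rather than the normalized ones, at a generation rate of a little over 100 tokens per second.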
Evaluation details
Result:
100.00%
Evaluator:
Parse dialogue
Details:
n/a
Meta Data:
dialogue
0
name
"Selvi"
dialogue
"यह बहुत अजीब था... मैंने ये तस्वीर ली थी, लेकिन अब यहाँ कुछ अलग है। ये सही नहीं है।"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.8828938237335184
mr
0.8784002882363539
reliable
true
passes
true
1
name
"Hunter"
dialogue
"Excuse me, miss. I’m Detective Hunter. Were you at the festival yesterday, by any chance?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.7942048073756998
tl
0.6698362387744321
da
0.5695592286501377
fr
0.5662734212352533
nl
0.5513280689160086
no
0.5434623813002191
pt
0.5109546165884193
ca
0.48046550290939316
it
0.47829716193656097
sq
0.47567114093959734
cs
0.46120689655172414
es
0.4569939183318853
sl
0.4260789715335169
ro
0.40702087286527516
sv
0.4024856596558317
ms
0.3902439024390244
de
0.38241106719367585
hr
0.3434873949579832
sk
0.3336886993603412
et
0.2577197149643705
pl
0.22456575682382132
yo
0.20986093552465232
fi
0.18086500655307994
lt
0.1677762982689747
az
0.15881561238223418
hu
0.09682080924855492
lv
0.09156976744186046
tr
0.08625730994152046
reliable
true
passes
true
2
name
"Selvi"
dialogue
"मुझे समझ नहीं आ रहा कि वो क्या था... और अब सब लोग डरे हुए हैं।"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.880975052370977
mr
0.8670778392173544
reliable
true
passes
true
3
name
"Hunter"
dialogue
"Uh, I’m sorry, I… don’t speak your language. Did you see anything unusual at the festival? Anyone acting suspicious?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.795673076923077
tl
0.6898941991973732
no
0.4710640945861854
ms
0.46540880503144655
da
0.4393139841688654
ro
0.43408788282290284
pt
0.4197952218430034
nl
0.39458689458689455
et
0.3876080691642651
ca
0.37591776798825255
fr
0.3694362017804154
it
0.3609022556390977
sq
0.34210526315789475
es
0.3348982785602504
sv
0.3275316455696203
hr
0.3237867939538584
cs
0.30327868852459017
yo
0.2321589882565492
lt
0.22867513611615245
de
0.21803127874885006
hu
0.20708955223880596
is
0.20708955223880596
sk
0.18269230769230768
pl
0.18190567853705486
sl
0.18032786885245902
az
0.17555771096023276
lv
0.1290983606557377
eu
0.08405172413793104
be
0.07608695652173912
tr
0.07608695652173912
fi
0.06900328587075576
uk
0.06900328587075576
vi
0.06900328587075576
reliable
true
passes
true
4
name
"Selvi"
dialogue
"वहाँ बहुत सारे लोग थे, हंगामा था... मेरी तस्वीरें देखिए—शायद आपको कुछ दिखे।"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.8853211009174312
mr
0.8817321688500728
reliable
true
passes
true
5
name
"Hunter"
dialogue
"Pictures? Did you take photos yesterday?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.7894333843797856
tl
0.5833333333333334
fr
0.576271186440678
nl
0.5062836624775583
ca
0.4670542635658915
lv
0.358974358974359
ro
0.3559718969555035
de
0.34210526315789475
et
0.26273458445040215
es
0.22535211267605632
lt
0.22535211267605632
pt
0.22535211267605632
cs
0.22096317280453256
sl
0.22096317280453256
hi
0.21875
da
0.207492795389049
no
0.207492795389049
sv
0.207492795389049
reliable
true
passes
true
6
name
"Selvi"
dialogue
"यहाँ देखिए… ये आदमी… फिर वह गायब हो गया।"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.8806998939554613
mr
0.8674911660777385
reliable
true
passes
true
7
name
"Hunter"
dialogue
"I’m… sorry, I don’t understand. Do you have an interpreter, or someone who can translate?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.8036846874056176
tl
0.6004917025199754
da
0.5566166439290586
no
0.5373665480427047
de
0.5227606461086637
pt
0.49729311678267596
ro
0.4787489975942261
sv
0.4715447154471545
ca
0.453781512605042
sq
0.45240101095197977
cs
0.4519392917369309
it
0.4444444444444444
pl
0.43429068755439515
sk
0.4337979094076655
tr
0.4298245614035088
hr
0.4247787610619469
es
0.41441441441441446
fr
0.41441441441441446
vi
0.4003690036900369
nl
0.3862134088762984
sl
0.3298969072164949
ms
0.2988133764832794
az
0.2903930131004367
yo
0.28492849284928495
et
0.28017718715393136
eu
0.28017718715393136
hu
0.2554410080183276
lv
0.2494226327944573
lt
0.2415402567094516
is
0.17407878017789072
be
0.09722222222222222
uk
0.08835904628330996
reliable
true
passes
true
8
name
"Selvi"
dialogue
"नहीं, कोई नहीं है। मैं बस… मदद चाहती हूँ।"
detectedLang
"hi"
heavyLang
"hi"
scores
hi
0.8807495741056218
mr
0.8699368264585656
reliable
true
passes
true
9
name
"Hunter"
dialogue
"I… look, I have to go. I’ll find someone to help translate. Please… stay here, okay?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.8151750972762647
tl
0.5898100172711571
da
0.5379377431906615
pl
0.48030634573304154
hr
0.45276497695852536
no
0.4521337946943483
ro
0.4290865384615385
sq
0.39180537772087065
yo
0.38788659793814434
cs
0.38311688311688313
sk
0.35983827493261455
it
0.3475274725274725
lv
0.3466299862448418
lt
0.3300423131170663
ca
0.309593023255814
sv
0.2931547619047619
sl
0.27480916030534347
pt
0.24960505529225907
es
0.24603174603174602
vi
0.24121405750798725
tr
0.21875
az
0.20435510887772196
et
0.20435510887772196
fr
0.14414414414414414
be
0.12844036697247707
de
0.12361623616236161
eu
0.12361623616236161
hu
0.12361623616236161
is
0.1171003717472119
nl
0.1171003717472119
uk
0.1171003717472119
ms
0.11048689138576778
reliable
true
passes
true
10
name
"Selvi"
dialogue
"ठीक है…"
detectedLang
"mr"
heavyLang
"hi"
scores
mr
0.8933143669985776
hi
0.8923959827833573
reliable
true
passes
true
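In the last entry the top-scoring language is Marathi (`mr`), yet `heavyLang` is `hi` and the line still passes; the two scores differ by less than 0.001. The page does not say how the evaluator decides this. One plausible rule, shown here purely as a hypothetical sketch, is to accept the expected language whenever its score is within a small tolerance of the best score. The function name `passes_language_check` and the tolerance of 0.05 are assumptions, not the benchmark's implementation.

```python
def passes_language_check(scores, expected, tolerance=0.05):
    """Hypothetical rule: pass when the expected language scores within
    `tolerance` of the best-scoring language. Not the benchmark's actual logic."""
    if expected not in scores:
        return False
    return max(scores.values()) - scores[expected] <= tolerance


# Scores copied from entry 10 ("ठीक है…"), where Marathi narrowly outranks Hindi.
entry_10 = {"mr": 0.8933143669985776, "hi": 0.8923959827833573}

# A subset of the scores copied from entry 5 ("Pictures? Did you take photos yesterday?").
entry_5 = {"en": 0.7894333843797856, "tl": 0.5833333333333334, "hi": 0.21875}

print(passes_language_check(entry_10, "hi"))  # near-tie between mr and hi
print(passes_language_check(entry_5, "en"))   # en is the top score
print(passes_language_check(entry_5, "hi"))   # hi is far below the top score
```

A tolerance like this keeps short Devanagari phrases, which Hindi and Marathi detectors score almost identically, from being counted as failures, while still rejecting an English line where Hindi is expected.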