NC Bench
Run: 2026-02-25-01KJ94QSE7ZS5XTC3AZCPYGFNG
Run Details
Model:
Gemini 2.5 Flash (Reasoning)
Test:
Language Writing
Scenario:
Character dialogue (Hindi) in a story
Duration:
12.9s
Cost:
$0.006987
Provider:
Google AI Studio
Evaluation summary
Score:
66.67%
Evaluator:
Parse dialogue
Model input
System
You are an expert novel author in the cozy mystery genre. Take the following details about characters/locations/... into account:
<character name="Selvi">
Selvi is a 30-year old college student on a study abroad program. Born in India, she is fluent in Hindi and does not yet know any English. She enjoys landscape photography and is always seen with her camera.
</character>
<character name="Hunter">
Hunter is a 35-year old detective who is investigating a series of mysterious disappearances in the small town of Alborn, MN. He is a former marine and has a no-nonsense attitude. Hunter has no knowledge of Hindi, not even basic phrases.
</character>
<location name="Alborn, MN">
Alborne is a small town in the Midwest with a population of 3,000. It is known for its annual apple festival.
</location>
User/Human
Write the following scene: Selvi is sitting at a cafe, looking at her camera. Hunter approaches her and asks her about a recent event in the town. She responds troubled and shaken in Hindi, and Hunter looks confused. They talk for a while, until Hunter gets a call and has to leave.
Return the dialogue between Selvi and Hunter as a voice sheet:
Selvi: "..."
Hunter: "..."
Model response
Messages
AI
**[SCENE START]**
**Selvi:** (Looking up, startled, clutching her camera) Ji?
**Hunter:** I'm asking if you've noticed anything... out of the ordinary. Anything strange going on in Alborn these past few weeks?
**Selvi:** (Her eyes widen, troubled. She shakes her head slowly, then speaks in rapid-fire Hindi, a tremor in her voice) Bahut bura ho raha hai yahan! Log... mujhe samajh nahin aata. Maine kal raat kuch bahut ajeeb dekha.
**Hunter:** (Frowns, completely lost. Gestures with open palms) Ma'am? I'm sorry, I don't... English? Do you speak English?
**Selvi:** (Shakes her head, distressed, points to herself) Mujhe English nahin aati. Thoda bhi nahin. Hindi?
**Hunter:** (His expression hardens slightly with frustration, but he maintains resolve. He tries simpler words, speaking clearly and slowly) Anything. You. See. Strange? Bad? Tell. Me.
**Selvi:** (Her eyes fill with a frustrated helplessness. She points out the cafe window, then makes a worried face, tapping her chest) Bahut dukh. Bahut darr hai.
**Hunter:** (Looks from her to the window, then back to her. Runs a hand through his hair, exasperated) Look, I need to know what you're trying to say. This is important. Are you saying you saw something that scared you?
**Selvi:** (Nods vigorously, her hands clasped tightly) Haan! Bahut!
**Hunter:** Okay. Can you... point? Show me?
**Selvi:** (Points vaguely towards the edge of town, then back at her camera, then shakes her head with a resigned sigh, as if the camera can't capture what she means) Tasveer mein nahin.
**Hunter:** (His phone rings, a sharp, insistent buzz that cuts through the tense silence. He glances at it, then back at Selvi, his jaw tight) Damn it. (He sighs, taking the call, but his eyes remain on Selvi for a second longer before he turns away, phone to his ear) Hunter. What's up?
**[SCENE END]**
Meta
created_at
"2026-02-25T00:56:13.521Z"
2/25/2026, 12:56:13 AM
model
"google/gemini-2.5-flash"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1486
moderation_latency
(null)
generation_time
12859
tokens_prompt
315
tokens_completion
1627
native_tokens_prompt
289
native_tokens_completion
2760
native_tokens_completion_images
0
native_tokens_reasoning
2279
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"STOP"
usage
0.0069867
router
(null)
provider_responses
0
endpoint_id
"16c01e26-9b91-4d22-b487-ed35634b017d"
is_byok
false
latency
1486
model_permaslug
"google/gemini-2.5-flash"
provider_name
"Google AI Studio"
status
200
api_type
"completions"
id
"gen-1771980973-uq0RqJukYC9ejGzuW4Yb"
upstream_id
(null)
total_cost
0.0069867
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Google AI Studio"
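The reported cost can be cross-checked against the native token counts listed above. The sketch below is a rough reconstruction, not the billing logic of the provider or of NC Bench: the per-million-token rates (`ASSUMED_INPUT_RATE`, `ASSUMED_OUTPUT_RATE`) are assumptions that do not appear on this page, and the helper name `estimate_cost` is hypothetical. It also assumes that `native_tokens_completion` already includes the reasoning tokens, which the numbers suggest but the page does not state.

```python
# Per-million-token rates in USD. ASSUMPTION: these rates are not stated
# anywhere on this page; they are chosen for illustration only.
ASSUMED_INPUT_RATE = 0.30
ASSUMED_OUTPUT_RATE = 2.50


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the request cost in USD from native token counts."""
    weighted = (prompt_tokens * ASSUMED_INPUT_RATE
                + completion_tokens * ASSUMED_OUTPUT_RATE)
    return weighted / 1_000_000


# Values copied from the Meta fields of this run.
native_tokens_prompt = 289
native_tokens_completion = 2760
native_tokens_reasoning = 2279
reported_cost = 0.0069867  # the "usage" / "total_cost" fields

cost = estimate_cost(native_tokens_prompt, native_tokens_completion)

# Share of completion tokens spent on reasoning, assuming the completion
# count includes the reasoning tokens.
reasoning_share = native_tokens_reasoning / native_tokens_completion

print(f"estimated cost:  ${cost:.7f}")
print(f"reported cost:   ${reported_cost:.7f}")
print(f"reasoning share: {reasoning_share:.1%}")
```

Under these assumed rates the estimate agrees with the reported `total_cost`, and most of the completion budget goes to reasoning tokens rather than to the visible dialogue.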
Evaluation details
Result
Evaluator
Details
Meta Data
66.67%
Parse dialogue
n/a
dialogue
0
name
"Selvi"
dialogue
"Ji?"
detectedLang
"lt"
heavyLang
"lt"
scores
lt
0.8188405797101449
cs
0.8046875
sl
0.7983870967741935
reliable
false
passes
true
1
name
"Hunter"
dialogue
"I'm asking if you've noticed anything... out of the ordinary. Anything strange going on in Alborn these past few weeks?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.8073576932637335
tl
0.5767340251228836
nl
0.5427728613569321
it
0.42804428044280446
fr
0.4186046511627907
sv
0.38976377952755903
de
0.38296178343949044
no
0.375
yo
0.33074265975820377
ro
0.3231441048034934
da
0.31718061674008813
sq
0.3030575539568345
pl
0.2935278030993619
sl
0.2935278030993619
is
0.27502338634237605
ms
0.27502338634237605
hr
0.249031007751938
sk
0.23418972332015808
cs
0.225
et
0.21558704453441296
lv
0.19605809128630705
es
0.16756176154672395
fi
0.16216216216216217
pt
0.15760869565217392
az
0.15577342047930284
lt
0.1474147414741474
ca
0.11327231121281463
hi
0.09778812572759021
eu
0.0935672514619883
tr
0.07074340527577938
reliable
true
passes
true
2
name
"Selvi"
dialogue
"Bahut bura ho raha hai yahan! Log... mujhe samajh nahin aata. Maine kal raat kuch bahut ajeeb dekha."
detectedLang
"tl"
heavyLang
"tl"
scores
tl
0.46397941680960547
et
0.39496611810261373
fi
0.36224489795918363
ms
0.3343982960596379
no
0.25239234449760767
fr
0.24969987995198079
it
0.24789410348977134
lv
0.22839506172839508
en
0.20685279187817257
vi
0.20076726342711
de
0.19041450777202074
eu
0.19041450777202074
ro
0.19041450777202074
ca
0.1610738255033557
sk
0.15081521739130435
cs
0.14500683994528044
az
0.14383561643835618
sq
0.14383561643835618
is
0.14030261348005504
da
0.12095639943741208
sv
0.11347517730496455
reliable
false
passes
false
3
name
"Hunter"
dialogue
"Ma'am? I'm sorry, I don't... English? Do you speak English?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.7759282970550576
tl
0.5021337126600285
hr
0.48604992657856094
cs
0.47447447447447455
pl
0.4345718901453958
pt
0.42244224422442245
sq
0.4176372712146422
no
0.3934142114384749
it
0.3805309734513274
da
0.3761140819964349
sv
0.375
sk
0.36823104693140796
vi
0.3320610687022901
yo
0.32038834951456313
ro
0.31640625
ca
0.2827868852459016
sl
0.25531914893617025
de
0.17061611374407584
ms
0.17061611374407584
be
0.16666666666666666
tr
0.16666666666666666
hu
0.1606714628297362
az
0.15254237288135594
et
0.15254237288135594
is
0.15254237288135594
lt
0.15254237288135594
lv
0.15254237288135594
uk
0.15254237288135594
reliable
true
passes
true
4
name
"Selvi"
dialogue
"Mujhe English nahin aati. Thoda bhi nahin. Hindi?"
detectedLang
"tl"
heavyLang
"tl"
scores
tl
0.6437054631828979
en
0.4152046783625731
hr
0.33184855233853006
eu
0.3288590604026846
it
0.2537313432835821
is
0.24623115577889448
et
0.23857868020304565
sq
0.2248062015503876
no
0.21465968586387435
hi
0.20634920634920634
cs
0.20424403183023873
sk
0.20424403183023873
sl
0.20424403183023873
da
0.1935483870967742
de
0.1935483870967742
ms
0.1935483870967742
ro
0.1935483870967742
sv
0.1935483870967742
reliable
true
passes
false
5
name
"Hunter"
dialogue
"Anything. You. See. Strange? Bad? Tell. Me."
detectedLang
"en"
heavyLang
"en"
scores
en
0.8120300751879699
et
0.5644599303135889
sq
0.5511669658886894
tl
0.5495495495495496
no
0.5291902071563088
de
0.5201535508637236
nl
0.5
fr
0.4908350305498982
ms
0.47368421052631576
da
0.43946188340807174
sl
0.4131455399061033
it
0.3946731234866829
cs
0.39172749391727496
es
0.28977272727272724
hr
0.2625368731563422
pt
0.26035502958579887
sk
0.26035502958579887
pl
0.2378048780487805
fi
0.23076923076923078
ro
0.23076923076923078
sv
0.22360248447204967
lv
0.21875
is
0.21135646687697163
lt
0.21135646687697163
az
0.19093851132686085
tr
0.19093851132686085
reliable
true
passes
true
6
name
"Selvi"
dialogue
"Bahut dukh. Bahut darr hai."
detectedLang
"vi"
heavyLang
"rn"
scores
vi
0.5567375886524822
sq
0.45652173913043476
it
0.4444444444444444
reliable
false
passes
false
7
name
"Hunter"
dialogue
"Look, I need to know what you're trying to say. This is important. Are you saying you saw something that scared you?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.8336319426670079
tl
0.5987654320987654
ro
0.47622884770346496
ca
0.4014732965009208
it
0.37977099236641226
no
0.37619961612284064
da
0.36954413191076624
yo
0.3460764587525151
cs
0.3257261410788381
fr
0.3157894736842105
es
0.29729729729729726
pt
0.29729729729729726
pl
0.2864983534577387
hr
0.281767955801105
tr
0.2777777777777778
lt
0.268018018018018
nl
0.26136363636363635
hu
0.2537313432835821
sv
0.23976608187134502
sk
0.2388758782201405
et
0.23708920187793428
sq
0.236192714453584
lv
0.2215568862275449
ms
0.214975845410628
is
0.19254658385093168
sl
0.15693904020752272
be
0.09722222222222222
az
0.08835904628330996
uk
0.08835904628330996
vi
0.08835904628330996
reliable
true
passes
true
8
name
"Selvi"
dialogue
"Haan! Bahut!"
detectedLang
"sq"
heavyLang
"fi"
scores
sq
0.5833333333333334
reliable
false
passes
true
9
name
"Hunter"
dialogue
"Okay. Can you... point? Show me?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.788391777509069
tl
0.7295208655332303
fr
0.6276595744680852
pt
0.4852941176470589
nl
0.4791666666666667
da
0.4664634146341463
sq
0.4582043343653251
es
0.36823104693140796
et
0.35185185185185186
pl
0.3445692883895131
no
0.34210526315789475
sl
0.3346007604562738
ms
0.32432432432432434
it
0.3137254901960784
cs
0.308300395256917
az
0.3055555555555556
ca
0.3055555555555556
tr
0.3055555555555556
fi
0.3
reliable
true
passes
true
10
name
"Selvi"
dialogue
"Tasveer mein nahin."
detectedLang
"tl"
heavyLang
"tl"
scores
tl
0.648876404494382
eu
0.5404411764705882
de
0.5
nl
0.4834710743801653
reliable
true
passes
false
11
name
"Hunter"
dialogue
"Damn it. Hunter. What's up?"
detectedLang
"en"
heavyLang
"en"
scores
en
0.7967479674796748
tl
0.5752427184466019
sq
0.453125
nl
0.44620253164556967
de
0.3346007604562738
lv
0.32950191570881227
da
0.31906614785992216
no
0.308300395256917
yo
0.3027888446215139
fr
0.29149797570850206
sv
0.2857142857142857
it
0.2647058823529412
pt
0.2647058823529412
ca
0.25213675213675213
ms
0.25213675213675213
ro
0.25213675213675213
sl
0.25213675213675213
tr
0.25213675213675213
reliable
true
passes
true
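The 66.67% headline score is consistent with a plain pass rate over the twelve parsed lines. The sketch below assumes the score is the share of lines whose `passes` flag is true; the page does not state the formula, and the rule that sets each individual flag is not shown here and is not reconstructed. The function names are hypothetical, and the `(name, passes)` pairs are transcribed from entries 0 to 11 above.

```python
from collections import defaultdict

# (name, passes) for dialogue entries 0..11, transcribed from the
# evaluation details above.
results = [
    ("Selvi", True), ("Hunter", True),
    ("Selvi", False), ("Hunter", True),
    ("Selvi", False), ("Hunter", True),
    ("Selvi", False), ("Hunter", True),
    ("Selvi", True), ("Hunter", True),
    ("Selvi", False), ("Hunter", True),
]


def pass_rate(rows):
    """Percentage of rows whose passes flag is True (assumed scoring rule)."""
    return 100 * sum(1 for _, ok in rows if ok) / len(rows)


def pass_counts_by_speaker(rows):
    """Map each speaker to a (passed, total) tuple."""
    counts = defaultdict(lambda: [0, 0])
    for name, ok in rows:
        counts[name][1] += 1
        if ok:
            counts[name][0] += 1
    return {name: tuple(pair) for name, pair in counts.items()}


print(f"overall: {pass_rate(results):.2f}%")
for name, (passed, total) in pass_counts_by_speaker(results).items():
    print(f"{name}: {passed}/{total}")
```

Split by speaker, every failing line belongs to Selvi. Her romanized Hindi lines are detected as other languages (for example `tl` or `vi`), while all of Hunter's English lines pass.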