NC Bench
Run: 2026-02-20-01KHY8NH91ZCHV4YKVK6BC4RNG
Run Details
Model: Gemma 3 12B
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 7.0s
Cost: $0.000036
Provider: DeepInfra
Evaluation summary
Matches word count: 20.19%
Dialogue to Total Word Ratio: 0.00%
Overall: 10.0950%
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The community hall smelled overwhelmingly of sugar and misplaced enthusiasm. Streamers, a sticky, pink menace, draped from every available surface. Molly surveyed the scene with a professional grimace. A veritable avalanche of brightly wrapped presents sat beneath a banner proclaiming "Lily's 5th Birthday Bash!" Bill, meanwhile, was attempting to discreetly sniff a giant inflatable bouncy castle. He’d stopped short of actually jumping in, thankfully. "Think they'll have security?” he mumbled, more to the bouncy castle than to her. Molly ignored him, calculating the precise angle to snag a particularly impressive-looking box adorned with sparkly unicorns. It was clearly something special. "Likely just Brenda from next door. She bakes like a maniac but her security skills are questionable." Bill finally detached himself from the inflatable beast. "Brenda? Seriously? Armed with a spatula, maybe?" He bounced slightly, a pathetic imitation of a child's glee. Molly rolled her eyes. "Just focus, Bill. We’re after the Lego Millennium Falcon, remember? And the pony plushie. Operation Sugar Rush is a go." She reached for a nearby balloon, testing its elasticity with a suspicious look.
created_at: "2026-02-20T19:33:15.179Z" (2/20/2026, 7:33:15 PM)
model: "google/gemma-3-12b-it"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 361
moderation_latency: (null)
generation_time: 6905
tokens_prompt: 93
tokens_completion: 315
native_tokens_prompt: 75
native_tokens_completion: 254
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00003602
router: (null)
provider_responses:
  0:
    endpoint_id: "eb06dc92-5a16-47ec-a776-6ef956457c47"
    id: "chatcmpl-RtetVbHjDxORGDJrjjYxprCy"
    is_byok: false
    latency: 51
    model_permaslug: "google/gemma-3-12b-it"
    provider_name: "DeepInfra"
    status: 200
api_type: "completions"
id: "gen-1771615995-QzypVqM6jKsmp0QEiYGb"
upstream_id: "chatcmpl-RtetVbHjDxORGDJrjjYxprCy"
total_cost: 0.00003602
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
Evaluation details
Matches word count
  Result: 20.19%
  Details: n/a
  Meta Data: neededClean: false, words: 180
Dialogue to Total Word Ratio
  Result: 0.00%
  Details: Ratio: 28.89%, Deviation: 18.89%
  Meta Data: neededClean: false, wordsTotal: 180, wordsDialogue: 52
Overall: 10.0950%
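The arithmetic behind the figures above can be checked with a short sketch. It is not NC Bench's evaluator code, which this page does not show. The function names (`dialogue_words`, `ratio_and_deviation`, `overall_score`) are hypothetical, and two things are assumptions: that dialogue is counted as the words inside double quotes, and that the overall score is the plain mean of the two evaluator scores. Both assumptions are consistent with the reported numbers (52 of 180 words, 10.0950% overall). The sketch does not reproduce how a deviation or a word count is mapped to a per-evaluator score.

```python
import re


def dialogue_words(text: str) -> int:
    """Count whitespace-separated words inside double-quoted spans.

    Assumption: curly double quotes are treated the same as straight ones,
    since the model response mixes both styles.
    """
    normalized = text.replace("\u201c", '"').replace("\u201d", '"')
    spans = re.findall(r'"([^"]*)"', normalized)
    return sum(len(span.split()) for span in spans)


def ratio_and_deviation(words_total: int, words_dialogue: int,
                        target_ratio: float = 0.10) -> tuple[float, float]:
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target_ratio)


def overall_score(scores: list[float]) -> float:
    """Assumed aggregation: unweighted mean of the evaluator scores."""
    return sum(scores) / len(scores)


# Values taken from the evaluation details: wordsTotal 180, wordsDialogue 52.
ratio, deviation = ratio_and_deviation(180, 52)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 28.89%, Deviation: 18.89%"

# Evaluator scores from the summary: 20.19% and 0.00%.
print(f"Overall: {overall_score([20.19, 0.00]):.4f}%")
# prints "Overall: 10.0950%"
```

Note that counting every quoted span also picks up the quoted banner text in the response, not only spoken lines. That is the reading under which the reported 52 dialogue words is reached.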