NC Bench
Run: 2026-02-20-01KHY7J7PAHCWFBXDQB72HK6X5
Run Details
Model: Gemma 3 12B
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 14.3s
Cost: $0.000087
Provider: DeepInfra
Evaluation summary
| Score | Evaluator |
| --- | --- |
| 0.00% | Matches word count |
| 0.00% | Dialogue to Total Word Ratio |
| 0.0001% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The aroma of sugar cookies and existential dread hung thick in Molly’s tiny apartment. Bill, draped across her threadbare sofa like a discarded Santa sack, was meticulously studying a flyer. It depicted a shockingly cheerful group of children, bouncing on trampolines and wielding plastic swords. “Operation Sugar Rush,” he mumbled, circling the address in red marker. “You still think this is a good idea, Bill?” Molly asked, stirring sugar into her lukewarm tea. “Stealing presents from children? Even for us, that’s… ambitious, isn't it?” Bill looked up, his eyes gleaming with a disturbing mixture of excitement and caffeine. “Ambitious is what we do, Molly. Think about it! A room full of easily distracted, sugar-fueled tiny humans, hordes of brightly wrapped boxes… minimal security. It’s practically begging to be robbed.” “Begging to be robbed by two grown adults who probably should have retired from petty crime years ago,” Molly retorted, taking a sip of her tea. “And what’s the plan? We dress as clowns? Giant inflatable kangaroos?” “No, no,” Bill dismissed, waving his hand dismissively. "Too obvious. We go subtle. Think… parents. Disgruntled parents. We’ll blend." “Blend? We look like we haven’t slept in three days and subsist entirely on instant noodles.” Molly gestured to his rumpled attire – stained sweatpants and a faded band t-shirt. “Details, details!” Bill hopped up, energy suddenly surging through him. "I’ll get some… chinos. And maybe a polo shirt. Something vaguely respectable. You can work the ‘concerned aunt’ angle. Feign an excess of familial obligation." "And how exactly do I 'feign an excess of familial obligation' when I’ve got more experience convincing squirrels to share their nuts?" “Improvisation, Molly, improvisation! Think worried eyebrows, lots of nodding, the occasional ‘Oh, little Timmy’s been *so* under the weather lately…’" Bill demonstrated, contorting his face into a grotesque imitation of a concerned relative. Molly winced. 
“You look like you’re having a stroke.” “It’s a work in progress,” Bill conceded, puffing slightly. "The key is the distraction. While I'm subtly assessing the lay of the land, you create a diversion. Something… big.” “Like what? Setting off the sprinklers? Releasing a flock of pigeons?” “Something… theatrical,” Bill replied, stroking his chin. “Perhaps a well-timed dramatic fainting spell? Or maybe a passionate argument about the merits of gluten-free cake?” Molly stared at him. “You want me to fake a fainting spell? Or argue about cake? Bill, are you entirely sure you’re alright?” Bill shrugged, a wide grin spreading across his face. “Hey, it’s a party! It’s all about the entertainment, right? And who knows, maybe we’ll even get a piece of cake afterward.”
created_at: "2026-02-20T19:13:58.484Z" (2/20/2026, 7:13:58 PM)
model: "google/gemma-3-12b-it"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 207
moderation_latency: (null)
generation_time: 14136
tokens_prompt: 93
tokens_completion: 764
native_tokens_prompt: 75
native_tokens_completion: 645
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00008685
router: (null)
provider_responses:
  0:
    endpoint_id: "eb06dc92-5a16-47ec-a776-6ef956457c47"
    id: "chatcmpl-RwSfHpL6cb0IEpkAcT9FaR6o"
    is_byok: false
    latency: 51
    model_permaslug: "google/gemma-3-12b-it"
    provider_name: "DeepInfra"
    status: 200
api_type: "completions"
id: "gen-1771614838-AJA6jWDS1hGa9psDJAdt"
upstream_id: "chatcmpl-RwSfHpL6cb0IEpkAcT9FaR6o"
total_cost: 0.00008685
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
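The timing fields in the metadata can be cross-checked against the run's reported duration of 14.3s. The sketch below assumes that latency and generation_time are both in milliseconds (the page does not state units) and that throughput is best measured against native_tokens_completion; the variable names are illustrative only.

```python
# Values copied from the run metadata; the millisecond unit is an assumption.
latency_ms = 207
generation_time_ms = 14136
native_tokens_completion = 645

# Total wall-clock time: time to first token plus generation time.
total_seconds = (latency_ms + generation_time_ms) / 1000

# Rough generation throughput in native tokens per second.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"total: {total_seconds:.1f}s")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumptions the total rounds to 14.3s, which agrees with the Duration shown in the run details.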
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 0.00% | Matches word count | n/a | neededClean: false, words: 434 |
| 0.00% | Dialogue to Total Word Ratio | Ratio: 66.13%, Deviation: 16.13% | neededClean: false, wordsTotal: 434, wordsDialogue: 287 |
0.0001%
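The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator can be reproduced from wordsDialogue and wordsTotal. This is a minimal sketch, assuming the deviation is the absolute difference from the 50% target named in the scenario; the function names are hypothetical, and the mapping from deviation to the 0.00% score is not shown on the page, so it is not modeled here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of all words."""
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float = 50.0) -> float:
    """Absolute distance from the target percentage (assumed to be 50%)."""
    return abs(ratio_pct - target_pct)


# Values copied from the evaluation meta data.
ratio = dialogue_ratio(words_dialogue=287, words_total=434)
deviation = deviation_from_target(ratio)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 66.13%, Deviation: 16.13%"

# Word-count shortfall against the 500 words requested in the prompt.
shortfall = 500 - 434
print(f"Shortfall: {shortfall} words")
```

The response is also 66 words short of the requested 500, which is consistent with the failing Matches word count result, though the evaluator's tolerance is not stated.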