NC Bench
Run: 2026-02-20-01KHY7KXK9P9HV6P41ZGG5XHRR
Run Details
Model:
Gemma 3 4B
Test:
Dialogue tags
Scenario:
Write 500 words with 30% dialogue
Duration:
8.3s
Cost:
$0.000061
Provider:
DeepInfra
Evaluation summary
Score    Evaluator
0.08%    Matches word count
0.00%    Dialogue to Total Word Ratio
0.0424%  Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The rain hammered against the corrugated iron roof of the abandoned mill, a frantic, insistent percussion that mirrored Molly’s anxiety. She chewed on the end of a pen, scattering graphite dust across a sprawling map of Oakhaven’s Willow Creek Estates. Bill, sprawled on a rusty drum kit, was meticulously polishing a pair of miniature bolt cutters. “Seriously, Bill, you’re applying lipstick to those things?” she asked, her voice tight. “A gentleman must present his tools with dignity, Molly,” he replied, examining the cutters with a critical eye. “Besides, these are state-of-the-art. Micro-serrated. Guaranteed to slice through reinforced ribbon. You wouldn’t want to be caught with dull equipment, would you?” “I’d rather be caught with anything other than being trapped in a bouncy castle with eight-year-olds,” she muttered, pointing to a circled area on the map. “Number 47, Mrs. Higgins’ ‘Rainbow Extravaganza.’ Three dozen presents. Mostly stuffed animals. Apparently, Timmy Higgins is obsessed with a badger.” “A badger,” Bill repeated, a slight shudder rippling through him. “Right. So, logistical questions. Security, predictably, will be minimal. Parents are distracted. Balloons are… tempting. We need a distraction, Molly. Something… memorable.” “I was thinking glitter,” she said, a mischievous glint in her eye. “A strategically placed, shimmering cloud of iridescent pink glitter. Disrupt their vision. Cause a minor panic.” “Glitter? Seriously? We're stealing children's presents, not staging a rave.” “It’s effective, Bill! And festive! Think about it – chaos, sparkle, a bewildered collection of toddlers.” “I’m thinking about the potential for a lawsuit,” he said, returning to his polishing. “What about the windows? They’re alarmed.” “I’ve disabled the system,” she announced, tapping the map. “Small electromagnetic pulse. It’ll take about thirty seconds. 
Long enough for us to be in and out.” “Thirty seconds to navigate a suburban backyard crawling with unsupervised adults, armed with cake and questionable juice boxes? You're optimistic, Molly.” “Optimism is our business, Bill. Speaking of which, what’s the cover story?” He sighed dramatically. “I’m going with ‘quality control inspectors for inflatable animals.’ It’s plausible, slightly ridiculous, and avoids the thorny issue of ‘why are two strangers dressed as slightly disgruntled accountants entering a children’s birthday party.’” “Perfect. You wear the accountant glasses. The ones with the little taped-on numbers. And the beige suit. Absolutely no bright colors.” “What about you?” “I’m going as…” she paused, considering, “a particularly observant unicorn.” Bill stared at her. "A unicorn? Really, Molly?” “Don't underestimate the power of a sparkly horn, Bill. It’s surprisingly intimidating to a five-year-old. Now, let’s go over the timing one more time. The glitter deployment at 7:17. The pulse at 7:18. In and out by 7:28. We need to grab the badger and vanish.” “And if we get caught?” “Then we blame the glitter,” she said with a shrug. “It’s always the glitter.”
created_at
"2026-02-20T19:14:53.685Z"
2/20/2026, 7:14:53 PM
model
"google/gemma-3-4b-it"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
135
moderation_latency
(null)
generation_time
8188
tokens_prompt
93
tokens_completion
862
native_tokens_prompt
75
native_tokens_completion
719
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00006052
router
(null)
provider_responses
0
endpoint_id
"d3de3bd4-81bc-48fb-924f-2a87b2a36e75"
id
"chatcmpl-Rhv6SNoNo58WfeSABk5mPky2"
is_byok
false
latency
53
model_permaslug
"google/gemma-3-4b-it"
provider_name
"DeepInfra"
status
200
api_type
"completions"
id
"gen-1771614893-FhotxAqrObN8ekriS6d1"
upstream_id
"chatcmpl-Rhv6SNoNo58WfeSABk5mPky2"
total_cost
0.00006052
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepInfra"
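The timing and token fields above can be cross-checked with a little arithmetic. The sketch below assumes that `latency` and `generation_time` are in milliseconds and that `total_cost` is in US dollars; the page does not state the units, so treat both as assumptions. Under that reading, latency plus generation time comes to about 8.3 seconds, which agrees with the Duration shown in Run Details. The helper names are illustrative, not part of any NC Bench or provider API.

```python
# Values copied from the Meta block above.
LATENCY_MS = 135                  # assumed to be milliseconds
GENERATION_TIME_MS = 8188         # assumed to be milliseconds
NATIVE_TOKENS_COMPLETION = 719
TOTAL_COST = 0.00006052           # assumed to be US dollars


def total_duration_seconds(latency_ms: int, generation_ms: int) -> float:
    """Latency plus generation time, converted to seconds."""
    return (latency_ms + generation_ms) / 1000.0


def tokens_per_second(tokens: int, generation_ms: int) -> float:
    """Completion tokens divided by generation time in seconds."""
    if generation_ms <= 0:
        raise ValueError("generation_ms must be positive")
    return tokens / (generation_ms / 1000.0)


def cost_per_thousand_tokens(cost: float, tokens: int) -> float:
    """Whole-request cost spread over the completion tokens, per 1,000 tokens."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return cost / tokens * 1000.0


duration_s = total_duration_seconds(LATENCY_MS, GENERATION_TIME_MS)
throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
cost_per_k = cost_per_thousand_tokens(TOTAL_COST, NATIVE_TOKENS_COMPLETION)

print(f"Duration: {duration_s:.1f}s")
print(f"Throughput: {throughput:.1f} tokens/s")
print(f"Cost per 1,000 completion tokens: ${cost_per_k:.6f}")
```

The cost figure here divides the whole request cost by completion tokens only, so it slightly overstates the per-token output price because prompt tokens are also billed.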
Evaluation details
Result   Evaluator                     Details                           Meta Data
0.08%    Matches word count            n/a                               neededClean: false; words: 471
0.00%    Dialogue to Total Word Ratio  Ratio: 74.10%, Deviation: 44.10%  neededClean: false; wordsTotal: 471; wordsDialogue: 349
0.0424%  Overall
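The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator follow directly from the word counts in its metadata. The sketch below reproduces that arithmetic, assuming the deviation is the absolute difference between the measured ratio and the 30% target from the scenario. The function names are illustrative, and how the evaluator turns the deviation into a score is not shown on this page, so that step is left out.

```python
TARGET_DIALOGUE_PCT = 30.0  # target from the scenario: 30% dialogue


def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_pct(ratio_pct: float, target_pct: float) -> float:
    """Absolute distance between the measured ratio and the target (assumed definition)."""
    return abs(ratio_pct - target_pct)


# Counts taken from the evaluator metadata above.
ratio = dialogue_ratio_pct(words_dialogue=349, words_total=471)
deviation = deviation_pct(ratio, TARGET_DIALOGUE_PCT)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 74.10%, Deviation: 44.10%"
```

With 349 of 471 words counted as dialogue, the response carries more than twice the requested share, which is consistent with the 0.00% score for this evaluator.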