NC Bench
Run: 2026-02-20-01KHY8HK2T9AZJT57ZFGRCRF59
Run Details
Model: Gemma 3 4B
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 7.2s
Cost: $0.000054
Provider: DeepInfra
Evaluation summary
Score    Evaluator
0.00%    Matches word count
0.00%    Dialogue to Total Word Ratio
0.0000%  Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The rain hammered against the corrugated iron roof of the abandoned laundromat, a relentless, grey percussion to our planning. Molly, perched precariously on an overturned washing machine, was meticulously sketching a floor plan on a greasy napkin. Across from her, Bill was attempting to balance a rubber duck on his forehead. “Seriously, Bill?” she finally said, not looking up. “What’s with the avian headwear?” “Distraction,” he announced, the duck wobbling dangerously. “If the kids see a man with a duck balancing on his head, they’ll assume it’s a magician. They’ll forget about presents.” “They’ll assume you’re deeply, profoundly disturbed,” Molly countered, adding a tiny, furious scribble to her drawing. “And that’s not exactly helping our cover.” “It’s *adding* to the cover,” Bill insisted, carefully readjusting the duck. "Think of it as theatrical flair. Like a…a very confused clown.” “We're not clowns, Bill. We're professional thieves.” "Technically," he conceded, “We are professionals who occasionally wear slightly ridiculous hats.” He paused. "So, about this party. You’re sure Mr. Henderson’s security is as lax as you’re claiming?" "Positive. He's obsessed with birthday parties. He’s practically built a fortress out of glitter and balloons. But that’s also his weakness. He’ll think we’re just…enthusiastic guests. He’ll be too busy elbowing kids for a better view of the bouncy castle to notice a small, strategically-deployed team relieving him of his children’s generosity.” “Generosity,” Bill repeated dryly. “You make it sound like a charity drive.” “It’s a heist, Bill. Let’s not get sentimental. Okay, three of us: you, me, and Finn. Finn’s on crowd control. He’s remarkably adept at distracting toddlers with elaborate stories about sentient gummy bears.” “Sentient gummy bears? Really, Molly?” “Don't question the logistics. Just execute the story. And keep the children occupied. Meanwhile, you and I will be dealing with the presents. 
I’ve mapped out the layout. The main gift table is in the corner, behind the cake. Guarded, naturally, by a particularly enthusiastic six-year-old named Kevin. He has a disconcerting fondness for sparkly things.” “Disconcerting is an understatement. I heard he once tried to trade a Lego for a lollipop.” “Then we need a distraction for *him*,” Molly said grimly. “Something involving a very large, inflatable dinosaur. Do you have a dinosaur?” Bill frowned, pushing the duck off his forehead with a disappointed sigh. "I have a rubber duck. It’s…marginally less terrifying.” “Just think, Bill! It’s a start.”
created_at: "2026-02-20T19:31:05.96Z" (2/20/2026, 7:31:05 PM)
model: "google/gemma-3-4b-it"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 167
moderation_latency: (null)
generation_time: 6953
tokens_prompt: 93
tokens_completion: 727
native_tokens_prompt: 75
native_tokens_completion: 636
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00005388
router: (null)
provider_responses:
  0:
    endpoint_id: "d3de3bd4-81bc-48fb-924f-2a87b2a36e75"
    id: "chatcmpl-RP4KvU5qRVAd7Ger4qvG48DT"
    is_byok: false
    latency: 61
    model_permaslug: "google/gemma-3-4b-it"
    provider_name: "DeepInfra"
    status: 200
api_type: "completions"
id: "gen-1771615865-ANO8PsAEyG4ZCoi503ep"
upstream_id: "chatcmpl-RP4KvU5qRVAd7Ger4qvG48DT"
total_cost: 0.00005388
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
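The metadata above is enough to derive two rough figures for this run: completion throughput and a blended cost per million tokens. The sketch below works through that arithmetic. It assumes that `generation_time` is in milliseconds and that `total_cost` covers the native prompt and completion tokens; the page states neither, and the function names are hypothetical.

```python
def completion_throughput(native_tokens_completion: int, generation_time_ms: float) -> float:
    """Completion tokens per second, assuming generation_time is in milliseconds."""
    return native_tokens_completion / (generation_time_ms / 1000.0)


def blended_cost_per_million(total_cost_usd: float, prompt_tokens: int, completion_tokens: int) -> float:
    """Cost in USD per one million tokens, averaged over prompt and completion tokens."""
    return total_cost_usd / (prompt_tokens + completion_tokens) * 1_000_000


# Values copied from the metadata of this run.
tokens_per_second = completion_throughput(636, 6953)
usd_per_million = blended_cost_per_million(0.00005388, 75, 636)

print(f"throughput: {tokens_per_second:.1f} tokens/s")
print(f"blended cost: ${usd_per_million:.4f} per 1M tokens")
```

Under those assumptions the run generated roughly 91 tokens per second. The blended cost is only an average, since providers usually price prompt and completion tokens differently.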
Evaluation details
Result: 0.00%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 399

Result: 0.00%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 74.69%, Deviation: 44.69%
Meta Data: neededClean: false, wordsTotal: 399, wordsDialogue: 298

Overall: 0.0000%
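The ratio and deviation reported above follow directly from the word counts: 298 dialogue words out of 399 total is 74.69%, which is 44.69 percentage points above the 30% target. The sketch below reproduces that arithmetic and shows one possible way to count dialogue words, by treating any text between double quotes as dialogue. The evaluator's actual counting rules are not shown on this page, so `ratio_from_counts`, `dialogue_stats`, and the quote-splitting approach are assumptions for illustration only.

```python
import re


def ratio_from_counts(words_dialogue: int, words_total: int, target_ratio: float = 0.30):
    """Return (ratio %, deviation from target in percentage points), rounded to 2 places."""
    ratio = words_dialogue / words_total
    return round(ratio * 100, 2), round(abs(ratio - target_ratio) * 100, 2)


def dialogue_stats(text: str) -> dict:
    """Count whitespace-separated words inside and outside double quotes.

    Curly quotes are normalized to straight quotes first, because the model
    response mixes both styles. After splitting on the quote character,
    odd-indexed segments are the quoted (dialogue) spans.
    """
    normalized = re.sub("[\u201c\u201d]", '"', text)
    segments = normalized.split('"')
    words_dialogue = sum(len(seg.split()) for seg in segments[1::2])
    words_total = sum(len(seg.split()) for seg in segments)
    return {"words_dialogue": words_dialogue, "words_total": words_total}


# Reproduce the reported figures from the counts in the Meta Data column.
print(ratio_from_counts(298, 399))  # (74.69, 44.69)

# Count dialogue in a short sample with mixed quote styles.
sample = 'Molly sighed. "We are thieves," she said. \u201cNot clowns.\u201d'
print(dialogue_stats(sample))
```

An unbalanced quote would flip dialogue and narration for the rest of the text, so a stricter counter would need to validate quote pairing before splitting.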