NC Bench
Run: 2026-02-20-01KHY9KYH99HBFXR296THD9JYK
Run Details
Model: Gemma 3 4B
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 8.4s
Cost: $0.000056
Provider: DeepInfra
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Right, so, Operation Sparklebomb. Let’s run it again,” Molly said, meticulously arranging miniature plastic dinosaurs on the coffee table. “We go in through the bouncy castle. It’s structurally unsound, apparently. Like, actively weeping plastic.” Bill, slumped in an armchair nursing a lukewarm coffee, grunted. “Weeping plastic? Seriously? You’re giving the bouncy castle a personality.” “It’s descriptive! Look, it’s the least conspicuous entry point. Mrs. Higgins will be distracted by a particularly shrill rendition of ‘Happy Birthday’ and the seven-year-olds will be too busy attempting to eat glitter glue to notice two slightly disheveled adults crawling through a giant inflatable reptile.” “Glitter glue? That’s… admirable,” Bill admitted, taking a slow sip of coffee. “So, we disable the cake? I’m thinking a well-placed puff of dry ice.” “Absolutely not. Dry ice causes panic. Think of the sugar rush! We’re after the *presents*, Bill, not the sensory overload. The cake is a distraction. A very sugary, buttercream distraction.” “Okay, okay. Presents. Right. And the guards?” “That’s Percy. He's obsessed with bubble wrap. We leverage that. I’ve rigged a miniature sonar device to emit a high-pitched whine directly at his bubble wrap collection. He’ll be utterly incapacitated.” “You’re… engineering a bubble wrap induced coma?” Bill raised an eyebrow. “That’s… impressively niche.” “It’s efficient! While he’s busy battling the foam, we systematically dismantle the gift table. We’re not going for the *expensive* presents, you know. Mrs. Higgins specifically requested ‘thoughtful’ gifts. Which, in her world, means things that vaguely resemble bath toys.” “So, rubber duckies and plastic shovels?” “Precisely. We need enough to fill a small van. Think about it, Bill, a van full of vaguely unsettling bath toys. Priceless.” “Worth the risk of a bubble wrap coma?” “Worth the risk of anything, darling. 
Imagine the headlines: ‘Local Thieves Nab Entire Stock of ‘Thoughtful’ Presents.’ It’s delightfully absurd.” Molly clicked her tongue. “Did you bring the grappling hook?” “Of course. It’s camouflage colored. Looks like a particularly sad-looking garden gnome.” “Perfect. And the disguises? I went with a nun. Very effective. Lots of pitying glances.” “A nun? You? Molly, you’re practically radiating neon energy.” “Don’t judge. It throws people off. Besides, who suspects a nun of grand larceny?” “Someone who’s not looking for a vaguely shaped rubber duck, apparently. Okay, last question. What if a child asks us where we're going?" Molly paused, considering. “We tell them we’re collecting donations for… underprivileged stuffed animals.” Bill stared at her for a long moment. “Seriously?” “Seriously. Now, let’s talk about the deployment of the glitter cannons…”
created_at: "2026-02-20T19:49:51.797Z" (2/20/2026, 7:49:51 PM)
model: "google/gemma-3-4b-it"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 220
moderation_latency: (null)
generation_time: 7766
tokens_prompt: 93
tokens_completion: 796
native_tokens_prompt: 75
native_tokens_completion: 664
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00005612
router: (null)
provider_responses:
  0:
    endpoint_id: "d3de3bd4-81bc-48fb-924f-2a87b2a36e75"
    id: "chatcmpl-R2HcUUTcQPuGTajXVymMG15J"
    is_byok: false
    latency: 103
    model_permaslug: "google/gemma-3-4b-it"
    provider_name: "DeepInfra"
    status: 200
api_type: "completions"
id: "gen-1771616991-i0rJPvebMk1Y1E4vJlI5"
upstream_id: "chatcmpl-R2HcUUTcQPuGTajXVymMG15J"
total_cost: 0.00005612
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 424
0.00% | Dialogue to Total Word Ratio | Ratio: 88.68%, Deviation: 38.68% | neededClean: false, wordsTotal: 424, wordsDialogue: 376
0.0000% | Overall
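The reported figures can be checked with simple arithmetic. The sketch below is not NC Bench's evaluator; the function names are hypothetical, and the 500-word and 50% targets are taken from the scenario description. It reproduces the reported ratio and deviation from the wordsDialogue and wordsTotal values and also shows how far the response fell short of the requested length. How the evaluators turn these deviations into a 0.00% score is not stated on this page, so no scoring rule is modelled here.

```python
def dialogue_ratio_report(words_dialogue: int, words_total: int,
                          target_ratio: float = 0.50) -> tuple[float, float]:
    """Return (ratio %, absolute deviation from target %), rounded to 2 places.

    Hypothetical helper: target_ratio defaults to the 50% named in the scenario.
    """
    ratio = words_dialogue / words_total
    deviation = abs(ratio - target_ratio)
    return round(ratio * 100, 2), round(deviation * 100, 2)


def word_count_shortfall(words: int, target_words: int = 500) -> tuple[int, float]:
    """Return (missing words, shortfall as % of target), rounded to 2 places.

    Hypothetical helper: target_words defaults to the 500 named in the scenario.
    """
    missing = target_words - words
    return missing, round(missing / target_words * 100, 2)


if __name__ == "__main__":
    # Values copied from the evaluation details of this run.
    ratio_pct, deviation_pct = dialogue_ratio_report(words_dialogue=376, words_total=424)
    print(f"Ratio: {ratio_pct:.2f}%, Deviation: {deviation_pct:.2f}%")

    missing, shortfall_pct = word_count_shortfall(words=424)
    print(f"Word count: 424 of 500 ({missing} short, {shortfall_pct:.2f}%)")
```

With the values from this run, the first line printed matches the Details column above (Ratio: 88.68%, Deviation: 38.68%), and the response is 76 words (15.20%) under the requested 500.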