NC Bench
Run: 2026-02-20-01KHYAGMYAZ9MMVCAEVM3BFD2G
Run Details
Model: Gemma 3 4B
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 7.9s
Cost: $0.000058
Provider: DeepInfra
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The air in Bill’s cramped apartment smelled aggressively of stale pizza and regret. Rain hammered against the grimy window overlooking a perpetually damp alley. Molly, a woman built like a disgruntled linebacker, was meticulously constructing a miniature grappling hook out of bobby pins, rubber bands, and a particularly stubborn paperclip. Bill, a man who looked permanently surprised by the state of the world, was pacing. “Are you *sure* about this, Molly? A children’s birthday party? Seriously?” “Absolutely,” she grunted, attaching the paperclip with excessive force. “Think of the possibilities, Bill. Unwrapped joy, the scent of frosting, the sheer, unadulterated desire for shiny things. It's a goldmine.” “It’s a room full of sugar-fueled five-year-olds and probably a bouncy castle. It’s chaos.” “Chaos is my specialty. Now, help me with this knot.” She pointed to the grappling hook, which, at this point, resembled a tangled bird’s nest. Bill sighed and knelt, gingerly examining the contraption. “It looks… precarious. Do you even know how high the windows are?” “Measured with the precision of a blindfolded hawk,” Molly declared, ignoring his question and furiously twisting the rubber bands. “Mr. Henderson’s son, Timmy, is turning six. He’s obsessed with Lego Millennium Falcons. Think of the resale value!” “The resale value of a stolen Lego Millennium Falcon? That’s your motivation?” “Look, Bill, we need a score. Something… substantial. This isn't petty theft. This is strategic acquisition.” “Strategic acquisition of party favors.” He stated the obvious. “What about security? I saw a sign outside – ‘Volunteers Required.’” “Volunteers? A distraction, obviously. Everyone's busy blowing up balloons and handing out cake. We slip in while they're wrestling with a deflated dinosaur.” “You’re basing our entire operation on the potential for inflatable reptile malfunctions?” “Don’t underestimate the impact of a rogue T-Rex, Bill. It’s a powerful motivator. 
Now, are you going to help me or are you going to keep questioning the strategic brilliance of my plan?” “Fine, fine. But if I get sticky with frosting, I’m blaming you.” He carefully began to untangle the paperclip, making a series of frustrated grunts. “Just keep your hands out of the decorative napkins. And for the love of Scarface, stop talking.” “Scarface? Seriously? We’re stealing presents for six-year-olds, not robbing a bank.” “It’s a code, Bill. A code. Focus. We need to synchronize our entry. I’m going in first to assess the loot distribution. You’ll follow, creating a diversion. I’m thinking a dramatic interpretive dance to the theme song of 'Paw Patrol.’” Bill stared. “You’re… you’re going to do a ‘Paw Patrol’ interpretive dance?” “Don’t judge. It’s effective. Just… execute the plan.” “I’m starting to think ‘petty theft’ might have been a better option.”
created_at
"2026-02-20T20:05:32.242Z"
2/20/2026, 8:05:32 PM
model
"google/gemma-3-4b-it"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
91
moderation_latency
(null)
generation_time
6938
tokens_prompt
93
tokens_completion
810
native_tokens_prompt
75
native_tokens_completion
693
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00005844
router
(null)
provider_responses
0
endpoint_id
"d3de3bd4-81bc-48fb-924f-2a87b2a36e75"
id
"chatcmpl-ROkIbT04YTVjccIVHb7LgDbp"
is_byok
false
latency
54
model_permaslug
"google/gemma-3-4b-it"
provider_name
"DeepInfra"
status
200
api_type
"completions"
id
"gen-1771617932-IGw3KqZTrwhWcp4ESKGQ"
upstream_id
"chatcmpl-ROkIbT04YTVjccIVHb7LgDbp"
total_cost
0.00005844
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepInfra"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 447
0.00% | Dialogue to Total Word Ratio | Ratio: 71.81%, Deviation: 41.81% | neededClean: false, wordsTotal: 447, wordsDialogue: 321
0.0000% | Overall
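The ratio and deviation reported for the "Dialogue to Total Word Ratio" evaluator can be reproduced from the logged counts (wordsDialogue 321, wordsTotal 447) and the 30% target in the scenario. The sketch below is illustrative only: the function names are hypothetical, and the quote-matching rule and whitespace-based word splitting are assumptions, not the benchmark's documented method. How the deviation is converted into the 0.00% score is not stated on this page, so it is not modeled here.

```python
import re


def count_words(text: str) -> int:
    # Assumption: a "word" is any whitespace-delimited token.
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    # Assumption: dialogue is any span enclosed in double quotes,
    # either curly (“ ”) or straight (").
    spans = re.findall(r'[“"]([^”"]*)[”"]', text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int, target_pct: float):
    # Returns (dialogue share in percent, absolute distance from the target).
    if words_total == 0:
        raise ValueError("words_total must be positive")
    ratio = 100.0 * words_dialogue / words_total
    return ratio, abs(ratio - target_pct)


# Counts taken from the evaluation details; 30% is the target from the scenario.
ratio, deviation = ratio_and_deviation(321, 447, 30.0)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
```

Run with the logged counts, this prints the same figures shown in the Details column. The counting helpers can be applied to any text, for example `count_dialogue_words('“Hello there,” she said.')`, but they may not match the evaluator's own tokenization exactly.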