NC Bench
Run: 2026-05-10-01KR8XWXBTKJW169SRDA47A1QT
Run Details
Model: Gemma 4 26B
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 10.5s
Cost: $0.000236
Provider: DeepInfra
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint across the grease-stained table. "It’s a fortress, Bill. A sugar-coated, bouncy-castle-fortified fortress." Bill squinted at the colorful diagram. "Is that a map of the sandbox? Why are we studying the sandbox?" "Because that’s where the perimeter sensors are located," Molly whispered, tapping a drawing of a plastic bucket. "They call them 'Sand-tastrophe Traps.' If you step in the wrong dune, the whistle goes off." "It’s a six-year-old’s birthday party, Molly. Not the Louvre." "That is exactly what the target wants you to think. They’ve got high-yield juice boxes and a clown with questionable facial hair guarding the main entrance." "The clown is a civilian," Bill argued, leaning back. "We go in through the kitchen. We grab the loot, bypass the cake, and exit through the laundry room." "The laundry room is a dead end. It’s blocked by a mountain of soiled toddler leggings and a very suspicious diaper pail." "We’ll climb the counter." "And risk a collision with the mother? She’s a heavy sleeper, but she’s got the reflexes of a startled cat." "What’s the payload again?" Bill asked, rubbing his temples. "Everything. The Lego sets, the remote-controlled cars, the glittery unicorn stables. We’re talking five hundred dollars worth of plastic junk, Bill. We’ll be kings of the flea market." "Is it worth the risk of being interrogated by a kid named Timmy?" "Timmy is our greatest threat. He’s a witness. He has eyes like a hawk and a memory like an elephant." "I’m not getting tackled by a kid in dinosaur pajamas." "Then keep your head down. You’re the heavy. You carry the gift bags. I’ll handle the diversion." "A diversion? What, are you going to fake a choking fit on a gummy bear?" "Better. I’ve acquired a professional-grade bubble machine. I’ll trigger the fog screen. In the confusion, while they’re all distracted by the iridescent spheres, we strike." "Bubbles? You want to pull a high-stakes heist using soap suds?" "It’s tactical, Bill. 
It provides visual obfuscation." "It provides a slip hazard. I’ll be sliding around like a penguin on ice." "Precisely. It’s the perfect cover." "And the exit strategy?" "The bouncy castle. We launch ourselves over the fence. It’s a controlled jump." "That’s not a jump, Molly. That’s a trajectory toward certain injury." "It’s high-risk, high-reward." "The reward is a bag of half-broken action figures and a used bouncy ball." "It’s the principle of the thing, Bill!" "The principle is madness. Fine. But if the clown catches us, I’m telling him it was your idea." "Deal. Now, grab the tactical juice boxes. We move at nap time."
created_at
"2026-05-10T12:30:52.29Z"
model
"google/gemma-4-26b-a4b-it-20260403"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
200
moderation_latency
(null)
generation_time
10364
tokens_prompt
77
tokens_completion
726
native_tokens_prompt
79
native_tokens_completion
679
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
service_tier
(null)
usage
0.00023639
router
(null)
provider_responses
0
endpoint_id
"d71c6981-347d-4622-a86b-2c9e5d0f2c6c"
id
"chatcmpl-RJnFWXGs4ztJviZvclMnTt4G"
is_byok
false
latency
46
model_permaslug
"google/gemma-4-26b-a4b-it-20260403"
provider_name
"DeepInfra"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778416252-NvXYyYbmIKp368vLVxBh"
session_id
(null)
api_type
"completions"
id
"gen-1778416252-NSS8g6U5XXEHOr8fR64P"
upstream_id
"chatcmpl-RJnFWXGs4ztJviZvclMnTt4G"
total_cost
0.00023639
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepInfra"
response_cache_source_id
(null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 433
0.00% | Dialogue to Total Word Ratio | Ratio: 92.62%, Deviation: 22.62% | neededClean: false, wordsTotal: 447, wordsDialogue: 414
0.0000% | Overall
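The reported ratio and deviation can be reproduced from the word counts in the meta data. The snippet below is a minimal sketch, not the evaluator's actual code: the function name is hypothetical, the 70% target is taken from the scenario text, and it assumes the deviation is the absolute difference in percentage points. How the deviation is converted into the 0.00% score is not shown on this page and is not modeled here.

```python
# Counts copied from the evaluation details above (not recomputed from the response text).
TARGET_RATIO_PCT = 70.0  # from the scenario: "Write 500 words with 70% dialogue"
words_total = 447
words_dialogue = 414


def dialogue_ratio_pct(dialogue: int, total: int) -> float:
    """Return dialogue words as a percentage of total words (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * dialogue / total


ratio = dialogue_ratio_pct(words_dialogue, words_total)
# Assumption: deviation is the absolute gap to the target, in percentage points.
deviation = abs(ratio - TARGET_RATIO_PCT)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 92.62%, Deviation: 22.62%"
```

The two evaluators report different totals (433 words versus 447), so they presumably tokenize the response differently; this sketch uses only the counts from the ratio evaluator.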