NC Bench
Run: 2026-05-10-01KR9026PHDFPQZV3BV58F62S7
Run Details
Model: Gemma 4 26B
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 21.3s
Cost: $0.000217
Provider: DekaLLM
Evaluation summary
Score      Evaluator
0.00%      Matches word count
0.00%      Dialogue to Total Word Ratio
0.0000%    Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprints across the sticky picnic table, her eyes gleaming with a predatory intensity usually reserved for buffet lines. "Alright, Bill, look at the perimeter. It’s a fortress of inflatable bounce houses and glitter," Molly whispered, tapping a diagram of a suburban backyard. "Is that a tactical map or a coloring book?" Bill asked, squinting through his bifocals. "Why are there sprinkles on the north quadrant?" "Those are sensory details, Bill! Focus. The target is the gift pile located in the center of the sandbox. It’s a high-density zone of Lego sets, remote-controlled drones, and potentially, high-end handheld consoles." "And our entry point?" "We go through the clown. He’s the weakest link. His distraction technique involves balloon animals, which leaves his flanks completely exposed." "The clown? Molly, that’s psychological warfare. I’m not trained for clown combat. What if he has a squeaky nose? It could trigger my fight-or-flight response." "You’ll wear the oversized polka-dot bowtie. It’s your disguise. You’re the clumsy uncle who tripped into the punch bowl." "I can do clumsy. I was born clumsy. But what about the sugar rush? The biological volatility of twenty six-year-olds is off the charts. It’s a chaotic variable we haven't accounted for." "That’s where I come in. I’ll deploy the distraction: a sudden, unexplained shortage of juice boxes. While they’re mourning the loss of their fruit punch, you sweep the presents into the oversized diaper bag." "A diaper bag? That’s a bit insulting to our professional reputations, don't you think?" "It’s inconspicuous, Bill! No one suspects a man carrying a floral-print diaper bag. It’s the ultimate Trojan horse." "Fine, fine. But what if the birthday boy sees us? He’s got those sharp, piercing eyes. Kids see everything. They’re like tiny, unblinking detectives." "If he approaches, you use the 'distressed toddler' maneuver. 
Sit on the floor, look confused, and pretend you’re searching for a lost marble. It’ll buy us ninety seconds." "Ninety seconds? To move thirty-five wrapped boxes? We’ll need a team of specialists." "You are the specialist! You’re the best gift-grabber in the tri-state area!" "I prefer the term 'unsolicited toy redistributor,' but okay." "Now, practice your 'clumsy uncle' walk. I want to see hesitation. I want to see a man who has clearly had one too many cupcakes." Bill stood up, wobbled slightly, and performed a theatrical stumble that nearly sent him into a rosebush. "How was that? Did I look suspicious or just mildly incompetent?" "A little too much competence in the hip movement. Relax the pelvis, Bill. You’re a man defeated by frosting, not an Olympic gymnast." "This is the most ridiculous heist I’ve ever been a part of." "It’s the most lucrative, Bill. Now, grab the glitter-proof gloves. We move at two o’clock, right after the piñata explodes."
created_at: "2026-05-10T13:08:42.841Z" (5/10/2026, 1:08:42 PM)
model: "google/gemma-4-26b-a4b-it-20260403"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 4917
moderation_latency: (null)
generation_time: 21260
tokens_prompt: 77
tokens_completion: 779
native_tokens_prompt: 68
native_tokens_completion: 645
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.00021693
router: (null)
provider_responses: 0
endpoint_id: "d99bcbe6-bec3-4991-965e-45fea79fe63c"
id: "chatcmpl-142510984b38c391ee63d9b0603a78bf"
is_byok: false
latency: 4917
model_permaslug: "google/gemma-4-26b-a4b-it-20260403"
provider_name: "DekaLLM"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778418522-EFwVw9Wsx6GHfSlnjtMn"
session_id: (null)
api_type: "completions"
id: "gen-1778418522-pIGo2VUUhVOFuChC6Jpa"
upstream_id: "chatcmpl-142510984b38c391ee63d9b0603a78bf"
total_cost: 0.00021693
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DekaLLM"
response_cache_source_id: (null)
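The metadata above can be turned into a rough throughput and cost-per-token figure. The sketch below is illustrative only. It assumes that generation_time is in milliseconds (which is consistent with the reported Duration of 21.3s), that the native token counts are the ones that matter for throughput, and that a single blended rate across prompt and completion tokens is a meaningful summary. The variable names tokens_per_second and cost_per_1k_tokens are not fields from the page.

```python
# Values copied from the run metadata.
generation_time_ms = 21260        # assumed to be milliseconds
native_tokens_prompt = 68
native_tokens_completion = 645
total_cost = 0.00021693           # as reported in total_cost / usage

# Convert to seconds; this should agree with the reported Duration.
seconds = generation_time_ms / 1000

# Completion throughput (assumption: native completion tokens over generation time).
tokens_per_second = native_tokens_completion / seconds

# Blended cost per 1,000 native tokens (assumption: prompt and completion
# tokens are lumped together; real pricing may differ between the two).
cost_per_1k_tokens = total_cost / (native_tokens_prompt + native_tokens_completion) * 1000

print(f"duration: {seconds:.1f}s")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
print(f"blended cost: ${cost_per_1k_tokens:.6f} per 1k tokens")
```

Under these assumptions the run works out to roughly 30 completion tokens per second.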
Evaluation details
Result     Evaluator                      Details                             Meta Data
0.00%      Matches word count             n/a                                 neededClean: false, words: 462
0.00%      Dialogue to Total Word Ratio   Ratio: 88.82%, Deviation: 18.82%    neededClean: false, wordsTotal: 474, wordsDialogue: 421
0.0000%    Overall
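The ratio and deviation reported by the dialogue evaluator can be reproduced from its own word counts. The sketch below is a minimal illustration, not the evaluator's actual code. The 70% target comes from the scenario. The function count_dialogue_words is a hypothetical counter that treats text between straight double quotes as dialogue and splits words on whitespace, so it will not necessarily match the evaluator's tokenization (the two evaluators above already report different totals, 462 and 474). How the deviation is mapped to a final score is not shown on the page and is not modelled here.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: 70% dialogue


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of all words that are dialogue."""
    return words_dialogue / words_total


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Hypothetical counter: (words inside straight double quotes, all words)."""
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return dialogue, total


# Counts copied from the evaluator's meta data.
ratio = dialogue_ratio(421, 474)
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 88.82%, Deviation: 18.82%"
```

For a quick check of the hypothetical counter, `count_dialogue_words('"Hi there," she said. "Go."')` returns `(3, 5)`: three quoted words out of five in total.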