NC Bench
Run: 2026-05-10-01KR93GTE9FRDRNB74PEPWH348
Run Details
Model: Gemma 4 26B (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 3m 50s
Cost: $0.003134
Provider: DeepInfra
Evaluation summary
Matches word count: 60.28%
Dialogue to Total Word Ratio: 70.73%
Overall: 65.5040%
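The overall figure appears consistent with an unweighted mean of the two evaluator scores. The sketch below checks that arithmetic; the aggregation rule and the `overall_score` helper are assumptions for illustration, not something the page states.

```python
from statistics import mean

# Evaluator scores as displayed on the page (already rounded to two decimals).
scores = {
    "Matches word count": 60.28,
    "Dialogue to Total Word Ratio": 70.73,
}


def overall_score(evaluator_scores: dict[str, float]) -> float:
    """Unweighted mean of the evaluator scores (assumed aggregation rule)."""
    return mean(list(evaluator_scores.values()))


overall = overall_score(scores)
reported_overall = 65.5040  # value shown on the page

# The inputs are rounded, so only approximate agreement is expected.
difference = abs(overall - reported_overall)
print(f"computed: {overall:.4f}%, reported: {reported_overall:.4f}%")
```

The small gap between the computed and reported values is what rounding of the two displayed scores would produce, but a weighted scheme that happens to land nearby cannot be ruled out from this run alone.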
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The basement smelled of stale coffee and unearned confidence. The dim light of a single flickering bulb cast long, eerie shadows across the room, making the colorful party invitation look like a map to a forbidden kingdom. Molly spread the glitter-encrusted invitation across the scarred wooden table like it was a top-secret intelligence dossier. She tapped a crayon-drawn map of the backyard with a trembling finger, her expression grim. "Alright, listen up," Molly whispered, her eyes narrowing into slits. "The target is Timmy’s seventh birthday extravaganza. We move in at precisely fourteen-hundred hours, immediately following the arrival of the clown." Bill leaned forward, squinting through his thick, fogged-up glasses. He poked a sticker of a smiling dinosaur on the edge of the paper. "Is the dinosaur a signal?" Bill asked, his voice trembling. "Is it a warning from the host? Are they watching us right now? I am not prepared for the emotional toll of a bouncy castle." Molly slammed her hand on the table, making the lukewarm juice boxes rattle. "It’s a sticker, Bill! Focus! We have a narrow window of opportunity. The moment the piñata is breached, the sensory overload will provide the perfect diversion. That is when we strike." Bill wiped sweat from his forehead with a crumpled napkin. He stared intensely at the tactical diagram of the living room, which was mostly just a drawing of a giant chocolate cake. "What about the security, Molly? I heard rumors about a high-intensity inflatable obstacle course in the backyard." Molly paced the small, cramped room, her shadow dancing wildly against the damp concrete walls. "The bouncy castle is a variable we haven't fully accounted for. It’s unpredictable. It’s bouncy. But we’ll use the distraction of the Baby Shark song to mask our movements." Bill shuddered, his face turning a pale shade of beige. "That song... it's psychological warfare. They’ve prepared for us, haven't they? This is a setup." 
Molly stopped pacing and gripped the edge of the table, leaning into his personal space until their noses almost touched. "They aren't prepared for professionals. We’re going for the high-value assets. The LEGO Star Wars sets, the remote-controlled drones, the limited-edition slime kits. We bag them, we move through the kitchen, and we exit via the sliding glass door." Bill nodded solemnly, though his eyes drifted toward a stray Cheeto on the floor. "And the cake? Do we take the cake?" Molly glared at him with a look that could melt lead. "We are thieves, Bill, not gluttons. We take the presents. The cake is too much heat. We leave the sprinkles for the locals." Bill sighed, defeated by her terrifying pragmatism. He grabbed a blue crayon and marked a heavy X over the drawing of a clown. He gripped the crayon like a weapon. "Fine. But if a toddler grabs my ankle, I’m calling an emergency extraction." Molly nodded, satisfied. The heist was on.
created_at: "2026-05-10T14:09:07.538Z" (5/10/2026, 2:09:07 PM)
model: "google/gemma-4-26b-a4b-it-20260403"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 142
moderation_latency: (null)
generation_time: 229438
tokens_prompt: 77
tokens_completion: 9640
native_tokens_prompt: 82
native_tokens_completion: 9202
native_tokens_completion_images: (null)
native_tokens_reasoning: 8866
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.00313442
router: (null)
provider_responses:
  0:
    endpoint_id: "d71c6981-347d-4622-a86b-2c9e5d0f2c6c"
    id: "chatcmpl-Rdrtsv6ropzN0S9cQipyB0kb"
    is_byok: false
    latency: 49
    model_permaslug: "google/gemma-4-26b-a4b-it-20260403"
    provider_name: "DeepInfra"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778422147-zfUnvL5Fx5Oj6dkHvpvF"
session_id: (null)
api_type: "completions"
id: "gen-1778422147-ZPRyQLnjztOhfIhHfGbA"
upstream_id: "chatcmpl-Rdrtsv6ropzN0S9cQipyB0kb"
total_cost: 0.00313442
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
response_cache_source_id: (null)
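The token and timing fields above can be combined into two derived figures: the share of completion tokens spent on reasoning, and generation throughput. This is a rough sketch under two assumptions that the page does not confirm: that `generation_time` is in milliseconds (229438 ms is about 3m 49s, close to the reported 3m 50s duration), and that `native_tokens_reasoning` is counted inside `native_tokens_completion`. The helper names are hypothetical.

```python
# Values copied from the metadata fields of this run.
generation_time_ms = 229438        # assumed unit: milliseconds
native_tokens_completion = 9202
native_tokens_reasoning = 8866     # assumed to be included in the completion count


def reasoning_share_pct(reasoning_tokens: int, completion_tokens: int) -> float:
    """Percentage of completion tokens that were reasoning tokens."""
    if completion_tokens <= 0:
        raise ValueError("completion_tokens must be positive")
    return 100.0 * reasoning_tokens / completion_tokens


def tokens_per_second(completion_tokens: int, time_ms: int) -> float:
    """Average generation throughput over the whole request."""
    if time_ms <= 0:
        raise ValueError("time_ms must be positive")
    return completion_tokens / (time_ms / 1000.0)


share = reasoning_share_pct(native_tokens_reasoning, native_tokens_completion)
throughput = tokens_per_second(native_tokens_completion, generation_time_ms)
visible_tokens = native_tokens_completion - native_tokens_reasoning

print(f"reasoning share: {share:.2f}%")
print(f"throughput: {throughput:.2f} tokens/s")
print(f"non-reasoning completion tokens: {visible_tokens}")
```

Under those assumptions, most of the completion budget went to reasoning and only a few hundred tokens to the scene itself.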
Evaluation details
Matches word count: 60.28%
  Details: n/a
  Meta Data: neededClean: false, words: 485
Dialogue to Total Word Ratio: 70.73%
  Details: Ratio: 47.37%, Deviation: 2.63%
  Meta Data: neededClean: false, wordsTotal: 494, wordsDialogue: 234
Overall: 65.5040%
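The reported ratio and deviation follow directly from the `wordsDialogue` and `wordsTotal` counts, given the 50% target in the scenario. A minimal sketch of that arithmetic is below; the function names are hypothetical, and the mapping from deviation to the 70.73% score is not shown on the page, so it is not reproduced here.

```python
def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_pct(ratio_pct: float, target_pct: float = 50.0) -> float:
    """Absolute distance, in percentage points, from the target ratio."""
    return abs(target_pct - ratio_pct)


# Counts taken from the evaluator's Meta Data.
ratio = dialogue_ratio_pct(words_dialogue=234, words_total=494)
deviation = deviation_pct(ratio)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 47.37%, Deviation: 2.63%"
```

Note that the two evaluators report different totals (485 words versus 494), which suggests they tokenize the response differently; the sketch takes the counts as given rather than recounting the text.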