NC Bench
Run: 2026-05-10-01KR8WWZ704QZJ8329M9WP780N
Run Details
Model: Gemma 4 26B (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 2m 13s
Cost: $0.003634
Provider: Cloudflare
Evaluation summary
Score | Evaluator
20.19% | Matches word count
0.00% | Dialogue to Total Word Ratio
10.0948% | Overall
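The Overall figure looks consistent with an unweighted mean of the two evaluator scores. The report does not state how the aggregate is computed, so the sketch below is an assumption, and `overall_score` is a hypothetical helper name. Using the displayed scores gives about 10.095%; the small gap to the reported 10.0948% would be explained if the word-count score is rounded for display.

```python
def overall_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores.

    Assumed aggregation: the report does not confirm this formula.
    """
    if not scores:
        raise ValueError("at least one evaluator score is required")
    return sum(scores) / len(scores)


# Scores as displayed in the evaluation summary, in percent.
displayed_scores = [20.19, 0.00]
overall = overall_score(displayed_scores)
print(f"{overall:.4f}%")
```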
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The dim light of the single overhead bulb flickered over the crime scene map, which was actually a heavily annotated coloring page of a castle. Molly tapped a blunt red crayon against the drawing, her expression grim and her eyes narrowed with professional intensity. She looked like a battle-hardened general planning a siege rather than a woman planning to steal a Lego set. Bill sat opposite her, picking at the crinkled wrapper of a gummy worm with trembling fingers. The air in the basement was stale. Listen closely, Bill. This is the most delicate operation of our careers, Molly whispered. It is a seven-year-old’s birthday party, Molly. Not the Louvre, Bill replied. That is exactly why the security is so unpredictable. We are dealing with unregulated sugar-fueled chaos. I do not like it. The target is a kid in a dinosaur costume. Molly ignored his whining and pointed a calloused finger at a crude drawing of a bouncy castle. The lines were shaky and drawn in pink crayon, but to her, it was a tactical masterpiece. The castle looked more like a marshmallow. The bouncy castle is our extraction point. We enter through the side gate, bypass the juice box station, and head straight for the gift pile. What about the perimeter guard? The one with the spatula? That is just Timmy’s dad. He will be occupied with the hot dogs. Our window is between the singing and the piñata. Bill wiped a bead of sweat from his lip with a napkin that featured a smiling cupcake. He adjusted his black tactical vest, which was actually a fishing vest stuffed with snacks and gadgets. The tension was palpable. And if the dog bites me? The golden retriever? He looks like a heavy hitter. Buster is a distraction. Use the squeaky toy. It is a high-frequency lure. A rubber chicken? We are using a rubber chicken to neutralize a dog? It is a proven tactic. Do not question it. I am just saying, if we get caught, I am telling the cops I was just here for the cake. 
Molly slammed her hand onto the table, making the gummy worms jump. She leaned forward, her shadow looming large against the damp basement wall. She looked dangerous. There will be no cake! The cake is a secondary objective. We are here for the high-value electronics and the premium building blocks. Fine, fine. High-value blocks. Got it. Bill reached into his bag and pulled out neon-colored sunglasses. He put them on with a flourish, looking entirely ridiculous in the dim light. He squinted through the lenses. Do these help with the stealth aspect? They make you look like a disgruntled teenager. Now, check your equipment. I have the empty bags, the silent socks, and the emergency glitter bomb. The glitter bomb is for the exit only. We cannot be tracked by residue. Understood. No glitter until the getaway. Molly stood up, her face a mask of professional determination. She grabbed her gear and prepared for the inevitable descent into suburban madness. Operation Rainbow Sparkle is a go. God help us all, Bill muttered.
created_at: "2026-05-10T12:13:25.609Z" (5/10/2026, 12:13:25 PM)
model: "google/gemma-4-26b-a4b-it-20260403"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 268
moderation_latency: (null)
generation_time: 132883
tokens_prompt: 77
tokens_completion: 12139
native_tokens_prompt: 81
native_tokens_completion: 12086
native_tokens_completion_images: (null)
native_tokens_reasoning: 11344
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.0036339
router: (null)
provider_responses:
  0:
    endpoint_id: "e2a16fa6-1f65-413e-9ecb-93901089ddb1"
    id: "id-1778415205767"
    is_byok: false
    latency: 165
    model_permaslug: "google/gemma-4-26b-a4b-it-20260403"
    provider_name: "Cloudflare"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778415205-YIiZwB5npiE4bwCYagVH"
session_id: (null)
api_type: "completions"
id: "gen-1778415205-qZKhNgp6CsQda3TEaomW"
upstream_id: "id-1778415205767"
total_cost: 0.0036339
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Cloudflare"
response_cache_source_id: (null)
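The token and timing fields above can be combined into a few derived figures. The sketch below rests on two assumptions that the report does not confirm: that `native_tokens_completion` includes the reasoning tokens, and that `generation_time` is in milliseconds (132883 ms is about 2m 13s, which matches the reported Duration). `summarize_usage` is a hypothetical helper name.

```python
def summarize_usage(completion_tokens: int, reasoning_tokens: int, generation_ms: int) -> dict:
    """Derive visible-output size, reasoning share, and throughput from run metadata.

    Assumes completion_tokens includes reasoning_tokens and that
    generation_ms is measured in milliseconds.
    """
    if completion_tokens <= 0 or generation_ms <= 0:
        raise ValueError("completion_tokens and generation_ms must be positive")
    seconds = generation_ms / 1000
    return {
        "visible_tokens": completion_tokens - reasoning_tokens,
        "reasoning_share": reasoning_tokens / completion_tokens,
        "tokens_per_second": completion_tokens / seconds,
        "seconds": seconds,
    }


# Values copied from native_tokens_completion, native_tokens_reasoning, generation_time.
summary = summarize_usage(12086, 11344, 132883)
print(summary["visible_tokens"])  # 742
print(f"{summary['reasoning_share']:.1%} of completion tokens were reasoning")
print(f"{summary['tokens_per_second']:.1f} tokens/s over {summary['seconds']:.1f} s")
```

Under these assumptions, roughly 94% of the completion tokens went to reasoning, leaving 742 native tokens for the visible scene.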
Evaluation details
Result | Evaluator | Details | Meta Data
20.19% | Matches word count | n/a | neededClean: false, words: 520
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 50.00% | neededClean: false, wordsTotal: 528, wordsDialogue: 0
10.0948% | Overall
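The dialogue evaluator reports wordsDialogue of 0 out of 528 total words, even though the scene is mostly spoken lines. One plausible reading is that the evaluator counts only words inside quotation marks, and the model response contains none. The evaluator's actual method is not shown in this report, so the sketch below is an assumption; `dialogue_ratio` and `deviation_from_target` are hypothetical names, and the simple whitespace word split may not match the evaluator's tokenization.

```python
import re

# Matches straight ("...") or curly (“...”) double-quoted spans.
QUOTED_SPAN = re.compile(r'"([^"]*)"|“([^”]*)”')


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio), counting only quoted words as dialogue."""
    total = len(text.split())
    dialogue = sum(len((straight or curly).split())
                   for straight, curly in QUOTED_SPAN.findall(text))
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation_from_target(ratio: float, target: float = 0.5) -> float:
    """Absolute distance between the measured ratio and the requested 50% target."""
    return abs(ratio - target)


# The same line from the response, without and with quotation marks.
unquoted = ("Listen closely, Bill. This is the most delicate operation "
            "of our careers, Molly whispered.")
quoted = ('"Listen closely, Bill. This is the most delicate operation '
          'of our careers," Molly whispered.')

for sample in (unquoted, quoted):
    words_dialogue, words_total, ratio = dialogue_ratio(sample)
    print(words_dialogue, words_total, f"{deviation_from_target(ratio):.2%}")
```

With a quote-based counter like this, unquoted speech yields a ratio of 0 and a deviation of 50% from the target, which matches the reported "Ratio: 0.00%, Deviation: 50.00%".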