NC Bench
Run: 2026-05-10-01KR8ZDHDBRT5ZFRC66HBCCH32
Run Details
Model: Gemma 4 26B (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 1m 35s
Cost: $0.002432
Provider: Cloudflare
Evaluation summary
Matches word count: 43.38%
Dialogue to Total Word Ratio: 75.18%
Overall: 59.2791%
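The page does not state how the overall figure is aggregated. As a rough sketch, assuming it is the unweighted mean of the two evaluator scores (an assumption, not something the page confirms), the displayed values can be checked like this. The function name `overall_score` is hypothetical.

```python
from statistics import mean

# Evaluator scores as displayed in the summary (percentages, already rounded).
scores = {
    "Matches word count": 43.38,
    "Dialogue to Total Word Ratio": 75.18,
}


def overall_score(evaluator_scores):
    """Unweighted mean of the evaluator scores (assumed aggregation rule)."""
    return mean(list(evaluator_scores.values()))


overall = overall_score(scores)
print(f"{overall:.2f}%")
```

The result lands within rounding distance of the reported 59.2791%, which is consistent with an average taken over unrounded evaluator scores, though that remains a guess.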
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly slammed her finger onto the center of the crumpled construction paper, treating the glitter-covered surface with the solemnity of a general studying a topographical map of the Alps. The map was a chaotic mess of crayon-drawn stick figures and smeared primary colors, but in the dim, flickering light of their basement hideout, it served as a perfect blueprint for the upcoming operation. She adjusted her tactical goggles—a pair of neon-pink swimming lenses she had scavenged from a garage sale—and stared intensely at the target. "The perimeter is secured by a bouncy castle, Bill," Molly whispered, her voice dropping to a conspiratorial rasp. "If we hit the inflatable, we risk a bounce-induced vertigo that could compromise our entire extraction." Bill leaned over the table, squinting at a drawing of a lopsided cake. He looked less like a master infiltrator and more like a man who had lost a fight with a craft store’s clearance bin. His utility belt, a repurposed gardening tool belt, clattered with the sound of a half-eaten granola bar and a single, mismatched plastic dinosaur. He suggested that the bouncy castle might actually be a tactical advantage, a potential platform for a high-altitude drop into the gift pile, but Molly was having none of it. "It’s a death trap!" she hissed, her eyes narrowing behind the pink plastic. "One wrong step and you're flailing like a landed trout while a six-year-old watches in judgment. We go through the sliding glass door. Silent. Stealthy. Like ninjas in high-performance polyester." Bill shifted uncomfortably, his eyes darting toward the corner of the room as he asked about the danger posed by the Beast. When Molly suggested he meant the family Golden Retriever, Bill corrected her with a trembling voice, warning her about the true terror of the party: the toddler in the dinosaur pajamas. "We wait for the sugar crash," Molly said, tapping a diagram of a nap mat. 
"Once the parents are distracted by the cake, we bag the Lego sets and vanish before the singing begins." "Is the piñata part of the score?" Bill asked, his eyes brightening at the prospect of high-quality hard candy. "The piñata is a diversion, Bill," Molly replied, her voice dripping with professional disdain. "It is a chaotic variable designed to draw eyes away from the real loot. We are here for the high-end electronics and the building blocks." Bill saluted with a hand still sticky from a previous snack. Molly rolled up the construction paper with heavy, dramatic gravity. She checked her watch, a cardboard sundial she had taped to her wrist for the aesthetic, and stood up. "Check your gear. If we are going to pull off the heist of the decade, we cannot be defeated by a layer of spilled apple juice." "I am ready," Bill said, his focus already drifting toward a nearby bowl of pretzels. "Let's get those presents."
created_at: "2026-05-10T12:57:25.682Z" (5/10/2026, 12:57:25 PM)
model: "google/gemma-4-26b-a4b-it-20260403"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 302
moderation_latency: (null)
generation_time: 94550
tokens_prompt: 77
tokens_completion: 8638
native_tokens_prompt: 81
native_tokens_completion: 8081
native_tokens_completion_images: (null)
native_tokens_reasoning: 7902
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.0024324
router: (null)
provider_responses:
  0:
    endpoint_id: "e2a16fa6-1f65-413e-9ecb-93901089ddb1"
    id: "id-1778417845933"
    is_byok: false
    latency: 256
    model_permaslug: "google/gemma-4-26b-a4b-it-20260403"
    provider_name: "Cloudflare"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778417845-SdYYmsuF6thOctzM56dt"
session_id: (null)
api_type: "completions"
id: "gen-1778417845-JA8FyLV15JszSSweIzRX"
upstream_id: "id-1778417845933"
total_cost: 0.0024324
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Cloudflare"
response_cache_source_id: (null)
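The timing fields above can be related to the "Duration" shown in the run details. The following is a small sketch, assuming `generation_time` is in milliseconds and that `native_tokens_completion` counts every token generated during that time; neither unit nor scope is stated on the page.

```python
# Values copied from the metadata above.
generation_time_ms = 94550          # assumed to be milliseconds
native_tokens_completion = 8081     # assumed to cover all generated tokens

# Convert to seconds and split into minutes and seconds for display.
seconds = generation_time_ms / 1000
minutes, rem_seconds = divmod(round(seconds), 60)

# Approximate generation throughput under the assumptions above.
tokens_per_second = native_tokens_completion / seconds

print(f"{minutes}m {rem_seconds}s")
print(f"{tokens_per_second:.1f} tokens/s")
```

Under the millisecond assumption the duration rounds to the same 1m 35s reported in the run details, which lends some support to that reading of the field.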
Evaluation details
Matches word count
Result: 43.38%
Details: n/a
Meta Data: neededClean: false, words: 483
Dialogue to Total Word Ratio
Result: 75.18%
Details: Ratio: 32.39%, Deviation: 2.39%
Meta Data: neededClean: false, wordsTotal: 494, wordsDialogue: 160
Overall: 59.2791%
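The ratio and deviation reported for the dialogue evaluator follow directly from the word counts in its metadata. Here is a minimal sketch of that arithmetic, assuming the deviation is the absolute difference from the 30% target named in the scenario. The names `TARGET_RATIO` and `dialogue_ratio` are illustrative, and how the evaluator maps the deviation to its 75.18% score is not shown on the page, so it is not reproduced here.

```python
# Target share of dialogue words, taken from the scenario ("30% dialogue").
TARGET_RATIO = 0.30


def dialogue_ratio(words_dialogue, words_total):
    """Fraction of all words that appear inside dialogue."""
    return words_dialogue / words_total


# Word counts from the evaluator metadata: wordsDialogue and wordsTotal.
ratio = dialogue_ratio(160, 494)
deviation = abs(ratio - TARGET_RATIO)

print(f"Ratio: {ratio * 100:.2f}%, Deviation: {deviation * 100:.2f}%")
```

Note that the two evaluators report different totals (483 versus 494 words), so they presumably tokenize the response differently; this sketch uses only the dialogue evaluator's own counts.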