NC Bench
Run: 2026-05-10-01KR91NVGR4JY9128AEEGM2BMX
Run Details
Model:
Gemma 4 26B (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 50% dialogue
Duration:
1m 48s
Cost:
$0.003546
Provider:
Cloudflare
Evaluation summary
Score | Evaluator
100.00% | Matches word count
82.39% | Dialogue to Total Word Ratio
91.1927% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the coloring book page across the stained wooden table. She tapped a purple crayon mark with surgical precision, her eyes narrowed in intense, professional concentration. The dim basement felt heavy with the scent of impending chaos. "This is the drop zone," Molly said. Bill squinted at the scribbled mess, leaning into the harsh, yellow light of the single, flickering bulb that hung precariously from the ceiling. "It looks like a toddler drew a mountain range." "That is the bouncy castle, Bill. Do not disrespect the target." "It looks inflatable and highly unstable," Bill noted, poking at a smudge of blue wax with a cautious, trembling finger. Molly ignored his hesitation and pointed to a pink smudge near the bottom of the page, where the crayon strokes were thick and chaotic. "The juice box station is heavily guarded by a group of high-energy toddlers. They move in packs. Do not make eye contact." "What if they offer me a gummy bear?" Bill asked, his voice trembling slightly. "That is a trap. They use sweets to lower your defenses." Bill shuddered, pulling his collar up against the imaginary chill of the looming heist, feeling the weight of their impending failure. "They are ruthless." Molly pulled a small, laminated photo of a remote-controlled helicopter from her tactical vest and slid it across the table like a forbidden treasure. "Our primary objective is the gift pile. Once the singing starts, we move in." "The singing is the most dangerous part," Bill whispered. "Exactly. The chaos of the Happy Birthday song provides the perfect acoustic cover for our exit." Bill adjusted his tactical goggles, which were actually just neon green swimming goggles that smelled faintly of chlorine and childhood. "What about the clown? I heard he carries a balloon animal contingency plan." "He is a distraction, Bill. We ignore the clown and focus on the Lego sets. The medium-sized Starship kits are our priority." "And the plushies?" "Secondary loot. 
Only if the weight limit allows." Bill looked down at his oversized backpack, which was decorated with bright, cartoonish stars and glittery moons that caught the light. "I feel like we should be wearing masks. Dark, intimidating masks." "We are wearing dinosaur onesies, Bill. It is the only way to blend in. A T-Rex is invisible in a room full of six-year-olds." "It is a bold strategy," Bill admitted. "It is the only strategy. Now, check your gear." Bill patted his many pockets with nervous, frantic energy, his fingers fumbling over the various stolen trinkets and juice boxes. "I have the juice boxes, the stickers, and the emergency slingshot." "And the getaway vehicle?" Molly demanded. "The tricycle is primed and ready for high-speed maneuvering. I added streamers for extra speed," Bill replied. Molly sighed, rubbing her temples as she stared at her partner with a look of profound, soul-crushing regret. "We are the most incompetent duo in the criminal underworld." "But we have the best snacks," Bill reminded her. "Just get in the van, Bill."
created_at
"2026-05-10T13:36:55.327Z"
5/10/2026, 1:36:55 PM
model
"google/gemma-4-26b-a4b-it-20260403"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
161
moderation_latency
(null)
generation_time
108283
tokens_prompt
77
tokens_completion
11857
native_tokens_prompt
81
native_tokens_completion
11794
native_tokens_completion_images
(null)
native_tokens_reasoning
11072
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
service_tier
(null)
usage
0.0035463
router
(null)
provider_responses
0
endpoint_id
"e2a16fa6-1f65-413e-9ecb-93901089ddb1"
id
"id-1778420215476"
is_byok
false
latency
125
model_permaslug
"google/gemma-4-26b-a4b-it-20260403"
provider_name
"Cloudflare"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778420215-besrMjVH86HYq0kMYqYV"
session_id
(null)
api_type
"completions"
id
"gen-1778420215-HdV9Zgkkx8vnKGwLsHTg"
upstream_id
"id-1778420215476"
total_cost
0.0035463
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Cloudflare"
response_cache_source_id
(null)
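The headline figures in Run Details can be cross-checked against the raw metadata. The sketch below is illustrative only: it assumes generation_time is reported in milliseconds (which is consistent with the 1m 48s duration shown above) and that native_tokens_completion includes the reasoning tokens. The helper name format_duration is hypothetical and not part of the benchmark.

```python
# Values copied from the run metadata above.
generation_time_ms = 108283        # assumption: unit is milliseconds
native_tokens_completion = 11794
native_tokens_reasoning = 11072


def format_duration(ms: int) -> str:
    """Render a millisecond count as 'Xm Ys', dropping fractional seconds."""
    total_seconds = ms // 1000
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


duration = format_duration(generation_time_ms)

# Average generation speed over the whole completion, reasoning included.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Share of completion tokens spent on reasoning rather than the visible scene.
# Assumption: reasoning tokens are counted inside native_tokens_completion.
reasoning_share = native_tokens_reasoning / native_tokens_completion
visible_tokens = native_tokens_completion - native_tokens_reasoning

print(duration)
print(f"{tokens_per_second:.1f} tokens/s")
print(f"{reasoning_share:.1%} reasoning, {visible_tokens} visible tokens")
```

Under these assumptions, most of the completion budget went to reasoning, and only a few hundred tokens remain for the roughly 500-word scene itself.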
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false, words: 501
82.39% | Dialogue to Total Word Ratio | Ratio: 51.97%, Deviation: 1.97% | neededClean: false, wordsTotal: 508, wordsDialogue: 264
91.1927% | Overall
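The ratio and deviation reported by the dialogue evaluator follow directly from its word counts, as this minimal sketch shows. Two points are assumptions rather than documented behaviour: the 50% target is taken from the scenario text, and the overall score is treated as an unweighted mean of the two evaluator scores. The mapping from a 1.97% deviation to a score of 82.39% is not stated on this page, so it is not reproduced here.

```python
# Counts copied from the "Dialogue to Total Word Ratio" meta data above.
words_dialogue = 264
words_total = 508
target_ratio = 0.50  # assumption: taken from the scenario "50% dialogue"

ratio = words_dialogue / words_total
deviation = abs(ratio - target_ratio)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Assumption: the overall score is the unweighted mean of the evaluator scores.
# The displayed 82.39% is rounded, so the mean can differ from the reported
# 91.1927% in the last decimals.
scores = [100.00, 82.39]
overall = sum(scores) / len(scores)

print(f"Overall (from rounded scores): {overall:.4f}%")
```

Note that the two evaluators report different totals (501 words versus 508), which suggests they tokenize words differently; the sketch uses each evaluator's own count as given.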