NC Bench
Run: 2026-05-10-01KR90G3DM5FSBHVT96C069MVQ
Run Details
Model:
Gemma 4 26B (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
1m 45s
Cost:
$0.002498
Provider:
Novita
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
91.85% | Dialogue to Total Word Ratio | (details)
45.9226% | Overall
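The overall figure appears consistent with an unweighted mean of the two evaluator scores. The sketch below checks that arithmetic. The averaging rule is an assumption inferred from the numbers shown, not something the page states, and `overall_score` is a hypothetical helper name.

```python
from statistics import mean

def overall_score(scores):
    """Unweighted mean of evaluator scores in percent (assumed rule)."""
    return mean(scores)

# Scores as displayed in the evaluation summary (already rounded to two decimals).
displayed = [0.00, 91.85]
overall = overall_score(displayed)
print(f"{overall:.4f}%")
```

Averaging the displayed scores gives about 45.925%, slightly above the reported 45.9226%. The small gap would be expected if the page averages the unrounded ratio score rather than the rounded 91.85%.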
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly slammed a crumpled piece of construction paper onto the grease-stained table. The neon colors screamed of suburban innocence, but her eyes were cold and calculated. The basement was dim, lit only by a single flickering bulb that cast long, dramatic shadows over their makeshift command center. "Listen up, Bill. This is the big one. Operation: Sugar Rush." Bill squinted at the crayon-drawn map, his hands trembling slightly as he adjusted his glasses. "Is that a dinosaur on the perimeter? Because I am telling you, Molly, I do not do reptiles. It is a psychological barrier." "It is a balloon arch, you idiot! Focus!" Molly tapped a thick smudge of pink wax with a sharpened pencil. "The target is a high-security residential zone. High density of sugar, low density of adult supervision. We hit the gift table at fourteen hundred hours, right when the piñata distraction begins." "And the extraction?" Bill asked, leaning into the light. "Do we use the getaway tricycle?" "We use the van, Bill! The nondescript white van!" Molly hissed, her voice dropping to a conspiratorial shadow. "We enter through the side gate, navigate the glitter-covered floor, and head straight for the loot. We are talking Legos, remote-controlled drones, the high-end sensory toys. Everything must be bagged and tagged." "What about the security?" Bill’s voice trembled. "I heard the birthday boy’s mother is a seasoned veteran in toddler-wrangling. She has eyes like a hawk and a grip like a vice." "We will use the accidental spill maneuver," Molly replied, her eyes gleaming with tactical intensity. "We drop a bag of pretzels, create a chaos window, and slip into the shadows. We will be ghosts, Bill. Silent, invisible ghosts." "And the clown?" Bill whispered, his eyes widening in genuine terror. "The clown is a wildcard, Molly. You cannot predict a clown. They are agents of pure chaos. One minute he is making a poodle, the next he is blowing a horn in your ear!" 
"The clown is our primary cover," Molly insisted, stabbing the map. "While he is performing his amateur balloon gymnastics, we move. We stay low, we stay quiet, and we avoid the bouncy castle at all costs. The physics of that inflatable terrain are too unpredictable for a stealth approach." "Fine," Bill sighed, rubbing his chin nervously. "But if I see a Labradoodle, I am out. Those things are trained in scent detection. They smell fear and frosting." "Just stick to the plan," Molly commanded. "No snacking on the cupcakes until we are in the vehicle. We are professionals, Bill. Not party guests." "Copy that," Bill said, saluting with a half-eaten pizza crust. "Operation: Sugar Rush is a go. But I am serious about the dinosaur nuggets."
created_at
"2026-05-10T13:16:18.236Z"
5/10/2026, 1:16:18 PM
model
"google/gemma-4-26b-a4b-it-20260403"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
363
moderation_latency
(null)
generation_time
104397
tokens_prompt
77
tokens_completion
6555
native_tokens_prompt
82
native_tokens_completion
6218
native_tokens_completion_images
(null)
native_tokens_reasoning
5857
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
service_tier
(null)
usage
0.00249786
router
(null)
provider_responses
0
endpoint_id
"03b96766-ef7e-454f-9519-c76b70654b49"
id
"87a1bc387824a8ebae141a6b3fda1abb"
is_byok
false
latency
319
model_permaslug
"google/gemma-4-26b-a4b-it-20260403"
provider_name
"Novita"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778418978-Vsi3my2forDYEfDfO5Fx"
session_id
(null)
api_type
"completions"
id
"gen-1778418978-Sbf2K82IYsSYcKWMi2Tx"
upstream_id
"87a1bc387824a8ebae141a6b3fda1abb"
total_cost
0.00249786
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Novita"
response_cache_source_id
(null)
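A few derived figures help when reading the metadata above. The sketch below is a rough calculation under two assumptions that the page does not confirm: that `generation_time` is in milliseconds (which matches the displayed duration of roughly 1m 45s), and that `native_tokens_reasoning` is counted inside `native_tokens_completion`. The variable names are illustrative only.

```python
# Values copied from the run metadata.
generation_time_ms = 104397          # assumed to be milliseconds
native_tokens_completion = 6218
native_tokens_reasoning = 5857       # assumed to be a subset of completion tokens
total_cost_usd = 0.00249786

# Tokens left for the visible answer once reasoning tokens are subtracted.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Share of the completion budget spent on reasoning.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Average generation throughput over the whole completion.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Cost per visible (non-reasoning) token.
cost_per_visible_token = total_cost_usd / visible_tokens

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
print(f"cost per visible token: ${cost_per_visible_token:.8f}")
```

Under these assumptions, about 94% of the completion tokens went to reasoning, leaving 361 tokens for the scene itself.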
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 452
91.85% | Dialogue to Total Word Ratio | Ratio: 71.30%, Deviation: 1.30% | neededClean: false, wordsTotal: 460, wordsDialogue: 328
45.9226% | Overall
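The reported ratio and deviation can be reproduced from the two word counts. The sketch below is a minimal illustration, not the evaluator's actual implementation. The quote-matching rule in `dialogue_word_count`, the whitespace tokenization, and all function names are assumptions. It does not attempt to reproduce how the 91.85% score is derived from the deviation, since the page does not state that mapping.

```python
import re

TARGET_RATIO = 70.0  # percent of dialogue requested in the scenario

def count_words(text):
    """Count whitespace-separated tokens (assumed tokenization)."""
    return len(text.split())

def dialogue_word_count(text):
    """Count words inside straight double quotes (assumed dialogue rule)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)

def ratio_percent(words_dialogue, words_total):
    """Dialogue words as a percentage of all words."""
    return 100.0 * words_dialogue / words_total

# Counts as reported in the evaluation details.
ratio = ratio_percent(328, 460)
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# Usage on a short sample: 5 of the 9 words are dialogue.
sample = 'Molly frowned. "We use the van, Bill!" she hissed.'
print(dialogue_word_count(sample), count_words(sample))
```

The first print matches the reported "Ratio: 71.30%, Deviation: 1.30%". Running the counting functions on the full response may not return exactly 328 and 460, because the evaluator's tokenization is not documented on this page.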