NC Bench
Run: 2026-05-10-01KR932NVPYGC97MGWGMX5Y3JP
Run Details
Model: Gemma 4 26B
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 59.8s
Cost: $0.000253
Provider: Novita
Evaluation summary
Score      Evaluator
60.28%     Matches word count
0.00%      Dialogue to Total Word Ratio
30.1376%   Overall
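The page does not state how the Overall figure is derived. The numbers shown are consistent with an unweighted mean of the two evaluator scores, which the short sketch below checks. The function name `overall_score` is hypothetical, and the averaging rule is an assumption inferred from this single run, not a documented NC Bench method.

```python
from statistics import mean

def overall_score(evaluator_scores):
    """Unweighted mean of evaluator scores, in percent (assumed rule)."""
    return mean(evaluator_scores)

# Scores as displayed on this page; the first is rounded to two decimals.
displayed = [60.28, 0.00]
overall = overall_score(displayed)
print(f"{overall:.2f}%")
```

Under that assumption the result agrees with the reported 30.1376% to within the rounding of the displayed word count score.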
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint across the sticky surface of the diner booth, her eyes narrowing as she traced the perimeter of the suburban bungalow. It was a tactical map, drawn in neon pink highlighter, featuring a detailed layout of the backyard bouncy castle. Bill squinted at the diagram, poking a sausage link toward the indicated location of the gift table. Is there really a perimeter alarm on a suburban patio, Molly? It looks more like a cluster of garden gnomes. The gnomes are a distraction, Bill, Molly hissed, leaning in so close her nose nearly touched the grease-stained paper. They are strategically placed to obscure the line of sight from the kitchen window. We move through the inflatable obstacle course. Bill leaned back, the vinyl seat squeaking in protest. The bouncy castle? You want us to perform a high-stakes heist via a series of rhythmic hops? I’m not built for gymnastics. I have a bad knee and a very low center of gravity. It is the only way to bypass the sensory range of the toddler security detail, she replied, her voice dropping to a dramatic whisper. Once we breach the bouncy castle, we hit the main hall. The presents are concentrated near the cake table. The cake table is a death trap, Bill countered, waving a fry for emphasis. There will be sugar-crazed five-year-olds running in erratic, unpredictable patterns. They are the most volatile element in any criminal enterprise. One spilled juice box and we are compromised. Molly tapped the map with a disciplined finger. We will utilize the distraction of the piñata. When the star of the show swings the stick, that is our window. We strike while the donkey is being beaten. Bill rubbed his temples, looking genuinely concerned for their professional reputation. We are professional thieves, Molly. We have cracked safes in Zurich. We have intercepted shipments in Macau. And now, you want me to wear a camouflage onesie and belly-crawl under a table full of Lego sets and stuffed bears. 
It is not a onesie, it is tactical fleece, she corrected him sharply. And we are not just taking any presents. We are targeting the high-value items. The remote-controlled drones, the motorized toy cars, the unboxed electronics. The loot is mostly plastic junk and half-eaten gummy worms, Bill muttered, though he was already reaching for a napkin to sketch out an escape route. What if the birthday boy sees us? He looks like a kid who doesn't let go of his toys. He is three, Bill. His attention span is shorter than my patience, Molly said, folding the map with military precision. We enter through the bouncy castle, neutralize the cake, and extract the goods before the singing of Happy Birthday begins. Fine, Bill sighed, sliding out of the booth. But if I get stuck in a slide, I am leaving you behind to face the wrath of the parents.
created_at
"2026-05-10T14:01:24.094Z"
5/10/2026, 2:01:24 PM
model
"google/gemma-4-26b-a4b-it-20260403"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
7686
moderation_latency
(null)
generation_time
59778
tokens_prompt
77
tokens_completion
726
native_tokens_prompt
79
native_tokens_completion
607
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
service_tier
(null)
usage
0.00025307
router
(null)
provider_responses
0
endpoint_id
"03b96766-ef7e-454f-9519-c76b70654b49"
id
"24849123e676709018136e20f9c7d106"
is_byok
false
latency
7536
model_permaslug
"google/gemma-4-26b-a4b-it-20260403"
provider_name
"Novita"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778421684-znwCPc2mckTpj6CSdiBy"
session_id
(null)
api_type
"completions"
id
"gen-1778421684-CS6ChCr0oYaATu7pnuCl"
upstream_id
"24849123e676709018136e20f9c7d106"
total_cost
0.00025307
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Novita"
response_cache_source_id
(null)
Evaluation details
Evaluator: Matches word count
Result: 60.28%
Details: n/a
Meta Data: neededClean = false, words = 485

Evaluator: Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 0.00%, Deviation: 30.00%
Meta Data: neededClean = false, wordsTotal = 494, wordsDialogue = 0
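The Meta Data above reports wordsDialogue as 0 even though the scene is mostly speech. The response contains no quotation marks, so an evaluator that detects dialogue by quote delimiters would find none. The sketch below illustrates that kind of check. It is an assumption about the approach, not NC Bench's actual evaluator: the function name `dialogue_ratio`, the whitespace word split, and the quote-matching pattern are all hypothetical, so its word totals may differ from the 494 reported here.

```python
import re

# Text wrapped in straight or curly double quotes (assumed dialogue delimiter).
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')

def dialogue_ratio(text, target=0.30):
    """Return (ratio, deviation) of quoted words to total words."""
    total = len(text.split())
    # Each match yields two groups; only one of them is non-empty.
    spoken = sum(len((a or b).split()) for a, b in QUOTED.findall(text))
    ratio = spoken / total if total else 0.0
    return ratio, abs(ratio - target)

# The same sentence without and with quotation marks.
unquoted = "The gnomes are a distraction, Bill, Molly hissed."
quoted = '"The gnomes are a distraction, Bill," Molly hissed.'

print(dialogue_ratio(unquoted))
print(dialogue_ratio(quoted))
```

With no quotation marks the ratio is zero and the deviation equals the full 30% target, which mirrors the "Ratio: 0.00%, Deviation: 30.00%" detail reported for this run.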
Overall: 30.1376%