NC Bench
Run: 2026-05-10-01KR8VWKH9KF52PGZF1CMR6CF6
Run Details
Model: Gemma 4 26B (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 10m 37s
Cost: $0.000217
Provider: DekaLLM
Evaluation summary
Score | Evaluator
0.00% | Matches word count
54.12% | Dialogue to Total Word Ratio
27.0592% | Overall
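The page does not say how the Overall figure is derived. The numbers shown are consistent with an unweighted mean of the two evaluator scores, so the sketch below assumes that aggregation; the function name `overall_score` is hypothetical and not taken from NC Bench.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores, each given in percent.

    Assumption: Overall is a plain average; the page does not confirm this.
    """
    return sum(scores) / len(scores)


# Scores as displayed in the evaluation summary (already rounded).
word_count_score = 0.00
dialogue_ratio_score = 54.12

overall = overall_score([word_count_score, dialogue_ratio_score])
print(f"{overall:.2f}%")
```

With the rounded inputs this gives 27.06%. The page shows 27.0592%, which under the same assumption would correspond to an unrounded dialogue-ratio score of about 54.1184%.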
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
The blueprints were spread across a greasy diner table, illuminated by a flickering neon sign outside. Molly adjusted her tactical goggles, which were actually just cheap, neon-pink swimming goggles she had found in her junk drawer. She leaned in close, her expression grim. "Look at the layout, Bill. It's a fortress of fluff and frosting," Molly said, pointing a red laser pointer at a hand-drawn crayon sketch of a sandbox. Bill squinted through the red dot, leaning over his lukewarm coffee. "Is that a tactical minefield?" "It's a sandbox, Bill. Full of sand and, potentially, disgruntled toddlers." "Right. High-risk terrain. What's the perimeter look like?" Molly tapped the sketch with a gloved finger. "The perimeter is guarded by a bouncy castle. It's a nightmare for traction. We'll have to time our approach between the big jumps." "And if we get bounced into a group of six-year-olds?" "Then we use the distraction. The juice boxes." "The juice boxes? That's our primary diversion?" "They're high-fructose and highly effective. Once the sugar rush hits, the guards become unpredictable. They scream. They run in circles. That's when we move." Bill rubbed his temples, looking worried. "I don't like the screaming, Molly. It's bad for my nerves." "Focus! We're here for the heavy hitters. The LEGO Star Wars sets. The remote-controlled drones. Anything with a high resale value and low weight." "What about the stuffed animals? The giant teddy bears?" "Too much bulk. We're thieves, Bill, not professional movers. We need high density, low volume. Think battery-operated cars, not giant plushies." "Got it. High density, low volume. What about the clown?" "The clown is a wild card. If he starts twisting balloons into poodles, we abort the mission immediately." "Is the clown a direct threat?" "He's a psychological threat. The squeaking of the balloon is too much like a warning siren." "You're paranoid." "I'm prepared. Now, check your kit." 
Bill patted his various pockets, making a series of dull thuds. "I've got the juice boxes, the extra napkins, the emergency bubble wand, and a pack of dinosaur stickers." "The stickers? Why the stickers?" "In case we need to bribe a witness. A toddler is much more likely to cooperate if you offer them a T-Rex." Molly stared at him for a long moment before nodding. "Brilliant. Truly brilliant. Use them sparingly." "And the napkins?" "For the inevitable frosting spills. We can't leave fingerprints in the cake, Bill. We're professionals." "Right. No fingerprints. No glitter. No eye contact with the clown." "Exactly. This is the big one. The preschool circuit is brutal, but this party... this is the heist of a lifetime." Bill raised his coffee cup in a silent toast. "To the presents." "To the presents," Molly echoed, her eyes gleaming behind the pink plastic.
created_at
"2026-05-10T11:55:45.073Z"
5/10/2026, 11:55:45 AM
model
"google/gemma-4-26b-a4b-it-20260403"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
2906
moderation_latency
(null)
generation_time
636713
tokens_prompt
77
tokens_completion
7320
native_tokens_prompt
68
native_tokens_completion
644
native_tokens_completion_images
(null)
native_tokens_reasoning
6000
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
service_tier
(null)
usage
0.0002166
router
(null)
provider_responses
0
endpoint_id
"d99bcbe6-bec3-4991-965e-45fea79fe63c"
id
"chatcmpl-9cea3d76bd87dcde"
is_byok
false
latency
2906
model_permaslug
"google/gemma-4-26b-a4b-it-20260403"
provider_name
"DekaLLM"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778414145-7M7tEZ6xeTKX5HvCh1qi"
session_id
(null)
api_type
"completions"
id
"gen-1778414145-DTIhyFdBXETMflQHNPkW"
upstream_id
"chatcmpl-9cea3d76bd87dcde"
total_cost
0.0002166
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DekaLLM"
response_cache_source_id
(null)
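The metadata above can be cross-checked against the Run Details block. The sketch below assumes `generation_time` is in milliseconds (which matches the reported duration of 10m 37s) and that `native_tokens_reasoning` is counted separately from `native_tokens_completion`. Neither assumption is stated on the page, and the helper `format_duration` is hypothetical.

```python
# Values copied from the metadata above.
generation_time_ms = 636713
native_tokens_completion = 644
native_tokens_reasoning = 6000


def format_duration(ms):
    """Render milliseconds as 'Xm Ys', rounded to the nearest second."""
    total_seconds = round(ms / 1000)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


duration = format_duration(generation_time_ms)

# Assumption: reasoning tokens are not already included in the completion count.
generated_tokens = native_tokens_completion + native_tokens_reasoning
tokens_per_second = generated_tokens / (generation_time_ms / 1000)
reasoning_share = native_tokens_reasoning / generated_tokens

print(duration)
print(f"{tokens_per_second:.1f} tokens/s")
print(f"{reasoning_share:.1%} of generated tokens were reasoning")
```

Under these assumptions, roughly nine in ten generated tokens went to reasoning rather than to the visible scene, which would account for most of the run's duration.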
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 460
54.12% | Dialogue to Total Word Ratio | Ratio: 73.50%, Deviation: 3.50% | neededClean: false, wordsTotal: 468, wordsDialogue: 344
27.0592% | Overall
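The reported ratio and deviation follow directly from `wordsDialogue`, `wordsTotal`, and the 70% target in the scenario. The sketch below reproduces that arithmetic. It also includes a minimal, hypothetical way to count dialogue words, treating anything between straight double quotes as dialogue. The evaluator's actual counting rules are not shown on the page, so `count_dialogue_words` is an assumption and may not reproduce 344 and 468 on the full response. How the 3.50% deviation maps to the 54.12% score is likewise not shown and is not modeled here.

```python
import re

TARGET_RATIO = 0.70  # "70% dialogue" from the scenario


def count_words(text):
    """Count whitespace-separated words."""
    return len(text.split())


def count_dialogue_words(text):
    """Count words inside straight double quotes.

    Assumption: this quoting rule is illustrative only, not the
    evaluator's documented method.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in quoted_spans)


def ratio_and_deviation(words_dialogue, words_total, target=TARGET_RATIO):
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts reported in the evaluation details.
ratio, deviation = ratio_and_deviation(344, 468)
summary = f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}"
print(summary)  # Ratio: 73.50%, Deviation: 3.50%

# Counting sketch applied to the closing lines of the response.
sample = 'Bill raised his coffee cup in a silent toast. "To the presents."'
sample_counts = (count_dialogue_words(sample), count_words(sample))
print(sample_counts)  # (3, 12)
```

Note that the two evaluators report different totals for the same response (460 words versus 468), so they evidently tokenize the text differently; a whitespace split like the one above is only one possible choice.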