NC Bench
Run: 2026-05-10-01KR8ZW9QQKD64VMCGD1MW0M2Z
Run Details
Model: Gemma 4 26B (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 2m 0s
Cost: $0.001570
Provider: DeepInfra
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.13%
Overall: 0.0648%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Check your gear, Bill," Molly hissed, her eyes narrowing as she pointed to a hand-drawn map of a suburban backyard. "This is a high-stakes extraction. One misplaced balloon, one rogue glitter bomb, and the whole operation is blown." Bill adjusted his tactical vest, which was actually a child’s fishing vest with far too many pockets. "I’m ready, Molly. I’ve got the juice boxes, the extra napkins, and my emergency whistle." "No whistles!" Molly snapped. "We are ghosts, Bill. Shadows in the garden. We enter through the rosebushes, bypass the inflatable obstacle course, and hit the gift table before the cake is even sliced." "What if there’s a bounce house involved?" Bill whispered, his eyes wide with genuine terror. "I’ve seen the physics in action, Molly. It’s chaos. A vortex of tiny, screaming limbs. We could lose a shoe in there." "We avoid the bounce house at all costs," Molly commanded. "The centrifugal force is too unpredictable. We move silently along the perimeter, avoiding the primary sentry." "You mean Buster?" Bill asked. "Exactly. The golden retriever is a loose cannon. He’s friendly, which makes him even more dangerous. He’ll lick you into a state of complacency." "Noted. Keep the dog at bay," Bill muttered. "What’s the projected haul? Give me the intel." "High-density plastic constructions, remote-controlled terrestrial vehicles, and potentially a handheld gaming console if the target is a tech-savvy seven-year-old," Molly said, her voice dropping to a conspiratorial low. "I heard Timmy is getting a motorized dinosaur," Bill added, a hint of awe in his voice. "That’s a heavy lift, Molly. We might need the collapsible wagon." "We don't need a wagon! We need speed!" Molly hissed. "The distraction begins at 14:00 hours. That’s when the magician arrives. While the crowd is mesmerized by the rabbit and the subpar card tricks, we move." "Do you think the magician is an undercover cop?" Bill asked, peering suspiciously through his binoculars. 
"Don't be absurd, Bill. No officer would wear a cape that sequined. It’s a tactical nightmare." "Fair point. But what about the sugar rush? If the cake hits the table, the kids might enter a state of hyper-activity. It’ll be like a riot." "That is our window of opportunity!" Molly whispered urgently. "The chaos of the singing is our smoke screen. We grab the boxes, we bag the loot, and we vanish before the piñata is even cracked." "And if the piñata is filled with high-value assets?" Bill wondered. "Then we stay for the candy, Bill. We stay for the candy." "You're a visionary, Molly." "I know. Now, grab the tactical googly eyes. We're going in."
created_at: "2026-05-10T13:05:29.345Z" (5/10/2026, 1:05:29 PM)
model: "google/gemma-4-26b-a4b-it-20260403"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 226
moderation_latency: (null)
generation_time: 119409
tokens_prompt: 77
tokens_completion: 4892
native_tokens_prompt: 82
native_tokens_completion: 4600
native_tokens_completion_images: (null)
native_tokens_reasoning: 4190
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.00156974
router: (null)
provider_responses:
  0:
    endpoint_id: "d71c6981-347d-4622-a86b-2c9e5d0f2c6c"
    id: "chatcmpl-Roow856WMlJVAIOZ2AOudAAB"
    is_byok: false
    latency: 61
    model_permaslug: "google/gemma-4-26b-a4b-it-20260403"
    provider_name: "DeepInfra"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778418329-GoBiM9kGFS1DV4hclE2W"
session_id: (null)
api_type: "completions"
id: "gen-1778418329-iYvfi2Ec7vTBdR0Mslp5"
upstream_id: "chatcmpl-Roow856WMlJVAIOZ2AOudAAB"
total_cost: 0.00156974
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
response_cache_source_id: (null)
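The token and timing fields in the metadata can be combined into a few derived figures. The following is a rough sketch, not something the page itself reports. It assumes that native_tokens_reasoning is counted as part of native_tokens_completion and that generation_time is in milliseconds; neither is stated in the record, and the variable names for the derived values are made up for illustration.

```python
# Values copied from the run metadata.
native_tokens_completion = 4600
native_tokens_reasoning = 4190
generation_time_ms = 119409  # assumed to be milliseconds

# Assumption: reasoning tokens are a subset of completion tokens,
# so the remainder is the visible answer.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Share of the completion spent on reasoning.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Average generation throughput over the whole completion.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumptions, most of the completion budget went to reasoning rather than to the scene itself.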
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false, words: 435
Dialogue to Total Word Ratio
Result: 0.13%
Details: Ratio: 81.53%, Deviation: 11.53%
Meta Data: neededClean: false, wordsTotal: 444, wordsDialogue: 362
Overall: 0.0648%
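The reported ratio and deviation can be reproduced from the wordsDialogue and wordsTotal counts. The sketch below assumes the deviation is the absolute difference between the measured ratio and the 70% target in the scenario. The function name dialogue_ratio is hypothetical, and the page does not show how the evaluator turns this deviation into the 0.13% score.

```python
def dialogue_ratio(words_dialogue: int, words_total: int, target: float = 0.70):
    """Return (ratio, deviation) for a dialogue word count.

    Assumption: deviation is the absolute distance from the target ratio.
    """
    ratio = words_dialogue / words_total
    deviation = abs(ratio - target)
    return ratio, deviation


# Counts taken from the evaluator's meta data.
ratio, deviation = dialogue_ratio(words_dialogue=362, words_total=444)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 81.53%, Deviation: 11.53%"
```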