NC Bench
Run: 2026-05-10-01KR8WXJQ3FW0S5E9K3T9S2S88
Run Details
Model:
Gemma 4 26B (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 30% dialogue
Duration:
5m 38s
Cost:
$0.003798
Provider:
DeepInfra
Evaluation summary
Score
Evaluator
99.98%
Matches word count
89.20%
Dialogue to Total Word Ratio
94.5937%
Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the grease-stained blueprint across the scarred wooden table, her movements precise and practiced like a seasoned general preparing for war. This was not a high-security central bank vault or a professional diamond exchange featuring complex infrared motion sensors and reinforced steel plating. Instead, it was a detailed, hand-drawn floor plan of a modest, split-level ranch located in a quiet, leafy cul-de-sac. She tapped a small, bright red circle around the living room area with a tiny, silver laser pointer, her face illuminated by the dancing red dot. "The target is the pile under the weeping willow tree," Molly whispered, her eyes narrowing. "Is there a laser grid?" Bill asked, squinting at the crayon-drawn map. "No, Bill. Just a very sleepy golden retriever and a mother who hasn't had her coffee yet." "That's still high-level security," Bill muttered, reaching for a bag of gummy worms. "Focus! This is the big one. The haul could be massive." Bill leaned in much closer, squinting through his thick, heavy glasses while inspecting the tactical layout with extreme, almost painful intensity. He looked incredibly skeptical, his bushy brow furrowed deeply as he studied the perimeter of the backyard sandbox and the dangerous proximity of the inflatable slide. He reached for a plastic juice box, struggling unsuccessfully with the tiny, bent straw that refused to cooperate with his clumsy, trembling fingers. "What's the actual prize?" Bill asked. "The motherlode," Molly replied, her voice dropping an octave. "Limited edition LEGO sets, a remote-controlled drone, and at least three plushies that glow in the dark." "Sounds high stakes," Bill said, finally piercing the juice box with a loud, undignified slurp. "And the cake, Molly. We can't forget the cake." Molly paced the cramped, dimly lit kitchen, her black tactical vest clinking with an absurd amount of unnecessary equipment. 
She carried a heavy-duty flashlight, a professional lockpick set, a pair of night-vision goggles, and a very small pair of sewing scissors specifically intended for cutting curly ribbons. Bill was currently attempting to blend into the shadows, which mostly involved him leaning awkwardly against a humming refrigerator while trying to look like a professional operative. "We enter through the sliding glass door," Molly commanded, pointing to the entry point. "We move fast. No stopping for cake. No stopping for the bouncy castle." "What if there's a magician?" Bill inquired. "A magician is a wild card, Bill. We can't account for rabbit-related distractions. Stay focused." "I'm just saying, a man in a cape is a formidable obstacle," Bill argued. "We need a clean getaway," Molly added. "And I want a mask," Bill said. She slammed her hand on the table, rattling the empty juice boxes and the scattered gummy worms. The heavy gravity of the mission hung in the air, thick and suffocating, despite the faint, sweet smell of strawberry frosting drifting through the window from the neighbor's yard. They were not just common, low-rent thieves; they were specialists in high-velocity gift acquisition, and today, they were going for the gold.
created_at
"2026-05-10T12:13:45.578Z"
5/10/2026, 12:13:45 PM
model
"google/gemma-4-26b-a4b-it-20260403"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
627
moderation_latency
(null)
generation_time
337754
tokens_prompt
77
tokens_completion
11195
native_tokens_prompt
82
native_tokens_completion
11153
native_tokens_completion_images
(null)
native_tokens_reasoning
10385
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
service_tier
(null)
usage
0.00379776
router
(null)
provider_responses
0
endpoint_id
"d71c6981-347d-4622-a86b-2c9e5d0f2c6c"
id
"chatcmpl-RztJ78Bz6Uf9ggXI522DO6Ga"
is_byok
false
latency
507
model_permaslug
"google/gemma-4-26b-a4b-it-20260403"
provider_name
"DeepInfra"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778415225-QzwWCJBL9WZqTQ2XemBz"
session_id
(null)
api_type
"completions"
id
"gen-1778415225-BhkcBXGFP1xWjY17fEKP"
upstream_id
"chatcmpl-RztJ78Bz6Uf9ggXI522DO6Ga"
total_cost
0.00379776
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepInfra"
response_cache_source_id
(null)
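The metadata fields above can be cross-checked against the Run Details header with a little arithmetic. The sketch below is illustrative only: it assumes `generation_time` is reported in milliseconds (which is consistent with the 5m 38s duration) and that `native_tokens_completion` includes the reasoning tokens. The helper `format_duration` is a hypothetical name, not part of NC Bench.

```python
# Values copied from the run metadata above.
generation_time_ms = 337_754      # assumption: milliseconds
native_tokens_completion = 11_153
native_tokens_reasoning = 10_385
total_cost = 0.00379776


def format_duration(ms: int) -> str:
    """Render a millisecond count as 'Xm Ys', rounded to the nearest second."""
    seconds = round(ms / 1000)
    minutes, seconds = divmod(seconds, 60)
    return f"{minutes}m {seconds}s"


# Assumption: completion tokens include reasoning tokens, so the
# difference is the portion spent on the visible answer.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(format_duration(generation_time_ms))   # duration shown in Run Details
print(f"${total_cost:.6f}")                  # cost rounded to six decimals
print(visible_tokens)                        # tokens outside the reasoning trace
print(f"{reasoning_share:.1%}")              # share of completion spent reasoning
print(f"{tokens_per_second:.1f} tokens/s")   # rough throughput
```

Under these assumptions, most of the completion budget went to reasoning rather than to the roughly 500-word scene itself.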
Evaluation details
Result
Evaluator
Details
Meta Data
99.98%
Matches word count
n/a
neededClean
false
words
502
89.20%
Dialogue to Total Word Ratio
Ratio: 28.49%, Deviation: 1.51%
neededClean
false
wordsTotal
516
wordsDialogue
147
94.5937%
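The ratio and deviation reported above follow directly from the two word counts. The sketch below reproduces that arithmetic and adds a rough dialogue counter. The counter is an assumption for illustration (straight double quotes, whitespace-separated words) and is not the evaluator's actual tokenizer; note that the two evaluators report different totals, 502 and 516 words. Treating the overall score as the plain mean of the two evaluator scores is also an assumption; it agrees with the reported value only to within rounding. The function names and sample text are hypothetical.

```python
import re

TARGET_RATIO = 30.0  # percent of dialogue requested in the prompt


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Dialogue words as a percentage of all words."""
    return 100.0 * words_dialogue / words_total


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words).

    Assumption: dialogue is whatever sits between straight double quotes,
    and words are whitespace-separated tokens.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    total = len(text.split())
    return dialogue, total


# Counts taken from the evaluation details above.
ratio = dialogue_ratio(147, 516)
deviation = abs(TARGET_RATIO - ratio)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# Assumption: overall score is the unweighted mean of the evaluator scores.
overall_estimate = (99.98 + 89.20) / 2
print(f"Overall (estimated from rounded scores): {overall_estimate:.2f}%")

# Usage of the counter on a small made-up sample.
sample = 'Molly nodded. "We move fast," she said. "No cake."'
print(count_dialogue_words(sample))
```

How the 1.51% deviation maps to the 89.20% score is not stated on this page, so the sketch does not attempt to reproduce it.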