NC Bench
Run: 2026-05-10-01KR92P70CRJ4Q0073BMQ1MY0B
Run Details
Model: Gemma 4 31B (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 4m 29s
Cost: $0.002308
Provider: DeepInfra
Evaluation summary
Score | Evaluator
0.00% | Matches word count
68.58% | Dialogue to Total Word Ratio
34.2904% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly shifted her weight, the worn linoleum of the basement floor creaking beneath her boots. The room smelled of damp concrete and old newspapers, a fitting headquarters for an operation of this magnitude. She slapped a hand-drawn blueprint of a split-level ranch onto the grease-stained table, the paper fluttering under the hum of a flickering fluorescent light. She used a thick red marker to circle the living room in an aggressive, jagged loop that looked more like a crime scene than a floor plan. Bill leaned in, squinting through his cracked glasses and chewing rhythmically on a piece of cold, butterless toast. He looked less like a professional criminal and more like a man who had accidentally wandered into a conspiracy theory meeting. "The target is the gift table, situated strategically between the snack station and the bounce house," Molly declared. Bill asked if there was a dog. Molly sighed, her patience thinning as she adjusted her tactical headset, which was not connected to anything. She explained that there was a golden retriever, docile unless he saw a sandwich, meaning a pocket full of premium deli ham would neutralize him. Bill nodded slowly, scribbling ham on a crumpled notepad with a pencil that was barely a nub. "And the security?" "Ten six-year-olds on a sugar high," Molly whispered, her voice trembling with an intensity usually reserved for diamond heists. "It is a chaotic environment, Bill. Absolute anarchy. We strike during the cake cutting. That is the window of maximum distraction." The blueprint was smudged with coffee stains, but the layout of the foyer was unmistakable. Molly had spent three days observing the target house from a parked sedan, noting the exact timing of the ice cream truck arrival. Bill looked at the map, then back at Mollys wide, manic eyes, wondering if the risks outweighed the rewards of plastic toys. "Are we really doing this for LEGO sets and remote-control cars?" "It is about the principle, Bill! 
And the resale value of limited-edition Star Wars kits is skyrocketing," she snapped, slamming her hand on the table. "Now, the disguises. I have secured two oversized clown suits. We blend in, we swoop, we exit through the sliding glass door." Bill frowned, imagining himself encased in neon polka dots and oversized shoes. He admitted he could not wear the red foam nose because it made him sneeze. "Suffer in silence for the craft," Molly commanded. She pointed to a handwritten timeline that looked like it had been drafted by a military general. "We enter at 2:00 PM. By 2:30, the children will be reaching peak hyperactivity. When the parents start arguing about who gets the first slice of chocolate fudge, we clear the table. We have exactly ninety seconds before the first child notices their present is missing and triggers a sonic scream." Bill expressed concern about being spotted by a child. Molly reached into her tactical bag and produced a handful of brightly colored, holographic stickers. "Bribes. No child can resist a shimmering dinosaur. It is the perfect distraction." Bill blinked, his expression a mixture of terror and genuine curiosity about the stickers. He looked at the blueprint, feeling a strange surge of professional pride. "I will handle the ham." Molly nodded, her eyes gleaming with a misplaced sense of grandeur. "Precision, Bill. This is the biggest score of the suburban season."
created_at: "2026-05-10T13:55:00.878Z" (5/10/2026, 1:55:00 PM)
model: "google/gemma-4-31b-it-20260402"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 216
moderation_latency: (null)
generation_time: 243881
tokens_prompt: 77
tokens_completion: 6738
native_tokens_prompt: 82
native_tokens_completion: 6046
native_tokens_completion_images: (null)
native_tokens_reasoning: 5868
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.00230814
router: (null)
provider_responses:
  0:
    endpoint_id: "69279254-6a96-4e9d-abc6-ea76e6bf8e97"
    is_byok: false
    latency: 53
    model_permaslug: "google/gemma-4-31b-it-20260402"
    provider_name: "Novita"
    status: 429
  1:
    endpoint_id: "4149ab03-3b69-45a0-b94c-6c253b8641a1"
    id: "chatcmpl-R5lGCkmnNoR81obV7z9APeKB"
    is_byok: false
    latency: 46
    model_permaslug: "google/gemma-4-31b-it-20260402"
    provider_name: "DeepInfra"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778421300-53HSnt2q7isdmSeqqLrI"
session_id: (null)
api_type: "completions"
id: "gen-1778421300-TbkAjiq5SBLvfcVSufot"
upstream_id: "chatcmpl-R5lGCkmnNoR81obV7z9APeKB"
total_cost: 0.00230814
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 561
68.58% | Dialogue to Total Word Ratio | Ratio: 32.75%, Deviation: 2.75% | neededClean: false, wordsTotal: 568, wordsDialogue: 186
34.2904% | Overall
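The ratio and deviation reported above can be checked by hand from wordsDialogue and wordsTotal. The Python sketch below recomputes them and shows one plausible way to count dialogue words, namely the words between straight double quotes. The function name dialogue_ratio, the quote-matching rule, and the TARGET_RATIO constant are assumptions for illustration only: this page does not document how NC Bench splits words, nor how a 2.75-point deviation maps to the 68.58% score. The overall figure of 34.2904% is consistent with an unweighted mean of the two evaluator scores, but that is an inference from the numbers, not documented behavior.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a passage.

    Hypothetical helper: treats text between straight double quotes as
    dialogue and splits words on whitespace. The real evaluator may differ.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# Figures copied from the evaluation details of this run.
WORDS_DIALOGUE = 186
WORDS_TOTAL = 568
TARGET_RATIO = 0.30  # from the scenario: "30% dialogue"

reported_ratio = WORDS_DIALOGUE / WORDS_TOTAL
deviation_points = abs(reported_ratio - TARGET_RATIO) * 100

print(f"ratio: {reported_ratio:.2%}")                # ratio: 32.75%
print(f"deviation: {deviation_points:.2f} points")   # deviation: 2.75 points

# If the overall score is the unweighted mean of the two evaluator scores
# (an assumption), the second score before rounding would be:
implied_second_score = 2 * 34.2904 - 0.0
print(f"implied second score: {implied_second_score:.4f}%")

# Small usage example on a made-up sentence (not the scored response).
sample = 'Bill sighed. "And the security?" Molly shrugged.'
print(dialogue_ratio(sample))
```

Note that the two evaluators report different totals for the same response (561 words versus 568), which suggests they count words differently; the sketch does not try to reproduce either count.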