NC Bench
Run: 2026-05-10-01KR9367GT9YABKASRQCYG4X07
Run Details
Model: Gemma 4 31B
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 24.9s
Cost: $0.000238
Provider: DeepInfra
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0001% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"The primary target is the mahogany sideboard," Molly said, stabbing a finger at the blueprint. "That’s where the high-value loot congregates." Bill squinted at the drawing. "Is this a map of a living room or a toddler’s finger painting?" "It’s a tactical schematic, Bill. Focus. We enter through the patio doors at 14:00 hours. That’s peak chaos. The cake will be served, the sugar rush will hit, and the parents will be in a state of total psychological collapse." "Why are we stealing from six-year-olds?" Bill asked. "I feel like there are easier marks. Banks, maybe? Museums?" "Because the security is nonexistent," Molly snapped. "The only guard is a golden retriever with a penchant for eating socks. Plus, have you seen the gift lists? There are three limited-edition robotic dinosaurs and a genuine gold-plated tea set. It’s a goldmine." "And the disguise?" "We’re going in as the entertainment," she said, pulling two oversized, neon-pink bunny costumes from a garment bag. Bill stared at the fluff. "Absolutely not." "It’s the perfect cover! Nobody looks at the face of a person in a six-foot rabbit suit. We blend in, we distract them with a few mediocre magic tricks, and then we execute the Great Present Pivot." "The what?" "While I perform the 'Disappearing Coin' routine, you scoop the presents into the oversized carrot bags and slide them out the side exit. We’ll be gone before the first tantrum peaks." "I can't move in this, Molly. I have a bad hip. I'll look like a malfunctioning marshmallow." "You’ll look like a professional, Bill! Just keep your head down and don't let any of the children pull your ears." "What if they suspect us?" "Suspect us? We’re giant pink bunnies! The only thing they’ll suspect is that the party is actually fun for once. Now, do you have the distraction pellets?" "The what?" "The glitter bombs, Bill! If the parents get too curious, we trigger the glitter. It creates a shimmering curtain of confusion. 
In the time it takes them to realize their carpet is ruined for a decade, we’ll be halfway to the getaway car." "I hate you," Bill sighed, stepping into the plush leg-hole. "I really do." "Save the emotion for the heist," Molly said, adjusting her bunny ears with a predatory glint in her eye. "Now, let's practice your 'hop.' It needs to be whimsical yet efficient." "I'm not hopping," Bill grumbled. "Hop, Bill! Hop for the robotic dinosaurs!"
created_at: "2026-05-10T14:03:20.482Z" (5/10/2026, 2:03:20 PM)
model: "google/gemma-4-31b-it-20260402"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 145
moderation_latency: (null)
generation_time: 24846
tokens_prompt: 77
tokens_completion: 645
native_tokens_prompt: 79
native_tokens_completion: 600
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.00023827
router: (null)
provider_responses:
  0:
    endpoint_id: "4149ab03-3b69-45a0-b94c-6c253b8641a1"
    id: "chatcmpl-R14I3sMtgUIvoLF92HYFw9CK"
    is_byok: false
    latency: 56
    model_permaslug: "google/gemma-4-31b-it-20260402"
    provider_name: "DeepInfra"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778421800-gYHYW101Qkzy0VpneB8R"
session_id: (null)
api_type: "completions"
id: "gen-1778421800-9QXLA8rxZPwyBagUaM7D"
upstream_id: "chatcmpl-R14I3sMtgUIvoLF92HYFw9CK"
total_cost: 0.00023827
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 408
0.00% | Dialogue to Total Word Ratio | Ratio: 86.06%, Deviation: 16.06% | neededClean: false; wordsTotal: 416; wordsDialogue: 358
0.0001%
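The reported ratio and deviation follow from the word counts above: 358 dialogue words out of 416 total gives 86.06%, which is 16.06 percentage points from the 70% target. The sketch below shows one way such a ratio might be computed. It is not NC Bench's evaluator: the function names, the whitespace tokenization, and the rule that dialogue is any text between straight double quotes are all assumptions for illustration. The two evaluators on this page report different totals (408 and 416 words), so the real counting rules evidently differ from any single simple rule.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a scene.

    Assumptions (not taken from NC Bench): words are whitespace-separated
    tokens, and dialogue is whatever sits between straight double quotes.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio: float, target: float = 0.70) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Arithmetic check against the counts reported on this page.
reported_ratio = 358 / 416
print(f"Ratio: {reported_ratio:.2%}")                             # Ratio: 86.06%
print(f"Deviation: {deviation_from_target(reported_ratio):.2%}")  # Deviation: 16.06%
```

Running `dialogue_ratio` on the model response may not reproduce the reported counts exactly, since the evaluator's tokenization and quote handling are not shown on this page.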