NC Bench
Run: 2026-05-10-01KR8WDJ09EFF1JZARZ8QH2X89
Run Details
Model:
Gemma 4 31B
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
31.5s
Cost:
$0.000248
Provider:
DeepInfra
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
"I’m telling you, Bill, the perimeter is a nightmare," Molly said, slapping a crayon-drawn map onto the kitchen table. "It’s a backyard in Suburbia, Molly. What’s the perimeter? A picket fence?" "It’s a picket fence with a very aggressive Golden Retriever," she countered. "And the target zone is a bouncy castle. Do you have any idea how hard it is to maintain stealth while bouncing three feet in the air?" Bill squinted at the map. "Why are we stealing the presents? We're professionals. We do jewelry. We do bonds. Now we're doing Lego sets and glittery makeup kits?" "It’s about the volume, Bill! It’s a tenth birthday party for a sugar-crazed brat. There are at least twenty wrapped boxes. Some of those are high-end electronics. The resale value on an unboxed Nintendo Switch is a goldmine." "We're robbing children," Bill sighed. "I can handle the mob. I can handle the Interpol. But I can't handle the look on a seven-year-old's face when their dinosaur toy vanishes." "Focus. Now, the distraction. I’ll trigger the fire alarm in the kitchen, causing a mass exodus of parents and toddlers toward the patio. While they're arguing about whether the grill is actually on fire, you slip through the hedge." "And then what? I just walk out with twenty boxes? I don't have pockets for twenty boxes." "That’s why we have the tactical laundry basket." Bill stared at her. "The what?" "A laundry basket, Bill. It’s a domestic environment. Nobody questions a man carrying a laundry basket. You just scoop, dump, and dash." "I'm not carrying a laundry basket. It's undignified." "Do you want the Switch or do you want your dignity? Because you can't have both in a bouncy castle." "Fine. What about the cake? If we're already there, we might as well grab the cake." "Absolutely not. The cake is a variable. It’s fragile, it’s sticky, and it’ll leave a trail of frosting from the garden to the getaway car. We are ghosts, Bill. Frosting-free ghosts." "You're taking this way too seriously. 
It's a birthday party, not the Louvre." "The Louvre doesn't have a ball pit, Bill. The ball pit is a tactical liability. If you sink, I'm not coming in after you." "I can swim in plastic balls, Molly." "Just wear the disguise. And for heaven's sake, don't let the dog lick your face. It ruins the mood of the heist." "I hate this job," Bill muttered, reaching for the laundry basket. "I really, really hate this job."
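The "Dialogue to Total Word Ratio" evaluator compares the words spoken inside quotation marks with the total word count of the scene. The page does not show how the evaluator tokenizes text, so the following is only a minimal sketch under stated assumptions: words are whitespace-separated tokens, and anything between a pair of double quotes counts as dialogue. The function names `count_words` and `dialogue_word_ratio` are hypothetical and are not taken from NC Bench.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of a word)."""
    return len(text.split())


def dialogue_word_ratio(scene: str) -> float:
    """Return the share of words that sit inside double quotes.

    Assumption: quotes are balanced and dialogue is never nested.
    """
    # Normalize curly double quotes so one pattern covers both styles.
    normalized = scene.replace("\u201c", '"').replace("\u201d", '"')
    # Treat every span between a pair of double quotes as dialogue.
    spoken = re.findall(r'"([^"]*)"', normalized)
    words_dialogue = sum(count_words(span) for span in spoken)
    words_total = count_words(normalized)
    return words_dialogue / words_total if words_total else 0.0


# Tiny usage example: 3 dialogue words out of 5 total words.
sample = '"Hi there," she said. "Go."'
print(dialogue_word_ratio(sample))
```

A real evaluator may strip punctuation or treat contractions differently, so counts from this sketch will not necessarily match the reported figures.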
created_at
"2026-05-10T12:05:00.562Z"
5/10/2026, 12:05:00 PM
model
"google/gemma-4-31b-it-20260402"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
382
moderation_latency
(null)
generation_time
31410
tokens_prompt
77
tokens_completion
644
native_tokens_prompt
79
native_tokens_completion
625
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
service_tier
(null)
usage
0.00024777
router
(null)
provider_responses
0
endpoint_id
"4149ab03-3b69-45a0-b94c-6c253b8641a1"
id
"chatcmpl-RlhtCF5cgWKmbToaoOteX8R2"
is_byok
false
latency
52
model_permaslug
"google/gemma-4-31b-it-20260402"
provider_name
"DeepInfra"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778414700-MFoj4BC9doKz7yQgBMtH"
session_id
(null)
api_type
"completions"
id
"gen-1778414700-jxctM2GD7sw5mEmpkTKg"
upstream_id
"chatcmpl-RlhtCF5cgWKmbToaoOteX8R2"
total_cost
0.00024777
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepInfra"
response_cache_source_id
(null)
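The metadata above is enough for a rough throughput and cost estimate. The sketch below assumes that `generation_time` is reported in milliseconds (the page does not state units) and that dividing `total_cost` by the native prompt plus completion tokens gives a meaningful blended rate; providers usually price input and output tokens separately, so treat the second figure as an approximation only. The helper names are hypothetical.

```python
# Values copied from the run metadata above.
NATIVE_TOKENS_PROMPT = 79
NATIVE_TOKENS_COMPLETION = 625
GENERATION_TIME_MS = 31410  # assumed to be milliseconds
TOTAL_COST_USD = 0.00024777


def tokens_per_second(tokens: int, elapsed_ms: float) -> float:
    """Completion tokens generated per second of generation time."""
    return tokens / (elapsed_ms / 1000.0)


def blended_cost_per_million(cost_usd: float, total_tokens: int) -> float:
    """Average cost in USD per one million tokens, input and output combined."""
    return cost_usd / total_tokens * 1_000_000


speed = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
rate = blended_cost_per_million(
    TOTAL_COST_USD, NATIVE_TOKENS_PROMPT + NATIVE_TOKENS_COMPLETION
)
print(f"{speed:.1f} tokens/s, about ${rate:.3f} per million tokens (blended)")
```

Under these assumptions the run works out to roughly 20 completion tokens per second.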
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 416
0.00% | Dialogue to Total Word Ratio | Ratio: 92.65%, Deviation: 22.65% | neededClean: false, wordsTotal: 422, wordsDialogue: 391
0.0000%
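The reported ratio and deviation can be reproduced from the evaluator metadata and the targets in the prompt (500 words, 70% dialogue). This is a minimal sketch of that arithmetic only. How NC Bench converts a deviation into a 0.00% score, including any tolerance it allows, is not shown on the page and is not modeled here; the function names are hypothetical.

```python
# Targets taken from the prompt; counts taken from the evaluator metadata.
TARGET_WORDS = 500
TARGET_DIALOGUE_RATIO = 0.70
WORDS = 416            # "Matches word count" evaluator
WORDS_TOTAL = 422      # "Dialogue to Total Word Ratio" evaluator
WORDS_DIALOGUE = 391


def as_percent(fraction: float) -> float:
    """Convert a fraction to a percentage rounded to two decimals."""
    return round(fraction * 100, 2)


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of all words that are dialogue."""
    return words_dialogue / words_total


def ratio_deviation(ratio: float, target: float) -> float:
    """Absolute distance between the measured and target ratios."""
    return abs(ratio - target)


def word_count_shortfall(actual: int, target: int) -> float:
    """Fraction by which the response falls short of the target length."""
    return (target - actual) / target


ratio = dialogue_ratio(WORDS_DIALOGUE, WORDS_TOTAL)
print(as_percent(ratio))                                         # 92.65
print(as_percent(ratio_deviation(ratio, TARGET_DIALOGUE_RATIO)))  # 22.65
print(as_percent(word_count_shortfall(WORDS, TARGET_WORDS)))      # 16.8
```

The two evaluators report slightly different totals (416 and 422 words), which suggests they tokenize the response differently. Either way, the scene is well short of 500 words and far more dialogue-heavy than the requested 70%.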