NC Bench
Run: 2026-05-10-01KR8ZYCBHMNX5EZH02K84WBRH
Run Details
Model: Gemma 4 31B (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 2m 1s
Cost: $0.001553
Provider: DeepInfra
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly checked her watch and sighed. "Focus, Bill. The target is a seven-year-old's birthday bash at the Community Center." "Why a seven-year-old? Why not an eight-year-old? They get better LEGO sets," Bill replied, chewing on a toothpick. "Because this kid is the son of a venture capitalist. The presents aren't just blocks, Bill. We're talking high-end gadgets, custom-made drones, maybe even a miniature electric car." "I want a miniature electric car." "You're not keeping the car. We sell the loot." "Right. Right. Now, about the disguises. I'm not wearing the clown suit again." "The clown suit worked perfectly in the Topeka job," Molly said, ignoring his pout. "I had glitter in my pores for three months, Molly. Three months! I looked like a disco ball every time I sweated." "Suck it up. You're the distraction. You go in, you do some mediocre balloon animals, you create a chaotic diversion, and I slip into the gift room." "What if I'm actually good at the balloons?" "Don't be good. Be mediocre. If you're too good, they'll keep you there. If you're terrible, they'll kick you out. Mediocrity is the key to invisibility." "I can do mediocre. I'm a natural at mediocre." "Exactly. Now, the timeline. The cake is served at 3:00 PM. That's when the children are most distracted by sugar and greed." "Do I get a piece of cake?" "No!" "Just a small sliver? I'm fasting." "You ate a whole bag of chips five minutes ago." "That was a snack. Cake is a meal." "Listen to me. At 3:05, you trigger the Accidental Balloon Pop Explosion. I enter through the side vent, bag the presents, and exit via the laundry chute." "Is there a laundry chute?" "I checked the blueprints." "You got blueprints for a community center?" "I have my ways." "I just feel like we're underestimating the toddlers. They're feral, Molly. Pure chaos. I once saw a three-year-old take down a golden retriever." "That's why we're using them as cover. Who suspects two adults at a party where everyone is screaming? 
It's the perfect sonic camouflage." "True. I'll just make a really sad-looking dog out of a long balloon. Maybe a pigeon." "Perfect. Now put on the wig." "It's itching. Is this synthetic?" "It's a costume wig, Bill. Now put it on and try to look like a professional entertainer who is having a mid-life crisis." "I can do that. I've been practicing in the mirror." "Wait," Bill said, pausing at the door. "What if there's a magician?" "Then you out-mediocre the magician," Molly snapped. "And don't forget the bubble machine," Molly added. "Is the bubble machine part of the distraction or just for fun?" "It's for the distraction, Bill. Now get out there before the first guest arrives."
created_at: "2026-05-10T13:06:37.56Z" (5/10/2026, 1:06:37 PM)
model: "google/gemma-4-31b-it-20260402"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 434
moderation_latency: (null)
generation_time: 120909
tokens_prompt: 77
tokens_completion: 3841
native_tokens_prompt: 82
native_tokens_completion: 4059
native_tokens_completion_images: (null)
native_tokens_reasoning: 3112
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.00155308
router: (null)
provider_responses:
  0:
    endpoint_id: "4149ab03-3b69-45a0-b94c-6c253b8641a1"
    id: "chatcmpl-RHlGkqDJIr84vl0nB3LEbjay"
    is_byok: false
    latency: 41
    model_permaslug: "google/gemma-4-31b-it-20260402"
    provider_name: "DeepInfra"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778418397-Ra1dMCUKnpCxe6KBVfWN"
session_id: (null)
api_type: "completions"
id: "gen-1778418397-9SLES0uCxQISqpLulq9O"
upstream_id: "chatcmpl-RHlGkqDJIr84vl0nB3LEbjay"
total_cost: 0.00155308
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
response_cache_source_id: (null)
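As a rough reading of the metadata above, the following sketch derives generation throughput and the reasoning share of the completion. It assumes that generation_time is in milliseconds (which is consistent with the reported 2m 1s duration) and that native_tokens_reasoning is counted inside native_tokens_completion. Neither assumption is stated on the page, and the function name summarize_generation is hypothetical.

```python
def summarize_generation(completion_tokens: int,
                         reasoning_tokens: int,
                         generation_time_ms: int) -> dict:
    """Derive simple statistics from the run metadata.

    Assumptions (not confirmed by the page): generation_time_ms is in
    milliseconds, and reasoning tokens are a subset of completion tokens.
    """
    seconds = generation_time_ms / 1000
    return {
        # Generated tokens per second over the whole generation
        "tokens_per_second": completion_tokens / seconds,
        # Fraction of generated tokens spent on reasoning
        "reasoning_share": reasoning_tokens / completion_tokens,
        # Tokens left over for the visible response
        "visible_tokens": completion_tokens - reasoning_tokens,
    }


# Values copied from the metadata: native_tokens_completion,
# native_tokens_reasoning, generation_time
stats = summarize_generation(4059, 3112, 120909)
print(f"{stats['tokens_per_second']:.2f} tokens/s")
print(f"{stats['reasoning_share']:.2%} reasoning")
print(f"{stats['visible_tokens']} visible tokens")
```

Under those assumptions, roughly three quarters of the generated tokens went to reasoning rather than to the visible scene.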
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 459
0.00% | Dialogue to Total Word Ratio | Ratio: 94.27%, Deviation: 24.27% | neededClean: false; wordsTotal: 471; wordsDialogue: 444
0.0000%
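The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator can be reproduced from the word counts in its metadata. The sketch below is an illustration only: the helper names are hypothetical, the quote-matching regex is a simplifying assumption (straight double quotes, whitespace-separated words), and the page does not show how the benchmark itself counts words or maps the deviation to a score.

```python
import re


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words) for a scene.

    Assumption: dialogue is any text between straight double quotes,
    and words are whitespace-separated tokens.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return dialogue, total


def ratio_and_deviation(dialogue_words: int, total_words: int,
                        target_pct: float) -> tuple[float, float]:
    """Return (ratio, deviation) in percent and percentage points."""
    ratio_pct = 100 * dialogue_words / total_words
    return ratio_pct, abs(ratio_pct - target_pct)


# Counts copied from the evaluator metadata (wordsDialogue, wordsTotal),
# with the 70% target taken from the scenario description.
ratio, dev = ratio_and_deviation(444, 471, 70.0)
print(f"Ratio: {ratio:.2f}%, Deviation: {dev:.2f}%")
```

With the reported counts this gives a ratio of 94.27% and a deviation of 24.27 percentage points from the 70% target, matching the Details column.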