NC Bench
Run: 2026-05-10-01KR905J6RT4SGVZQHJ1CP6MN2
Run Details
Model: Gemma 4 31B
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 44.3s
Cost: $0.000245
Provider: Novita
Evaluation summary
| Score | Evaluator | Details Link |
| 0.49% | Matches word count | (details) |
| 0.00% | Dialogue to Total Word Ratio | (details) |
| 0.2460% | Overall | |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprints of the suburban living room across the kitchen table, pinning the corners down with half-eaten slices of cold pizza. She pointed a laser pointer at a small, carpeted area near the bay window. This is the kill zone, she whispered with an intensity usually reserved for nuclear launches. Bill squinted at the map, chewing on a toothpick. It is a bounce house, Molly. We are robbing a six-year-old’s birthday party, not the Louvre. The lack of professionalism is why you are still driving the getaway van, Molly snapped, sliding a detailed chart of the guest list toward him. We have a window of exactly twelve minutes between the cake cutting and the piñata disaster. That is when the parental supervision hits rock bottom. They will be too distracted by frosting-covered toddlers to notice two adults in oversized mascot costumes slipping through the patio doors. Bill looked at the costume bag on the floor. Why do I have to be the giant panda? I am allergic to faux fur. Because you have the girth for it, she countered, ignoring his pout. I am the glittery unicorn. I provide the distraction; you provide the muscle for the gift pile. Now, let's review the perimeter. The presents are clustered on the mahogany sideboard. High visibility, low security. Bill leaned back, crossing his arms. What about the dogs? You said there was a Golden Retriever. He is a marshmallow in fur, Bill. He can be neutralized with a single piece of premium ham. Molly reached into her tactical vest and produced a vacuum-sealed slice of deli ham, holding it up like a holy relic. I have already sourced the bait. Now, the extraction. Once the loot is bagged, we exit via the hedge row. If we are spotted, we deploy the smoke bombs. Bill frowned, staring at the small canisters. Smoke bombs? At a residential property in the suburbs? We will have the SWAT team on us before we hit the driveway. It creates a cinematic diversion, Molly argued, her eyes gleaming with unnecessary ambition. 
Besides, the kids will think it is part of the magic show. It is the perfect crime. Bill sighed, realizing there was no arguing with a woman who had a color-coded spreadsheet for stealing LEGO sets and Barbie Dreamhouses. Fine, but I want an extra cut for the allergy medication. Molly scribbled a note on the map with a red marker. Deal. But if you trip over your panda paws and knock over the punch bowl, I am leaving you behind to explain yourself to the mothers. Bill stood up and attempted to pull the panda head over his mask, resulting in a muffled thud as he walked straight into the kitchen doorframe. I think we need to rehearse the walk, he mumbled from inside the plush head.
created_at: "2026-05-10T13:10:32.928Z" (5/10/2026, 1:10:32 PM)
model: "google/gemma-4-31b-it-20260402"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 695
moderation_latency: (null)
generation_time: 44181
tokens_prompt: 77
tokens_completion: 694
native_tokens_prompt: 79
native_tokens_completion: 586
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.00024546
router: (null)
provider_responses:
  0:
    endpoint_id: "29b94af5-d8f8-419a-9cf1-ae1f2831142b"
    is_byok: false
    latency: 100
    model_permaslug: "google/gemma-4-31b-it-20260402"
    provider_name: "Venice"
    status: 429
  1:
    endpoint_id: "9d4644e9-a5e5-41f4-b810-d75f484d2c24"
    is_byok: false
    latency: 18
    model_permaslug: "google/gemma-4-31b-it-20260402"
    provider_name: "Parasail"
    status: 429
  2:
    endpoint_id: "69279254-6a96-4e9d-abc6-ea76e6bf8e97"
    id: "f0603c026dc19aa04d27c4ba74851872"
    is_byok: false
    latency: 544
    model_permaslug: "google/gemma-4-31b-it-20260402"
    provider_name: "Novita"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778418632-AHnVThOm5wy9iwinR0lb"
session_id: (null)
api_type: "completions"
id: "gen-1778418632-dS2BoZAvghbaXpL400UV"
upstream_id: "f0603c026dc19aa04d27c4ba74851872"
total_cost: 0.00024546
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Novita"
response_cache_source_id: (null)
Evaluation details
| Result | Evaluator | Details | Meta Data |
| 0.49% | Matches word count | n/a | neededClean: false, words: 473 |
| 0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 30.00% | neededClean: false, wordsTotal: 479, wordsDialogue: 0 |
| 0.2460% | | | |
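
The model response above contains no quotation marks, which may explain the wordsDialogue value of 0. The following is a minimal sketch of how a dialogue-to-total word ratio could be computed, assuming dialogue is identified as text inside double quotation marks. The evaluator's actual implementation is not shown on this page, so the names `dialogue_ratio`, `deviation`, and `QUOTED`, and the quote-matching rule itself, are hypothetical.

```python
import re

# Assumption: dialogue is any text enclosed in straight ("...") or
# curly (“...”) double quotes. The real evaluator may use another rule.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_ratio(text: str) -> float:
    """Return the fraction of words that fall inside double quotes."""
    total = count_words(text)
    if total == 0:
        return 0.0
    dialogue = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    return dialogue / total


def deviation(text: str, target: float = 0.30) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(target - dialogue_ratio(text))


# The same sentence without and with quotation marks.
unquoted = "This is the kill zone, she whispered."
quoted = '"This is the kill zone," she whispered.'

print(dialogue_ratio(unquoted))  # no quoted spans, so no dialogue words
print(dialogue_ratio(quoted))    # 5 of the 7 words are inside quotes
print(deviation(unquoted))       # full distance from the 30% target
```

Under this assumption, unquoted speech counts as zero dialogue words however much of the scene is spoken, and the deviation from a 30% target is the full 30 percentage points, which matches the Ratio and Deviation values reported above.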