NC Bench
Run: 2026-05-10-01KR8VEW775SX26766HACW49GA
Run Details
Model: Gemma 4 31B
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 58.5s
Cost: $0.000263
Provider: Novita
Evaluation summary
Score      Evaluator                       Details Link
0.00%      Matches word count              (details)
0.00%      Dialogue to Total Word Ratio    (details)
0.0000%    Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Are you telling me the security system is a golden retriever?" Molly asked, leaning over the blueprint of the suburban living room. "His name is Buster, and he is a menace," Bill replied, tapping a pencil against his chin. "He’s high-energy, motivated by tennis balls, and has a bark that can wake the dead." "I can handle a dog, Bill. Can we focus on the perimeter? What’s the entry point?" "The sliding glass door. It’s unlocked between three and four o'clock while the parents are frantically inflating balloons in the kitchen." Molly frowned. "Wait, why are we stealing children's presents? This is a new low, even for us. We stole a vintage toaster from a museum last month." "It’s not about the sentiment, Molly, it’s about the volume. Think about it. LEGO sets, remote-controlled drones, high-end tablets. It’s a concentrated hub of high-resale electronics. It’s practically a boutique store that delivers itself to one living room." "And the disguise?" "We go in as the entertainment," Bill said, gesturing to a pile of oversized neon polyester. "I’m the magician, you’re the assistant. We provide ten minutes of mediocre card tricks to distract the toddlers, then we execute the 'Great Gift Vanishing Act' while the cake is being sliced." "I am not wearing a sequined leotard, Bill." "It’s a professional costume! It commands respect." "It commands a lawsuit from the fashion police. And what about the children? They’re chaotic. They’re small. They’re basically tiny, unpredictable security guards." "Exactly. That’s why we bring the bribe." Bill produced a massive bag of gourmet lollipops. "One pop per child. We create a sugar-induced stupor, sweep the presents into the oversized magic hat, and exit through the patio before the first tantrum hits." "What if a parent asks for your credentials?" "I’ll tell them I’m from the 'Association of Slightly Confused Illusionists.' No one questions a man in a cape; they just assume he’s eccentric." "And if Buster catches us?" 
"I’ve got a pocket full of premium organic beef jerky. He’ll be my best friend for life." Molly sighed, looking at the sequined outfit. "If I get sticky residue on my boots, I’m charging you double for this job." "Deal. Now, practice your 'ta-da' face. We need to look convincing." "I can’t do a 'ta-da' face while I’m contemplating my life choices, Bill." "Just pretend you’re looking at a pile of cash. That usually works." "Fine. But I’m taking the LEGO Star Wars set for myself." "Touch the Millennium Falcon and you're walking home."
created_at: "2026-05-10T11:48:15.349Z" (5/10/2026, 11:48:15 AM)
model: "google/gemma-4-31b-it-20260402"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 3019
moderation_latency: (null)
generation_time: 58203
tokens_prompt: 77
tokens_completion: 686
native_tokens_prompt: 79
native_tokens_completion: 631
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.00026346
router: (null)
provider_responses:
  0:
    endpoint_id: "69279254-6a96-4e9d-abc6-ea76e6bf8e97"
    id: "4b8e8cee9ec4b483afdbb075a67bb41b"
    is_byok: false
    latency: 2868
    model_permaslug: "google/gemma-4-31b-it-20260402"
    provider_name: "Novita"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778413695-q2C2a8Mr1kzyV8LcUwTH"
session_id: (null)
api_type: "completions"
id: "gen-1778413695-pErGqk15xThcD420DoFM"
upstream_id: "4b8e8cee9ec4b483afdbb075a67bb41b"
total_cost: 0.00026346
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Novita"
response_cache_source_id: (null)
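The metadata above is enough to derive a rough throughput and blended cost figure for this run. The sketch below is illustrative only: it assumes generation_time is in milliseconds and total_cost is in US dollars, neither of which the page states, and the helper names are hypothetical.

```python
# Values copied from the run metadata above.
# ASSUMPTIONS: generation_time is milliseconds, total_cost is USD.
generation_time_ms = 58203
native_tokens_prompt = 79
native_tokens_completion = 631
total_cost_usd = 0.00026346


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Completion tokens generated per second of generation time."""
    return tokens / (elapsed_ms / 1000.0)


def cost_per_million_tokens(cost_usd: float, tokens: int) -> float:
    """Blended cost per one million tokens (prompt and completion combined)."""
    return cost_usd / tokens * 1_000_000


throughput = tokens_per_second(native_tokens_completion, generation_time_ms)
blended = cost_per_million_tokens(
    total_cost_usd, native_tokens_prompt + native_tokens_completion
)

print(f"{throughput:.2f} completion tokens/s")
print(f"${blended:.4f} per million tokens (blended)")
```

The blended figure mixes prompt and completion tokens, so it should not be read as the provider's published per-token price.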
Evaluation details
Result     Evaluator                       Details                             Meta Data
0.00%      Matches word count              n/a                                 neededClean: false; words: 418
0.00%      Dialogue to Total Word Ratio    Ratio: 89.18%, Deviation: 19.18%    neededClean: false; wordsTotal: 425; wordsDialogue: 379
0.0000%
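The reported ratio and deviation can be reproduced from the word counts above. The following is a minimal sketch, not NC Bench's actual evaluator: the function names are hypothetical, the quote-matching rule (text between double quotes counts as dialogue) is an assumption, and the page does not show how a deviation is converted into the 0.00% score.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: 70% dialogue
TARGET_WORDS = 500   # from the scenario: 500 words


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def dialogue_word_count(text: str) -> int:
    """Count words inside straight or curly double quotes (assumed rule)."""
    spans = re.findall(r'["“]([^"“”]*)["”]', text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_RATIO) -> tuple[float, float]:
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total if words_total else 0.0
    return ratio, abs(ratio - target)


# Counts reported in the evaluation details.
ratio, deviation = ratio_and_deviation(379, 425)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Relative distance of the reported word count (418) from the 500-word target.
word_shortfall = abs(418 - TARGET_WORDS) / TARGET_WORDS
print(f"Word count off by {word_shortfall:.1%}")
```

With the reported counts, 379 / 425 gives the 89.18% ratio, which is 19.18 percentage points above the 70% target. The two evaluators report different totals (418 and 425), so they presumably tokenize the response slightly differently.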