NC Bench
Run: 2026-02-20-01KHY9V3WHG72Y8GXYQRN9NXRY
Run Details
Model: Gemma 3 27B
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 23.8s
Cost: $0.000137
Provider: Novita
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.00%
Overall: 0.0000%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The chipped Formica of the diner booth felt cold under Molly’s elbows. She stabbed a French fry with unnecessary force. Bill, across from her, was meticulously constructing a miniature fortress out of sugar packets. It was a disconcerting habit when he was stressed. And he was definitely stressed. “So, let’s go over it again,” Molly said, her voice flat. “Because frankly, Bill, the thought of explaining to The Serpent why we almost got taken down by a bouncy castle is not appealing.” Bill didn’t look up. “It’s a very structurally sound design, you know. Defensive fortifications are key to any successful operation.” “We’re stealing presents, not conquering Westeros.” He finally looked up, blinking. “Right. Presents. Okay. The ‘Operation Birthday Bonanza’ rundown. Seven-year-old Timmy Henderson. Bouncing, screaming hordes. Highly concentrated gift density. Parental units distracted by cake and questionable party favors.” “Questionable how?” “They’ve got those little plastic rings you have to get into a bottle. Someone will be losing it. Guarantee it. Emotional vulnerability is our friend.” Bill patted his sugar packet citadel. “Now, the access point. Back garden, relatively unguarded. Mrs. Henderson appears to be a firm believer in the ‘let them run wild’ parenting philosophy. We go in during the pin the tail on the donkey phase. Maximum chaos.” Molly sighed. “Pin the tail on the donkey. It’s always the simple things. What about security? Does Timmy have a guard dog? A laser grid protecting the loot?” “Worse.” “Worse than a laser grid?” “A clown.” Bill shuddered dramatically. “Full face paint. Balloon animals. Unpredictable movements. I’ve done my research, Molly. This clown, ‘Buttons,’ is a legend. Known for aggressive confetti deployment.” “Confetti? Seriously?” “It gets *everywhere*. It compromises visibility. It’s a tactical nightmare.” He added another sugar packet to his fortress, clearly bolstering its defenses. 
“We need to be in and out before Buttons launches his final attack. Ten minutes, tops.” “Ten minutes to sift through a mountain of Barbie Dreamhouses and Lego sets? You know those parents buy things based on sheer cubic footage, right?” “Strategic prioritization, Molly! We target the electronics. Anything shiny, small, and easily fence-able. No My Little Ponies. No art supplies. Strictly high-value targets.” “And the getaway?” “The inflatable slide. We blend in, become one with the joyous, unsupervised mayhem. It’s brilliant, I tell you. Brilliant!” He beamed, finally looking pleased with himself. “You want us to slide out of a children’s party with a bag of stolen iPads?” Molly asked, raising a skeptical eyebrow. “Think of the irony, Molly! The glorious, beautiful irony!” Bill winked. “Plus, it's fast. And fun. Relatively.”
created_at
"2026-02-20T19:53:46.652Z"
2/20/2026, 7:53:46 PM
model
"google/gemma-3-27b-it"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1240
moderation_latency
(null)
generation_time
23768
tokens_prompt
93
tokens_completion
783
native_tokens_prompt
75
native_tokens_completion
642
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.000137325
router
(null)
provider_responses
0
endpoint_id
"cf530f36-7cb8-4eb9-8953-880eef952af3"
id
"3459e8e8-7b9b-4f19-b408-6c1677c6abec"
is_byok
false
latency
1155
model_permaslug
"google/gemma-3-27b-it"
provider_name
"Novita"
status
200
api_type
"completions"
id
"gen-1771617226-PSdEMO9lcBV3LTMfiNhG"
upstream_id
"3459e8e8-7b9b-4f19-b408-6c1677c6abec"
total_cost
0.000137325
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Novita"
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean = false, words = 432
Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 76.85%, Deviation: 46.85%
Meta Data: neededClean = false, wordsTotal = 432, wordsDialogue = 332
Overall: 0.0000%
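The ratio and deviation reported above can be reproduced from the recorded counts. The following is a minimal sketch, not the benchmark's actual evaluator: the function names are hypothetical, the quote-matching rule (text between curly or straight double quotes counts as dialogue) is an assumption, and the 30% target is taken from the scenario description. How the evaluator maps a deviation to a score is not shown on this page, so the sketch stops at the deviation.

```python
import re

# Assumption: dialogue is whatever sits between curly or straight double quotes.
# The real evaluator may delimit dialogue differently.
DIALOGUE_RE = re.compile(r'“([^”]*)”|"([^"]*)"')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def dialogue_stats(text: str) -> dict:
    """Return total and quoted word counts for a piece of prose."""
    spoken = sum(count_words(a or b) for a, b in DIALOGUE_RE.findall(text))
    return {"wordsTotal": count_words(text), "wordsDialogue": spoken}


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.30):
    """Return the dialogue share and its absolute distance from the target share."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts taken from the meta data of this run.
ratio, deviation = ratio_and_deviation(332, 432)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints: Ratio: 76.85%, Deviation: 46.85%
```

With 332 of 432 words inside quotes, the response is more than twice as dialogue-heavy as the 30% the prompt asked for, and at 432 words it also falls short of the requested 500.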