NC Bench
Run: 2026-02-20-01KHY9PGH793Q540BH6EPX7EEX
Run Details
Model: Gemma 3 27B
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 24.6s
Cost: $0.000311
Provider: Parasail
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.00%
Overall: 0.0000%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The chipped Formica of the diner booth stuck to Molly’s elbows. She was sketching a diagram of a bouncy castle on a napkin, labeling key points: “Air intake? Weak spot? Potential exit route?” Bill slurped his coffee, leaving a questionable brown ring in the bottom of the mug. “You’re overthinking the bouncy castle, Mol. It’s air filled. You poke it, it deflates. Not exactly Fort Knox.” “It’s a *diversion*, Bill. A crucial diversion. Think of the chaos. Small humans screaming, parents flailing, all while we’re upstairs liberating a treasure trove of LEGOs and video games.” Molly tapped the napkin with a pen. “Operation: Birthday Bandit is a delicate operation.” “Delicate? We’re stealing gifts from kids, Mol. It’s… decidedly undelicate.” “It’s *re-allocating* resources, Bill. Those children have too much. Think of the deserving. Think of our rent.” Molly glared. “And frankly, young Timothy Henderson is getting a pony. A *pony*. That’s just showing off.” “A pony! Seriously?” Bill choked on his coffee, spraying a fine mist across the table. “Okay, that changes things. A pony is excessive. Alright, let’s hear your grand plan then. Detail the bouncy castle strategy.” “Right. So, I go in disguised as a clown. Standard.” “Oh, please. Last time you were a clown, you terrified a toddler into a week-long silence.” “That toddler had *judgment*. Anyway, I cause a minor, yet convincing, ‘balloon animal malfunction’ right near the castle. Creates a bottleneck. You, meanwhile, go through the back, scale the rose trellis – you remember the trellis, right?” “Rose trellis. Ouch. Vaguely. I still have thorns in places I didn’t know existed.” “Focus, Bill! You bypass the parents distracted by the balloon-based pandemonium, head upstairs, and scoop up the gifts. We meet at the getaway vehicle – Brenda, naturally.” “Brenda still hasn’t forgiven us for the pigeon incident.” “Brenda understands the *necessity* of strategic vehicle deployment. And the pigeon was a lucky break. 
It bought us a full ten seconds.” Molly folded the napkin, her expression serious. “The key is timing. The party’s peak chaos is right after the cake. Sugar rush, parental exhaustion, pure opportunism for us.” “What about security?” “Mrs. Henderson. She’s the security. And she’s busy judging everyone’s parenting skills. She won't notice a thing.” Molly grinned. “Also, I’ve been to enough of these things. There’s always an aunt Mildred off in a corner, guarding the gift table like it’s the crown jewels. We just… circumvent Aunt Mildred.” “Circumventing Aunt Mildred sounds…complex. She’s surprisingly agile for someone who exclusively wears floral prints.” “Agility doesn’t stand a chance against a well-executed distraction, Bill. And the pony. Let’s not forget the pony.”
created_at: "2026-02-20T19:51:15.759Z" (2/20/2026, 7:51:15 PM)
model: "google/gemma-3-27b-it"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 342
moderation_latency: (null)
generation_time: 24579
tokens_prompt: 93
tokens_completion: 779
native_tokens_prompt: 75
native_tokens_completion: 677
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00031065
router: (null)
provider_responses:
  0:
    endpoint_id: "2f608ade-87b1-46ed-8ae7-2714dddf2abb"
    id: "cmpl-9337d23223b04ba8bcd528f5eb34720b"
    is_byok: false
    latency: 272
    model_permaslug: "google/gemma-3-27b-it"
    provider_name: "Parasail"
    status: 200
api_type: "completions"
id: "gen-1771617075-Nv6VqeDsEHFXrTg4BCUx"
upstream_id: "cmpl-9337d23223b04ba8bcd528f5eb34720b"
total_cost: 0.00031065
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Parasail"
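The Duration shown under Run Details (24.6s) appears to correspond to generation_time, and a rough throughput figure can be derived from the same metadata. The sketch below assumes generation_time is reported in milliseconds (the page does not state units) and uses the provider-native completion token count. The function name tokens_per_second is hypothetical and not part of NC Bench.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Return completion throughput, assuming the time is given in milliseconds."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from the run metadata.
generation_time_ms = 24579
native_tokens_completion = 677

duration_s = generation_time_ms / 1000.0
throughput = tokens_per_second(native_tokens_completion, generation_time_ms)
print(f"duration: {duration_s:.1f}s, throughput: {throughput:.1f} tokens/s")
```

Under the millisecond assumption the duration rounds to the 24.6s reported for the run, which is some evidence that the unit guess is right.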
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false, words: 437
Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 83.75%, Deviation: 33.75%
Meta Data: neededClean: false, wordsTotal: 437, wordsDialogue: 366
Overall: 0.0000%
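The reported ratio and deviation follow directly from the two counts in the metadata (wordsDialogue and wordsTotal) and the 50% target in the scenario. The sketch below recomputes them and also shows one possible way to obtain such counts from a response. The counting rule is an assumption, not the evaluator's documented method: it treats anything between curly double quotes as dialogue and splits words on whitespace. Both function names are hypothetical.

```python
import re

TARGET_RATIO = 0.50  # from the scenario: "50% dialogue"


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words) for text using curly double quotes.

    Assumption: dialogue is whatever sits between “ and ”, and words are
    whitespace-separated tokens. The real evaluator may tokenize differently.
    """
    spoken = re.findall(r"“([^”]*)”", text)
    dialogue_words = sum(len(span.split()) for span in spoken)
    total_words = len(text.split())
    return dialogue_words, total_words


def ratio_and_deviation(dialogue_words: int, total_words: int) -> tuple[float, float]:
    """Return (ratio, absolute deviation from TARGET_RATIO), both as percentages."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    ratio = dialogue_words / total_words
    return ratio * 100, abs(ratio - TARGET_RATIO) * 100


# Counts copied from the evaluation metadata.
ratio, deviation = ratio_and_deviation(366, 437)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
```

With the reported counts this prints the same ratio and deviation as the Details field. Note that under the quote-based assumption, quoted text that is not speech (for example, labels written on a napkin) would also be counted as dialogue.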