NC Bench
Run: 2026-02-20-01KHYAK1HZBH4A420GTQJTJF8Q
Run Details
Model: Gemma 3 27B
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 23.6s
Cost: $0.000115
Provider: DeepInfra
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.00%
Overall: 0.0000%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The chipped Formica of the diner booth felt sticky under Molly’s elbows. She stabbed a rogue French fry with a fork. “Okay, run it by me again, Bill. Just so I’m absolutely clear on the level of ridiculousness we’re operating at.”
Bill, meticulously folding and refolding his napkin into a perfect swan, sighed. “It’s not ridiculous, it’s… niche. A highly specific market. Untapped potential.”
“We’re stealing presents from a six-year-old’s birthday party.”
“Precisely. And not just *any* six-year-old. Barnaby Butterfield the Third. His father owns Butterfield Robotics. Think high-end, remotely operated…lawn gnomes. Extremely lucrative lawn gnomes.”
Molly blinked. “Lawn gnomes funding a children’s party? That’s…sad.”
“Sad doesn't put money in our pockets, Molly. These aren't your dime-store plastic nightmares. We’re talking limited edition, handcrafted, possibly even sentient lawn gnomes. The presents will be proportionally impressive. Lego castles, miniature train sets, remote-controlled dinosaurs. Gold, Molly, gold!”
“Okay, gold. I see. But a party? Really?”
Bill stopped mid-swan-fold. “What else am I supposed to do? Crack a bank vault dressed as a clown? This is low risk, high reward. Think about it. Parental distraction, sugar-fueled chaos, a mountain of desirable goods. We go in, grab the gifts, and blend in with the departing guests.”
“Blend in? Bill, we're wearing all black. Even for a heist, it’s a bit… on the nose.”
“Details, details. We'll look like responsible party chaperones. Very serious about attending children’s birthdays.” He resumed his origami. “The plan is simple. You create a diversion. Something believable. A lost balloon, a rogue pinata, a spontaneous flash mob.”
“A flash mob? Where am I going to find a flash mob on twenty-four hours notice?”
“Improvise! That's what you're good at. Sing a sea shanty. Juggle kittens. I don’t care. Just get their attention. Meanwhile, I will quietly liberate the presents.”
“Liberate? You mean steal.”
“Semantics! I will quietly *relocate* the presents to a more secure location. Our van.”
Molly picked at her plate. “And what about the parents? Surely they’ll notice a colossal pile of presents vanishing.”
“That's where the bags come in. Large, brightly coloured, themed gift bags. We replace the stolen gifts with…lesser items. Socks. Fruit. Educational pamphlets about the proper care of lawn gnomes.”
“You’ve thought of everything, haven’t you?” Molly said, a slow grin spreading across her face.
“I always do. Now, are you in or are you going to spend another afternoon complaining about the scarcity of decent pickpocketing opportunities?”
“Fine. I’m in. But if I have to sing a sea shanty, I’m blaming you for everything.”
created_at: "2026-02-20T20:06:50.696Z" (2/20/2026, 8:06:50 PM)
model: "google/gemma-3-27b-it"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 427
moderation_latency: (null)
generation_time: 23541
tokens_prompt: 93
tokens_completion: 761
native_tokens_prompt: 75
native_tokens_completion: 682
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00011512
router: (null)
provider_responses:
  0:
    endpoint_id: "8f22002c-c045-446f-a1b9-9896133536b8"
    id: "chatcmpl-RekrlKwOtIvuISyQpa3HPPRD"
    is_byok: false
    latency: 58
    model_permaslug: "google/gemma-3-27b-it"
    provider_name: "DeepInfra"
    status: 200
api_type: "completions"
id: "gen-1771618010-clLdEM6AbxsYY6P9yjV5"
upstream_id: "chatcmpl-RekrlKwOtIvuISyQpa3HPPRD"
total_cost: 0.00011512
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
Evaluation details
Evaluator: Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false, words: 432

Evaluator: Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 86.57%, Deviation: 36.57%
Meta Data: neededClean: false, wordsTotal: 432, wordsDialogue: 374

Overall: 0.0000%
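
The ratio and deviation reported above follow from the two word counts and the 50% target in the scenario. The sketch below reproduces that arithmetic and shows one possible way to count dialogue words. The counting rule (whitespace-separated words, dialogue being anything between double quotes) and the names `dialogue_ratio` and `ratio_from_counts` are assumptions for illustration. The benchmark's actual evaluator is not shown on this page and may count differently.

```python
import re

def ratio_from_counts(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return (ratio, deviation from target) for the given word counts."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)

def dialogue_ratio(text: str, target: float = 0.5) -> dict:
    """Hypothetical counter: treats text between curly or straight
    double quotes as dialogue and splits words on whitespace."""
    quoted = re.findall(r'[“"]([^”"]*)[”"]', text)
    words_total = len(text.split())
    words_dialogue = sum(len(q.split()) for q in quoted)
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target),
    }

# Counts taken from the evaluation details: 374 dialogue words out of 432.
ratio, deviation = ratio_from_counts(374, 432)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 86.57%, Deviation: 36.57%"
```

Applied to a short sample such as `dialogue_ratio('She nodded. “Run it by me again,” Molly said.')`, this rule counts 5 dialogue words out of 9. Whether it yields exactly 374 and 432 on the full response depends on how the real evaluator tokenizes punctuation and quotes.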