NC Bench
Run: 2026-05-10-01KR929T339PAXMKAV0QXAVV9C
Run Details
Model: Gemma 4 26B
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 6.0s
Cost: $0.000188
Provider: Cloudflare
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint across the grease-stained diner table, stabbing a finger at a crudely drawn diagram of a suburban backyard. It looked less like a high-security fortress and more like a failed kindergarten art project, which was precisely the problem. We need to account for the perimeter defense, Molly whispered, her eyes narrowing. Bill squinted at the drawing, chewing thoughtfully on a cold fry. You mean the inflatable bouncy castle? Because that looks like a serious obstacle, Molly. If we step on it, we’ll bounce straight into the stratosphere. It’s not about the bounce, Bill, it’s about the noise, she hissed, leaning in closer. The moment a heavy-set thief hits a vinyl castle, the entire neighborhood knows there’s a felony in progress. We approach via the hydrangea bushes. The hydrangeas are three inches tall, Molly, Bill countered, gesturing wildly at the map. We’ll look like two giants playing hide-and-seek in a salad. We need a diversion. Something loud. Something distracting. Molly tapped her chin. A smoke bomb? Too dramatic, Bill said, shaking his head. We need something subtle. How about we release a swarm of highly agitated hamsters? Or perhaps we just tell the parents we’re the entertainers? I can do a mediocre magic trick. I can make a coin disappear. It’s a very useful skill for a thief. You are not entering a six-year-old’s birthday party in a sequined waistcoat, Bill, Molly snapped. We are professionals. We go in, we hit the gift pile, and we vanish before the cake is sliced. What about the target? Bill asked, his eyes gleaming with greed. Is it the high-value stuff? Molly sighed, glancing at the inventory list. It’s a mix. We’re looking at several Lego sets, a suspiciously expensive remote-controlled drone, and possibly a limited edition handheld gaming console. Bill leaned back, a grin spreading across his face. Now we’re talking. That’s a payday. But what’s the exit strategy? If the birthday boy sees us, he’s going to scream. 
Kids have lungs like sirens. We use the clown, Molly said firmly. The clown? Bill asked, looking skeptical. You want to use a clown as a tactical shield? He’s the perfect chaos element, Molly argued, her voice rising in pitch. No one looks at the thieves when a man in oversized shoes is tripping over a juice box. He is our smoke screen. Bill stared at her for a long moment, then slowly nodded. Fine. But if the clown turns out to be a narc, I’m blaming you. Deal, Molly said, folding the map. Now, let's go find a clown who doesn't ask too many questions.
created_at: "2026-05-10T13:47:49.228Z" (5/10/2026, 1:47:49 PM)
model: "google/gemma-4-26b-a4b-it-20260403"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 758
moderation_latency: (null)
generation_time: 5609
tokens_prompt: 77
tokens_completion: 675
native_tokens_prompt: 79
native_tokens_completion: 601
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.0001882
router: (null)
provider_responses:
  0:
    endpoint_id: "e2a16fa6-1f65-413e-9ecb-93901089ddb1"
    id: "id-1778420869937"
    is_byok: false
    latency: 757
    model_permaslug: "google/gemma-4-26b-a4b-it-20260403"
    provider_name: "Cloudflare"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778420869-RXHSTBwQkrvgEUDHG54r"
session_id: (null)
api_type: "completions"
id: "gen-1778420869-mcI4HdWyqqOMr8gc20v2"
upstream_id: "id-1778420869937"
total_cost: 0.0001882
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Cloudflare"
response_cache_source_id: (null)
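The metadata above reports native_tokens_completion (601) and generation_time (5609) side by side, so a rough generation throughput can be derived from them. The sketch below assumes generation_time is in milliseconds, which the page does not state; the function name tokens_per_second is hypothetical and not part of NC Bench or any provider API.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Completion tokens produced per second of generation time.

    Assumes the time value is in milliseconds (an assumption, not
    something the run page confirms).
    """
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from the run metadata: native_tokens_completion and generation_time.
rate = tokens_per_second(601, 5609)
print(f"{rate:.1f} tokens/s")
```

Under that unit assumption, this run works out to roughly 107 tokens per second.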
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 435
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 30.00% | neededClean: false, wordsTotal: 444, wordsDialogue: 0
0.0000% | Overall
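The dialogue evaluator reports wordsDialogue as 0 even though the scene is mostly speech. One plausible explanation is that the response contains no quotation marks, so a quote-based counter finds nothing to count. The sketch below is a minimal illustration of that idea, not NC Bench's actual evaluator: the function dialogue_stats, the whitespace tokenization, and the quote-matching rule are all assumptions. The deviation is taken as the absolute gap from the 30% target in percentage points, which is consistent with the reported "Ratio: 0.00%, Deviation: 30.00%".

```python
import re

# Assumed dialogue markers: text inside straight or curly double quotes.
DIALOGUE_RE = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text: str) -> int:
    """Whitespace-delimited word count (an assumed tokenization)."""
    return len(text.split())


def dialogue_stats(text: str, target_pct: float = 30.0) -> dict:
    """Count quoted words and compare their share against a target percentage."""
    total = count_words(text)
    # Each match is a (straight, curly) tuple; exactly one side is non-empty.
    dialogue = sum(count_words(a or b) for a, b in DIALOGUE_RE.findall(text))
    ratio = 100.0 * dialogue / total if total else 0.0
    return {
        "wordsTotal": total,
        "wordsDialogue": dialogue,
        "ratio_pct": ratio,
        "deviation_pct": abs(ratio - target_pct),
    }


# Unquoted speech, as in the model response: no dialogue words are detected.
print(dialogue_stats("We use the clown, Molly said firmly."))

# The same line with quotation marks: the quoted words are counted.
print(dialogue_stats('"We use the clown," Molly said firmly.'))
```

With this counting rule, the unquoted line yields zero dialogue words and a 30-point deviation, while the quoted version counts 4 of 7 words as dialogue. The sketch does not attempt to reproduce the reported totals of 435 and 444, since the two evaluators' tokenization rules are not shown on the page.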