NC Bench
Run: 2026-05-10-01KR90HBGG802M86GZGVZMG249
Run Details
Model: Gemma 4 26B
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 9.1s
Cost: $0.000201
Provider: Cloudflare
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.00%
Overall: 0.0000%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint across the sticky kitchen table, tapping a finger on a drawing of a bouncy castle. We strike at fourteen-hundred hours, she said. Bill squinted at the paper. Is that a tactical diagram or a coloring book? It is the layout of the Miller residence, Molly snapped. Look, the target zone is the living room. That is where the gift pile will be staged. The gift pile? Bill asked, leaning back. You mean the loot? The presents, Bill. The high-value, wrapped, ribbon-adorned loot, she corrected. What kind of security are we looking at? Bill asked. Laser grids? Pressure plates? Motion-sensing toddlers and a hyperactive golden retriever, Molly replied. That sounds significantly more dangerous than a bank vault, Bill muttered. Do we have a diversion? I have the clown, Molly said. The clown? Bill’s eyes widened. You hired a clown? He is our inside man, she explained. He performs the slapstick routine in the backyard, drawing the parental perimeter away from the hallway. Is he professional? Bill asked suspiciously. He is unpredictable, which is a tactical advantage, Molly said. And the getaway? Bill pressed. I have a minivan with tinted windows and a large supply of juice boxes, she said. Juice boxes? Bill scoffed. Is that our extraction protocol? It is essential for maintaining low blood sugar during a high-stress retreat, Molly insisted. What if a kid sees us? Bill asked. What if a five-year-old points a sticky finger and yells, Look, robbers! We wear the disguises, Molly said, pulling two bright, oversized party hats from a bag. Bill stared at the glittery cone of felt. You want us to infiltrate a heist wearing party hats? They are tactical headgear, Molly whispered. They camouflage us within the festive environment. It’s ridiculous, Bill said, though he reached for a hat. What’s the split? Sixty-forty, Molly stated. Sixty-forty? Bill cried. For a bouncy castle job? That is highway robbery! 
I am doing the heavy lifting with the clown coordination, she argued. And I am the one who has to deal with the dog! Bill countered. The dog is a non-combatant, Molly insisted. He looks like a heavy hitter to me, Bill muttered. What is the first objective once we breach the perimeter? Secure the Lego sets and the high-end electronics, Molly commanded. Leave the stuffed animals. They are too bulky for the extraction vehicle. The stuffed animals? Bill asked. Not even a teddy bear? They have no resale value, Bill, she sighed. Fine, Bill said, pulling the glittery hat over his ears. But if I get tackled by a toddler in a dinosaur costume, I am taking a larger cut. Agreed, Molly said, checking her watch. Now, grab the juice boxes. We move at fourteen-hundred.
created_at: "2026-05-10T13:16:59.288Z" (5/10/2026, 1:16:59 PM)
model: "google/gemma-4-26b-a4b-it-20260403"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 744
moderation_latency: (null)
generation_time: 8930
tokens_prompt: 77
tokens_completion: 734
native_tokens_prompt: 79
native_tokens_completion: 643
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.0002008
router: (null)
provider_responses:
  0:
    endpoint_id: "e2a16fa6-1f65-413e-9ecb-93901089ddb1"
    id: "id-1778419019995"
    is_byok: false
    latency: 719
    model_permaslug: "google/gemma-4-26b-a4b-it-20260403"
    provider_name: "Cloudflare"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778419019-K0dlvhFk1dW7eLiLheQh"
session_id: (null)
api_type: "completions"
id: "gen-1778419019-Ydt38wBftiRqGWx9YpTC"
upstream_id: "id-1778419019995"
total_cost: 0.0002008
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Cloudflare"
response_cache_source_id: (null)
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false, words: 455
Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 0.00%, Deviation: 70.00%
Meta Data: neededClean: false, wordsTotal: 467, wordsDialogue: 0
Overall: 0.0000%
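The reported ratio of 0.00% with wordsDialogue at 0 is consistent with a check that counts only words enclosed in quotation marks, since the model response above contains no quotation marks at all. The following is a minimal sketch under that assumption, not NC Bench's actual evaluator: the function names, the regular expression, and the whitespace-based word count are all hypothetical, so it is not expected to reproduce the exact 455 and 467 word totals listed above. The deviation is taken as the absolute difference between the measured ratio and the 70% target, which matches the reported 70.00% for a ratio of 0.00%.

```python
import re

# Assumption: dialogue is whatever sits between straight or curly double quotes.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed word definition)."""
    return len(text.split())


def dialogue_ratio(text: str) -> float:
    """Return the fraction of words that fall inside quotation marks."""
    total = count_words(text)
    if total == 0:
        return 0.0
    # Each match is a (straight, curly) tuple; exactly one side is non-empty.
    dialogue = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    return dialogue / total


def deviation(ratio: float, target: float = 0.70) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# A line from the response as written (no quotation marks) and a quoted variant.
unquoted = "We strike at fourteen-hundred hours, she said."
quoted = '"We strike at fourteen-hundred hours," she said.'

print(dialogue_ratio(unquoted), deviation(dialogue_ratio(unquoted)))
print(dialogue_ratio(quoted), deviation(dialogue_ratio(quoted)))
```

Under this reading, the unquoted line contributes no dialogue words even though most of it is speech, while the quoted variant counts five of its seven words as dialogue.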