NC Bench
Run: 2026-05-10-01KR90QTDXDT3F5C3AR2RQZYNX
Run Details
Model: Gemma 4 26B
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 23.8s
Cost: $0.000275
Provider: Novita
Evaluation summary
Matches word count: 60.28%
Dialogue to Total Word Ratio: 0.00%
Overall: 30.1376%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
The blueprint was drawn in neon pink crayon on the back of a grocery receipt, which Bill felt was an unnecessary touch of flair, but Molly insisted it provided the necessary psychological edge. They sat hunched over a sticky folding table in the back of a parked minivan, the scent of stale french fries and desperation hanging heavy in the air. "I am telling you, Bill, the perimeter is soft," Molly whispered, tapping a jagged circle around a drawing of a bouncy castle. "The security is a six-year-old in a dinosaur costume and a distracted mother with a half-eaten cupcake." Bill adjusted his tactical turtleneck, which was slightly too tight around the midsection. "It’s a high-density zone, Molly. The chaos factor is off the charts. One spilled juice box and we’re wading through a sticky tactical nightmare. We need heavy-duty suction cups for the floorboards." "We are stealing Lego sets and remote-controlled helicopters, not a diamond from a vault," Molly snapped, sliding a plastic spoon across the receipt. "We enter through the patio sliding door, bypass the clown, and move straight for the gift pile. It’s a surgical strike." "A clown is never just a clown," Bill countered, his eyes widening with genuine paranoia. "He’s a variable. He’s unpredictable. He could deploy balloon animals as distraction devices at any moment. We need a diversion. Something loud. Something involving glitter." Molly rubbed her temples, wondering if she could have picked a partner who didn't watch too many spy thrillers. "No glitter. Glitter is a forensic nightmare. It stays in your hair for three fiscal quarters. We use the nap time window. That is our golden hour." "Nap time is risky," Bill argued, leaning in closer. "That’s when the targets are most vigilant in their dreams. They could wake up at any second, screaming for juice, blowing our cover. We need a smoke screen. Maybe a localized fog machine?" "We are not bringing a fog machine to a toddler’s birthday party, Bill! 
We will look like we’re trying to rob a haunted house, not a suburban backyard!" "Fine, no fog. But I’m bringing the extra-large mesh bags. The heavy-duty kind. For the weight distribution of the stuffed animals." Molly sighed, staring at the crayon-drawn carnage of the party layout. "Just make sure you don't get distracted by the cake. I saw the invitation. It's a three-tier chocolate extravaganza. If you start snacking, the whole mission is compromised." Bill straightened his posture, regaining his professional dignity. "I am a professional, Molly. I don't eat the evidence." "You ate a bag of gummy bears during the reconnaissance phase at the toy store," she reminded him flatly. "Those were for energy!" Bill protested. Molly stood up, folding the receipt with grim determination. "Check your gear. If we aren't out by 1400 hours, we’re going to be stuck in a sing-along of Baby Shark. And I refuse to die like that."
Meta
created_at: "2026-05-10T13:20:31.172Z" (5/10/2026, 1:20:31 PM)
model: "google/gemma-4-26b-a4b-it-20260403"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2200
moderation_latency: (null)
generation_time: 23723
tokens_prompt: 77
tokens_completion: 758
native_tokens_prompt: 79
native_tokens_completion: 663
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.00027547
router: (null)
provider_responses:
  0:
    endpoint_id: "03b96766-ef7e-454f-9519-c76b70654b49"
    id: "9e7393201e3395209c8ad357d91b49b7"
    is_byok: false
    latency: 2047
    model_permaslug: "google/gemma-4-26b-a4b-it-20260403"
    provider_name: "Novita"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778419231-KqhB98vM1lB2FyYBkS2Y"
session_id: (null)
api_type: "completions"
id: "gen-1778419231-b1wyWyvWLEBLcQaGhMGG"
upstream_id: "9e7393201e3395209c8ad357d91b49b7"
total_cost: 0.00027547
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Novita"
response_cache_source_id: (null)
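The metadata above is enough to estimate generation throughput. The sketch below is only an illustration: it assumes generation_time is reported in milliseconds (the page does not state units, though 23723 lines up with the 23.8s duration), and the helper name tokens_per_second is hypothetical.

```python
# Values copied from the run metadata above.
GENERATION_TIME_MS = 23723        # assumption: the unit is milliseconds
NATIVE_TOKENS_COMPLETION = 663    # completion tokens as counted by the provider


def tokens_per_second(tokens: int, elapsed_ms: float) -> float:
    """Return average generation throughput in tokens per second."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return tokens / (elapsed_ms / 1000.0)


throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
print(f"{throughput:.2f} tokens/s")
```

Under that unit assumption the run averaged a little under 28 native completion tokens per second.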
Evaluation details
Matches word count
  Result: 60.28%
  Details: n/a
  Meta Data: neededClean: false, words: 485
Dialogue to Total Word Ratio
  Result: 0.00%
  Details: Ratio: 67.07%, Deviation: 37.07%
  Meta Data: neededClean: false, wordsTotal: 495, wordsDialogue: 332
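The reported ratio and deviation can be reproduced from wordsDialogue and wordsTotal. The sketch below is not NC Bench's evaluator: the function names are hypothetical, and it assumes dialogue means text between straight double quotes and that words are whitespace-separated tokens. The page does not say how the evaluator tokenizes, and the two evaluators do report different totals (485 versus 495 words).

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed word definition)."""
    return len(text.split())


def dialogue_ratio(text: str) -> float:
    """Share of words that sit inside straight double quotes."""
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(count_words(span) for span in quoted_spans)
    total_words = count_words(text)
    return dialogue_words / total_words if total_words else 0.0


def deviation(ratio: float, target: float) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Reproduce the figures reported for this run: 332 dialogue words of 495 total,
# against the 30% target from the scenario.
reported_ratio = 332 / 495
reported_deviation = deviation(reported_ratio, 0.30)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {reported_deviation:.2%}")
# prints "Ratio: 67.07%, Deviation: 37.07%"

# Small usage example on a two-quote sample.
sample = 'Bill sighed. "I am a professional, Molly," he said. "I don\'t eat the evidence."'
sample_ratio = dialogue_ratio(sample)  # 10 dialogue words out of 14
```

The response overshoots the 30% target by more than a factor of two, which is consistent with the 0.00% result for this evaluator. How deviation maps to a score is not shown on the page, so the sketch stops at the deviation.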
Overall: 30.1376%
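The overall figure is consistent with an unweighted mean of the two evaluator scores, although the page does not state the aggregation rule. The sketch below treats that as an assumption and back-solves the unrounded word count score; all variable names are hypothetical.

```python
from statistics import mean

# Scores as displayed on the page, expressed as fractions.
overall_score = 0.301376          # "30.1376%"
dialogue_ratio_score = 0.0        # "0.00%"
displayed_word_count_score = 0.6028  # "60.28%", rounded for display

# Assumption: overall = mean(word_count_score, dialogue_ratio_score).
# Solving for the word count score gives its unrounded value.
implied_word_count_score = 2 * overall_score - dialogue_ratio_score

# Cross-check in the other direction using the displayed (rounded) score.
recomputed_overall = mean([displayed_word_count_score, dialogue_ratio_score])

print(f"implied word count score: {implied_word_count_score:.2%}")
print(f"overall from displayed scores: {recomputed_overall:.2%}")
```

The implied word count score rounds to the displayed 60.28%, which supports the mean interpretation without confirming it.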