NC Bench
Run: 2026-05-10-01KR8XB35X90HYYAZBDB39JVBP
Run Details
Model: Gemma 4 31B (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 1m 17s
Cost: $0.000834
Provider: DeepInfra
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly slammed a hand-drawn map onto the grease-stained table. "Behold the layout of the Miller residence," she announced. Bill squinted at the parchment. "Why is there a giant dinosaur in the kitchen?" "That is the golden retriever, Bill. Now focus." "He looks menacing," Bill whispered. "He’s a poodle-mix. He’s a marshmallow with a bark," Molly snapped. "The objective is the gift table. High-end Legos, a miniature drone, and possibly a gaming console. It’s a goldmine of plastic and electronics." "Do we get any cake?" Molly stared at him with a look of profound disappointment. "We are professional thieves, not party guests." "I’m just saying, it’s a waste of frosting to leave it behind." "We enter through the patio doors at exactly fourteen hundred hours," Molly continued, ignoring him. "This coincides with the piñata event. The noise and the subsequent sugar-induced riot will provide the perfect cover." "Is it a donkey piñata or a unicorn?" Bill asked. "I don't know and I don't care. While the children are fighting over salt-water taffy, we strike." "How do we get in without being spotted?" Molly reached under the table and produced two massive, neon-orange polyester jumpsuits. "Disguises. We go in as the hired entertainment." Bill poked a suit with a finger. "I can't fit my calves in these. They're too tight." "Your calves are not the tactical priority here, Bill. Stealth is." "Clowns aren't stealthy, Molly. They’re loud and terrifying." "Exactly. We hide in plain sight. We blend into the absurdity of a seven-year-old's birthday party. No one questions a clown carrying a large bag of toys." "What if a kid follows us?" Bill asked. "We have a diversion," Molly said, producing a handful of shiny red balloons. "We offer them a balloon, and they’ll be distracted for at least three minutes." "I want a red balloon too," Bill said. Molly sighed, rubbing her temples. "You are thirty-four years old, Bill." "I like the way they float." 
"I've also packed the tactical glitter bombs," Molly added. "Why glitter?" "Visual disorientation. It’s impossible to track a thief when your retinas are covered in sparkling pink dust." "That sounds like a nightmare to clean up." "That's the point, Bill. It's psychological warfare." "One last thing," Molly said, her voice turning serious. "The target's father is a semi-professional karate instructor. If he sees you, do not fight him." "I can take him," Bill bragged, flexing a bicep. "You can't even fit into the pants, Bill. Just run." "I still think we should check for dip," Bill muttered. "Just put on the suit," she groaned. "And for the love of God, don't try to make balloon animals. You’ll pop one and alert the perimeter." "I can do a dog," Bill insisted. "Get in the van."
created_at: "2026-05-10T12:21:08.421Z" (5/10/2026, 12:21:08 PM)
model: "google/gemma-4-31b-it-20260402"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 168
moderation_latency: (null)
generation_time: 76934
tokens_prompt: 77
tokens_completion: 2168
native_tokens_prompt: 82
native_tokens_completion: 2167
native_tokens_completion_images: (null)
native_tokens_reasoning: 1415
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.00083412
router: (null)
provider_responses:
  0:
    endpoint_id: "4149ab03-3b69-45a0-b94c-6c253b8641a1"
    id: "chatcmpl-RQ5rVu3dJ35KqR3FOMd3Yhwq"
    is_byok: false
    latency: 52
    model_permaslug: "google/gemma-4-31b-it-20260402"
    provider_name: "DeepInfra"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778415668-Bg8mWAtsCNQeycen7sva"
session_id: (null)
api_type: "completions"
id: "gen-1778415668-s8Psew0RJ5Gk6Gk9sojp"
upstream_id: "chatcmpl-RQ5rVu3dJ35KqR3FOMd3Yhwq"
total_cost: 0.00083412
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
response_cache_source_id: (null)
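The metadata fields can be cross-checked against the run details with a little arithmetic. The sketch below is illustrative only: it assumes generation_time is in milliseconds (which agrees with the 1m 17s duration shown in the run details) and that native_tokens_completion includes the reasoning tokens. Neither assumption is stated on the page, and the helper name format_duration is hypothetical.

```python
# Values copied from the metadata fields above.
generation_time_ms = 76934          # assumed to be milliseconds
native_tokens_completion = 2167     # assumed to include reasoning tokens
native_tokens_reasoning = 1415


def format_duration(ms: int) -> str:
    """Render milliseconds as 'Xm Ys', rounded to the nearest second."""
    total_seconds = round(ms / 1000)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


duration = format_duration(generation_time_ms)
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)
reasoning_share = native_tokens_reasoning / native_tokens_completion
visible_tokens = native_tokens_completion - native_tokens_reasoning

print(duration)                              # 1m 17s
print(f"{tokens_per_second:.1f} tokens/s")   # 28.2 tokens/s
print(f"{reasoning_share:.1%} reasoning")    # 65.3% reasoning
print(visible_tokens)                        # 752
```

Under those assumptions, roughly two thirds of the completion tokens went to reasoning, leaving about 752 tokens for the visible scene.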
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 458
0.00% | Dialogue to Total Word Ratio | Ratio: 79.74%, Deviation: 29.74% | neededClean: false, wordsTotal: 469, wordsDialogue: 374
0.0000%
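The reported ratio and deviation follow directly from the word counts: 374 dialogue words out of 469 total, measured against the 50% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words, by treating text between straight double quotes as dialogue. The function dialogue_ratio is hypothetical and is not the benchmark's evaluator. The two evaluators report different totals (458 and 469), so their tokenization evidently differs, and this sketch is not guaranteed to match either one. How a deviation maps to the final score is not shown on the page and is not modeled here.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for straight-quoted dialogue."""
    total_words = len(text.split())
    # Treat everything between a pair of straight double quotes as dialogue.
    spoken = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(s.split()) for s in spoken)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# Reported counts from the evaluation details, and the scenario's 50% target.
words_dialogue, words_total, target = 374, 469, 0.50
ratio = words_dialogue / words_total
deviation = abs(ratio - target)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 79.74%, Deviation: 29.74%

# Whitespace-based counting on a single line taken from the response.
sample = 'Molly sighed. "You are thirty-four years old, Bill."'
print(dialogue_ratio(sample))
```

Whitespace splitting keeps punctuation attached to words, which is a simple choice rather than a claim about how the benchmark counts.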