NC Bench
Run: 2026-05-10-01KR935069RGSR6TP3M84864NY
Run Details
Model: Gemma 4 31B (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 5m 1s
Cost: $0.002056
Provider: DeepInfra
Evaluation summary
Score     Evaluator                     Details Link
81.27%    Matches word count            (details)
97.53%    Dialogue to Total Word Ratio  (details)
89.4020%  Overall
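The overall score appears to be the unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so this is an assumption checked only against the rounded figures shown in the summary. A minimal sketch:

```python
from statistics import mean

# Rounded evaluator scores as displayed in the summary (percent).
evaluator_scores = {
    "Matches word count": 81.27,
    "Dialogue to Total Word Ratio": 97.53,
}

# Assumption: the overall score is the plain average of the evaluator scores.
overall = mean(list(evaluator_scores.values()))
print(f"{overall:.2f}%")
```

The result agrees with the reported 89.4020% to two decimal places. The extra digits in the reported value presumably come from unrounded evaluator scores that the page does not show.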
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint of the suburban split-level across the greasy table. She pointed a laser pointer at the living room, which she had labeled The Kill Zone in aggressive red ink. Bill was eating a bag of cheese puffs, leaving bright orange fingerprints all over the master plan. He looked less like a professional thief and more like a man who had lost a fight with a snack vending machine. "The perimeter is guarded by a golden retriever with a taste for ankles," Molly whispered. Bill swallowed hard, glancing at the diagram of the gift table. He didn't care about the dog; he cared about the logistics of the dessert table. "What about the clown, Molly? You said there was a clown." Molly sighed, her patience wearing thin. She had spent three days observing the target house from a nearby hedge, wearing a ghillie suit that made her look like a very anxious shrub. She knew the clown's schedule perfectly. He arrived at two, performed balloon animals for thirty minutes, and then retreated to the garage to smoke in peace. "The clown is a distraction, Bill. Focus. We enter through the laundry room, neutralize the dog with a piece of premium ham, and secure the perimeter." Bill frowned, leaning in. He pointed to a small cluster of circles on the map, representing the guest list. "And the toddlers? They're like piranhas in party hats." "Do we have a plan for the parents?" Molly looked at him with genuine pity. "They will be in the kitchen discussing the housing market. They are functionally blind." Molly's eyes glinted with tactical fervor. She reached into her leather bag and produced a high-powered, industrial-grade bubble machine. "Deployment of the Bubble Screen at 14:15. While they are mesmerized by the iridescent spheres, we strike the gift pile." Bill looked skeptical. He had been in the game for ten years, but this was the first time he had been asked to utilize soap-based weaponry. 
He wondered if the presents were actually worth the risk of being tackled by a sugar-crazed six-year-old. "Is it really worth it? It's just LEGOs and Barbie dolls." Molly slammed her hand on the table, sending a cloud of cheese puff dust into the air. She leaned in close, her voice a low, dangerous hiss. "It is about the principle, Bill. And the resale value of a limited edition Star Wars set is astronomical." Bill sighed and nodded. "I just don't like the bubble machine. It feels unprofessional." He adjusted his tactical vest, which was slightly too small and pinched his midsection, and reached for another orange puff. "Fine. But I'm taking a slice of cake before we leave." Molly rolled her eyes and began marking the extraction route with a neon green marker. They would exit through the sliding glass door, avoiding the patio furniture, and vanish into the hedges before the first temper tantrum reached its peak.
created_at: "2026-05-10T14:02:40.211Z" (5/10/2026, 2:02:40 PM)
model: "google/gemma-4-31b-it-20260402"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 189
moderation_latency: (null)
generation_time: 295419
tokens_prompt: 77
tokens_completion: 5743
native_tokens_prompt: 82
native_tokens_completion: 5382
native_tokens_completion_images: (null)
native_tokens_reasoning: 5000
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.00205582
router: (null)
provider_responses:
  0:
    endpoint_id: "576fca5b-f09c-4bda-b560-9989c39ac3ab"
    is_byok: false
    latency: 5168
    model_permaslug: "google/gemma-4-31b-it-20260402"
    provider_name: "Together"
    status: 429
  1:
    endpoint_id: "29b94af5-d8f8-419a-9cf1-ae1f2831142b"
    is_byok: false
    latency: 108
    model_permaslug: "google/gemma-4-31b-it-20260402"
    provider_name: "Venice"
    status: 429
  2:
    endpoint_id: "69279254-6a96-4e9d-abc6-ea76e6bf8e97"
    is_byok: false
    latency: 55
    model_permaslug: "google/gemma-4-31b-it-20260402"
    provider_name: "Novita"
    status: 429
  3:
    endpoint_id: "4149ab03-3b69-45a0-b94c-6c253b8641a1"
    id: "chatcmpl-R7X6GOehzg2m5njpa9SQmAMz"
    is_byok: false
    latency: 59
    model_permaslug: "google/gemma-4-31b-it-20260402"
    provider_name: "DeepInfra"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778421760-lxbPJg1kiEL4yQwHTZgO"
session_id: (null)
api_type: "completions"
id: "gen-1778421760-ZHKAdpcboUsYjBrkFQyK"
upstream_id: "chatcmpl-R7X6GOehzg2m5njpa9SQmAMz"
total_cost: 0.00205582
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
response_cache_source_id: (null)
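The metadata above shows three providers answering with status 429 before DeepInfra served the request with status 200. As a rough reading aid, the sketch below summarizes that fallback chain and estimates throughput. It assumes generation_time is in milliseconds, which the page does not state but which is consistent with the 5m 1s run duration. The variable names are illustrative and are not part of any API.

```python
# Values copied from the run metadata.
generation_time_ms = 295419        # assumption: unit is milliseconds
native_tokens_completion = 5382

# (provider_name, status) pairs in the order listed under provider_responses.
provider_responses = [
    ("Together", 429),
    ("Venice", 429),
    ("Novita", 429),
    ("DeepInfra", 200),
]

# Providers that rejected the request with HTTP 429 (Too Many Requests).
rate_limited = [name for name, status in provider_responses if status == 429]

# First provider that returned a successful response.
served_by = next(name for name, status in provider_responses if status == 200)

# Completion tokens per second over the generation window.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"rate limited: {', '.join(rate_limited)}")
print(f"served by: {served_by}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under that unit assumption the throughput comes to roughly 18 tokens per second, and most of the completion budget is reported as reasoning tokens (native_tokens_reasoning: 5000).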
Evaluation details
Result    Evaluator                     Details                          Meta Data
81.27%    Matches word count            n/a                              neededClean: false, words: 488
97.53%    Dialogue to Total Word Ratio  Ratio: 30.71%, Deviation: 0.71%  neededClean: false, wordsTotal: 495, wordsDialogue: 152
89.4020%  Overall
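The reported ratio and deviation follow from the word counts: 152 dialogue words out of 495 total is 30.71%, which is 0.71 percentage points from the 30% target. The sketch below shows one way such a ratio could be measured. It is not the benchmark's evaluator: the function name is hypothetical, and it assumes dialogue is everything between straight double quotes and that words are whitespace-separated. The page does not document how the evaluator converts the deviation into the 97.53% score, so that step is left out.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for straight-quoted dialogue.

    Hypothetical helper: counts whitespace-separated words inside "..." spans.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# Reproduce the reported figures from the counts in the Meta Data column.
reported_ratio = 152 / 495
reported_deviation = abs(reported_ratio - 0.30)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {reported_deviation:.2%}")

# Small usage example on a made-up sentence.
sample = 'Bill sighed. "I just do not like it," he said.'
print(dialogue_ratio(sample))
```

Note that the two evaluators report different totals for the same response (488 words versus 495), so they evidently tokenize differently. A simple whitespace split like the one above may not match either count exactly.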