NC Bench
Run: 2026-05-10-01KR8W5XR2NG2RVM2C202T5MZ0
Run Details
Model:
Gemma 4 31B (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
2m 3s
Cost:
$0.000894
Provider:
DeepInfra
Evaluation summary
Score     Evaluator
0.21%     Matches word count
0.00%     Dialogue to Total Word Ratio
0.1070%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread a hand-drawn map of the living room across the greasy diner table. Are we really doing this, Molly? Bill asked, staring at the map with profound skepticism. It is a goldmine, Bill. Look at the guest list. Six high-income families. The sheer volume of LEGO sets alone is staggering, Molly replied. It is a seven-year-old's party. We are robbing children, Bill said. We are redistributing wealth from people who cannot even tie their own shoes, Molly countered. What about the security? Bill asked. Two golden retrievers and a distracted mother with a tray of mini-quiches. It is practically an open door, Molly said. And the disguises? Please tell me I am not wearing a tutu, Bill groaned. You are the Magic Balloon Man. I am the Party Coordinator. We enter through the side gate at 1400 hours, precisely when the cake is served, Molly explained. The cake is the distraction? Bill asked. Exactly. While the children are in a sugar-induced stupor, we sweep the gift table. We move fast, we move quiet, and we avoid the juice boxes, Molly instructed. Why avoid the juice boxes? Bill asked. Because they are sticky, Bill. We cannot have sticky fingerprints at the crime scene, Molly snapped. I do not know. The Magic Balloon Man seems high-risk. What if a kid wants a dog? I cannot make a dog, Bill said. Just make a sausage. They are basically the same shape. Focus, Bill! We are talking about the limited edition Star Wars sets here, Molly insisted. What about the professional clowns? Bill asked. There are no hired clowns, Bill. That is why we are sneaking in. We are providing the entertainment for free, in exchange for the merchandise, Molly said. Do you think the parents will notice? Bill asked. By the time they realize the presents are gone, we will be three zip codes away in the getaway wagon, Molly said. A wagon? You want me to escape in a Radio Flyer? Bill asked. It is low profile, Bill. 
No one suspects a man in a sequined vest pushing a red wagon full of toys, Molly said. I feel like everyone will suspect the man in the sequined vest, Bill sighed. Stop whining and practice your Tada face, Molly commanded. Tada, Bill muttered without any enthusiasm. More pizzazz, Bill! Give me the energy of a man who is about to steal a scaled-down Millennium Falcon, Molly shouted. I just want to go home, Bill whispered. You can go home once we have the loot. This is a tactical acquisition, Bill. Now, hold this balloon and try not to pop it, Molly said. This is the worst heist in history, Bill said. It is the most festive heist in history, Molly corrected. Now, let us go over the perimeter of the sandbox one more time.
created_at
"2026-05-10T12:00:50.441Z"
model
"google/gemma-4-31b-it-20260402"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1330
moderation_latency
(null)
generation_time
123286
tokens_prompt
77
tokens_completion
2379
native_tokens_prompt
82
native_tokens_completion
2324
native_tokens_completion_images
(null)
native_tokens_reasoning
1669
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
service_tier
(null)
usage
0.00089378
router
(null)
provider_responses
0
endpoint_id
"4149ab03-3b69-45a0-b94c-6c253b8641a1"
id
"chatcmpl-RbcadojegdnGhZGI3rUM2NY3"
is_byok
false
latency
63
model_permaslug
"google/gemma-4-31b-it-20260402"
provider_name
"DeepInfra"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778414450-MryN9qLuD4KstBm5Eire"
session_id
(null)
api_type
"completions"
id
"gen-1778414450-qBubtIlxbwnqDq1UPFkw"
upstream_id
"chatcmpl-RbcadojegdnGhZGI3rUM2NY3"
total_cost
0.00089378
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepInfra"
response_cache_source_id
(null)
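The timing and token fields above can be combined into a couple of derived figures. The following is a minimal sketch using only the values listed in this run's metadata. It assumes that generation_time is in milliseconds (consistent with the 2m 3s duration) and that native_tokens_reasoning is counted as part of native_tokens_completion. The page does not state either of these.

```python
# Values copied from the run metadata above.
generation_time_ms = 123286
native_tokens_completion = 2324
native_tokens_reasoning = 1669

# Assumption: generation_time is reported in milliseconds.
seconds = generation_time_ms / 1000

# Completion throughput over the whole generation window.
tokens_per_second = native_tokens_completion / seconds

# Assumption: reasoning tokens are included in the completion total,
# so the remainder is the visible answer.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

print(f"{tokens_per_second:.2f} tokens/s")
print(f"{visible_tokens} visible tokens")
print(f"{reasoning_share:.1%} of completion tokens spent on reasoning")
```

Under these assumptions, most of the completion budget went to reasoning rather than to the scene itself.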
Evaluation details
Result    Evaluator                      Details                           Meta Data
0.21%     Matches word count             n/a                               neededClean: false, words: 472
0.00%     Dialogue to Total Word Ratio   Ratio: 0.00%, Deviation: 70.00%   neededClean: false, wordsTotal: 480, wordsDialogue: 0
0.1070%   Overall
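The wordsDialogue value of 0 is consistent with an evaluator that counts only words inside quotation marks, since the model response contains none. The sketch below illustrates that idea. It is not the benchmark's actual evaluator, which is not shown on this page. The function names and the quote-matching rule are hypothetical.

```python
import re

# Assumption: "dialogue" means text between double quotation marks,
# straight or curly. The real evaluator's rule is not documented here.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_ratio(text: str) -> float:
    """Return the share of words that sit inside quotation marks."""
    total = count_words(text)
    if total == 0:
        return 0.0
    spoken = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    return spoken / total


# The same sentence with and without quotation marks.
unquoted = "Are we really doing this, Molly? Bill asked."
quoted = '"Are we really doing this, Molly?" Bill asked.'

print(dialogue_ratio(unquoted))  # 0.0
print(dialogue_ratio(quoted))    # 0.75
```

Under this rule, a response written without quotation marks scores 0% dialogue no matter how much of it is speech. That leaves the full 70.00% deviation from the target reported above.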