NC Bench
Run: 2026-05-10-01KR90VKTZW4BEFBCJKKBBFW90
Run Details
Model: Gemma 4 31B (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 1m 22s
Cost: $0.000982
Provider: DeepInfra
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint of the suburban living room across the dashboard of the stolen sedan. She tapped a red circle around the gift table with a manicured nail. Listen, Bill, the window of opportunity is precisely twelve minutes. Bill scratched his chin and leaned back into the leather seat, looking skeptical. Twelve minutes? Molly, it is a six-year-old's birthday. They will be eating cake for an hour. Molly snapped her head toward him, her eyes narrowing. The cake is a distraction. Once the sugar rush hits, the parents will be too busy preventing a living room riot to notice two adults in oversized mascot suits exiting via the patio. Bill looked at the plush costume draped over the back seat. I am not wearing the bunny suit again. It chafes in places I did not know I had. Molly did not blink. You will wear the bunny suit or you will get nothing from the haul. Do you know what is in those bags? Three latest-model consoles and a gold-plated LEGO set. Bill shifted his weight, his interest piqued. Is a gold-plated LEGO set even legal? Molly rolled her eyes and sighed. Everything is legal if you are fast enough. Now, the entry point. We slide through the laundry room. The dog is a poodle; he is useless. Bill looked out the window at the bright balloons tied to the mailbox. What about the clown? He looks shifty. Molly checked her watch. The clown is a freelance hire from a strip mall. He is not security, Bill. Bill muttered to himself. I do not trust anyone in face paint. Molly pointed to the clock on the dashboard. At 2:00 PM, the piñata drops. That is our signal. Bill blinked. The piñata? I thought we were using the radio. Molly leaned in close. Radios are too obvious. The scream of twenty children fighting over saltwater taffy is the perfect acoustic cover. Bill nodded slowly, accepting his fate. Fine. But I want the bunny ears to be breathable. Molly pushed him toward the door. Just get in the suit. And for heaven's sake, do not try to eat any of the snacks. 
We are professionals. Bill paused with one foot on the pavement. But it is a themed party. I heard there are mini-sliders. Molly groaned. Focus, Bill! We enter, we bag the gifts, we vanish. No sliders, no cake, no befriending the children. Bill sighed. What if they want to play a game? Molly stared at him with cold intensity. If a child asks you to play musical chairs, you keep moving toward the gift table. Bill smiled. Got it. Operation Sugar Crash is a go.
created_at: "2026-05-10T13:22:35.497Z" (5/10/2026, 1:22:35 PM)
model: "google/gemma-4-31b-it-20260402"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 247
moderation_latency: (null)
generation_time: 81522
tokens_prompt: 77
tokens_completion: 2506
native_tokens_prompt: 82
native_tokens_completion: 2556
native_tokens_completion_images: (null)
native_tokens_reasoning: 1821
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.00098194
router: (null)
provider_responses:
  0:
    endpoint_id: "4149ab03-3b69-45a0-b94c-6c253b8641a1"
    id: "chatcmpl-Rva3eRY87Bv8DQZuYEG6nK9V"
    is_byok: false
    latency: 57
    model_permaslug: "google/gemma-4-31b-it-20260402"
    provider_name: "DeepInfra"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778419355-A1Y8jGxKzFPtoqvT6DDY"
session_id: (null)
api_type: "completions"
id: "gen-1778419355-VtHddAfRyEzrg1MObWqS"
upstream_id: "chatcmpl-Rva3eRY87Bv8DQZuYEG6nK9V"
total_cost: 0.00098194
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
response_cache_source_id: (null)
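The timing and token fields above can be related to the headline "Duration: 1m 22s" with a little arithmetic. The sketch below is illustrative only and rests on two assumptions that the page does not state: that generation_time is in milliseconds, and that native_tokens_completion includes the tokens counted under native_tokens_reasoning. The variable names are hypothetical.

```python
# Values copied from the run metadata. Units and the relationship between
# the two token counters are assumptions, not documented by the page.
generation_time_ms = 81522          # assumed to be milliseconds
native_tokens_completion = 2556     # assumed to include reasoning tokens
native_tokens_reasoning = 1821

seconds = generation_time_ms / 1000
minutes, rem = divmod(round(seconds), 60)
duration_label = f"{minutes}m {rem}s"   # "1m 22s", matching the run details

# Tokens left for the visible answer once reasoning tokens are subtracted.
visible_tokens = native_tokens_completion - native_tokens_reasoning  # 735

# Share of the completion spent on reasoning, and overall throughput.
reasoning_share = native_tokens_reasoning / native_tokens_completion
tokens_per_second = native_tokens_completion / seconds

print(duration_label)
print(visible_tokens)
print(f"{reasoning_share:.1%} of completion tokens were reasoning")
print(f"{tokens_per_second:.1f} tokens/s")
```

Under these assumptions, roughly seven in ten completion tokens went to reasoning rather than to the scene itself.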
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 446
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 50.00% | neededClean: false, wordsTotal: 452, wordsDialogue: 0
0.0000%
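The ratio and deviation figures reported by the "Dialogue to Total Word Ratio" evaluator can be reproduced with a simple quote-based counter. The sketch below is a guess at how such an evaluator might work, not NC Bench's actual implementation: it assumes dialogue is detected only as text inside straight or curly double quotes, that words are whitespace-separated, and that deviation is the absolute difference from the 50% target. The function names and the QUOTED pattern are hypothetical.

```python
import re

# Hypothetical pattern: text inside straight ("...") or curly (“...”) quotes.
QUOTED = re.compile(r'"([^"]*)"|\u201c([^\u201d]*)\u201d')


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_ratio(text: str, target: float = 0.5) -> dict:
    """Return total words, quoted words, their ratio, and deviation from target."""
    total = count_words(text)
    # findall yields (straight, curly) tuples; exactly one side is non-empty.
    dialogue = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    ratio = dialogue / total if total else 0.0
    return {
        "wordsTotal": total,
        "wordsDialogue": dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target),
    }


# Speech written without quotation marks is not counted as dialogue.
unquoted = "Listen, Bill, the window is twelve minutes. Bill scratched his chin."
print(dialogue_ratio(unquoted))

# The same kind of line with quotation marks is counted.
quoted = '"Go now," Molly said.'
print(dialogue_ratio(quoted))
```

Under these assumptions, a response whose speech carries no quotation marks yields wordsDialogue 0, a ratio of 0.00%, and a deviation of 50.00%, which is consistent with the figures above: the model response as shown appears to contain no quotation marks around its dialogue.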