NC Bench
Run: 2026-05-10-01KR8YQ3MC5DE375K12BK8WAV8
Run Details
Model: Gemma 4 31B (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 2m 18s
Cost: $0.001355
Provider: DeepInfra
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 60.09%
Overall: 30.0464%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Bill gripped the steering wheel of the rusted sedan, staring at the brightly colored balloons tied to the mailbox of 42 Maple Lane. The house was vibrating with the high-pitched screams of twenty sugar-fueled seven-year-olds. Beside him, Molly was meticulously checking her gear, which consisted mostly of an oversized sweater and a very large laundry bag. She viewed the suburban birthday bash as a high-security vault, albeit one guarded by a man in a polyester giraffe suit. "The giraffe is the primary threat," Molly whispered, clicking her stopwatch. "He has a wide field of vision and knows the layout of the living room." Bill sighed, rubbing his temples. "Molly, they are children. We just walk in, grab the Lego sets, and leave. Why are we treating this like the Louvre?" "Complacency is the killer of the professional, Bill. Now, remember the signal. If I cough twice, we pivot to the kitchen and evacuate through the sliding glass doors." Bill looked at the laundry bag and then at his own attire, which was a vaguely convincing party hat taped to his forehead. The plan was simple yet daring: blend in during the cake-cutting frenzy, create a diversion with the bubble machine, and clear the gift table in under ninety seconds. Molly had spent three days studying the floor plan, which she had painstakingly drawn based on a photo from the mother's public Instagram page. She noted the strategic placement of the beanbag chairs, the precarious pile of presents, and the dangerous proximity of the dog's water bowl to the main exit. "What about the cake?" Bill asked. "Do we take the cake?" "Focus, Bill. The cake is a distraction. We are here for the high-value plastics and the gift cards. We are not gluttons; we are specialists." "I just think a slice of chocolate fudge would be a nice bonus for the risk." "Do you have the bag?" Bill asked. "I have the bag, Bill." Molly gave him a look of utter disappointment. 
She checked her watch again, noting that the party had hit the peak chaos window. The sounds of a popping balloon echoed from inside, followed by a collective shriek of terror and excitement. This was the opening they needed. She slid the car into park and checked the tension on her bag strap. "Ready for the drop?" she asked. "I hate the suburbs and I hate these tiny hats," Bill muttered, opening the door. "Shut up, put on your game face, and look festive." They stepped out into the sunlight, two seasoned criminals preparing to plunge into a sea of confetti and juice boxes. The mission was clear, the stakes were remarkably low, and the potential for a Lego-based windfall had never been higher in the history of suburban crime.
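The dialogue share of a response like the one above can be approximated by counting the words that fall inside double quotes. The sketch below is only an illustration: the function name `dialogue_ratio` is hypothetical, and the counting rule (whitespace-split words, straight or curly double quotes) is an assumption, since the page does not show how the evaluator tokenizes text. It may therefore not reproduce the evaluator's exact counts.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a prose passage.

    Assumption: dialogue is any run of text enclosed in straight or
    curly double quotes, and a word is any whitespace-separated token.
    """
    # Capture the text between an opening and a closing double quote.
    spans = re.findall(r'["“]([^"”]*)["”]', text)
    dialogue_words = sum(len(span.split()) for span in spans)
    total_words = len(text.split())
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# Two short lines adapted from the response above.
sample = '"Ready for the drop?" she asked. "I hate the suburbs," Bill muttered.'
d, t, r = dialogue_ratio(sample)
print(d, t, round(r, 4))
```

On the sample, eight of the twelve words sit inside quotes. A production evaluator would also need to decide how to treat unbalanced quotes and quoted single words that are not speech.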
created_at: "2026-05-10T12:45:10.676Z" (5/10/2026, 12:45:10 PM)
model: "google/gemma-4-31b-it-20260402"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1091
moderation_latency: (null)
generation_time: 128045
tokens_prompt: 77
tokens_completion: 3794
native_tokens_prompt: 82
native_tokens_completion: 3537
native_tokens_completion_images: (null)
native_tokens_reasoning: 3103
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.00135472
router: (null)
provider_responses:
  0:
    endpoint_id: "69279254-6a96-4e9d-abc6-ea76e6bf8e97"
    is_byok: false
    latency: 10078
    model_permaslug: "google/gemma-4-31b-it-20260402"
    provider_name: "Novita"
    status: 504
  1:
    endpoint_id: "4149ab03-3b69-45a0-b94c-6c253b8641a1"
    id: "chatcmpl-R2VyrO30XHamO1Uge0J4M17f"
    is_byok: false
    latency: 52
    model_permaslug: "google/gemma-4-31b-it-20260402"
    provider_name: "DeepInfra"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778417110-jEqVA01HaKNqtmDMo9T4"
session_id: (null)
api_type: "completions"
id: "gen-1778417110-y4mHr3UDfPKYEMMHw4RN"
upstream_id: "chatcmpl-R2VyrO30XHamO1Uge0J4M17f"
total_cost: 0.00135472
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
response_cache_source_id: (null)
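A few derived figures can be read off the metadata above. The sketch below uses only values reported on this page. It assumes that the time fields are in milliseconds and that native_tokens_reasoning is a subset of native_tokens_completion; neither is stated here, so treat the results as rough estimates.

```python
# Values copied from the run metadata above.
generation_time_ms = 128045
native_tokens_completion = 3537
native_tokens_reasoning = 3103
failed_attempt_latency_ms = 10078  # provider_responses entry 0 (Novita, status 504)

# Assumption: reasoning tokens are counted inside the completion total,
# so the remainder approximates the tokens of the visible answer.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Assumption: generation_time is in milliseconds.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Generation time plus the failed attempt, expressed as minutes and seconds.
elapsed_s = (generation_time_ms + failed_attempt_latency_ms) // 1000
minutes, seconds = divmod(elapsed_s, 60)

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
print(f"generation + failed attempt: {minutes}m {seconds}s")
```

Under these assumptions, most of the completion budget went to reasoning, and the generation time plus the failed first attempt adds up to the 2m 18s shown under Run Details. Whether the page computes Duration that way is not stated.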
Evaluation details
Matches word count
  Result: 0.00%
  Details: n/a
  Meta Data: neededClean: false, words: 462
Dialogue to Total Word Ratio
  Result: 60.09%
  Details: Ratio: 33.19%, Deviation: 3.19%
  Meta Data: neededClean: false, wordsTotal: 470, wordsDialogue: 156
Overall: 30.0464%
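The ratio and deviation reported above follow directly from the wordsDialogue and wordsTotal counts and the 30% target in the scenario. The sketch below reproduces that arithmetic. The mapping from a 3.19% deviation to the 60.09% score is not shown on this page, so it is not modelled. The final step assumes that the overall figure is an unweighted mean of the two evaluator scores, which is an inference from the numbers and not a documented rule.

```python
# Values copied from the evaluation details above.
words_total = 470
words_dialogue = 156
target_ratio = 0.30  # "30% dialogue" from the scenario

ratio = words_dialogue / words_total
deviation = abs(ratio - target_ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints: Ratio: 33.19%, Deviation: 3.19%

# Assumption: the overall score is the plain mean of the evaluator scores.
scores = [0.0, 0.6009]  # Matches word count, Dialogue to Total Word Ratio
overall = sum(scores) / len(scores)
print(f"Overall (mean of rounded scores): {overall:.4%}")
```

The mean of the rounded scores lands within rounding error of the reported 30.0464%, which is what one would expect if the page averages the unrounded values.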