NC Bench
Run: 2026-05-10-01KR92X0TH8MX91YWWQJ3VPTAS
Run Details
Model: Gemma 4 31B (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 2m 9s
Cost: $0.001094
Provider: DeepInfra
Evaluation summary
Score    Evaluator
0.00%    Matches word count
0.00%    Dialogue to Total Word Ratio
0.0000%  Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly spread the crude map across the dashboard of the beat-up sedan. "Okay, Bill, focus. The target is the living room. Seven-year-olds are unpredictable, but we can handle them." "Molly, we are literally robbing a second grader. Where is the line?" "The line is wherever the limited edition Star Wars LEGOs are, Bill. Now, look at the floor plan." "But there is a clown. A professional one." "Exactly. The clown is our greatest asset. While he is twisting a balloon dog, we strike. He is the perfect distraction." "What about the parents?" "They will be in the kitchen arguing over the gluten-free cupcakes and the lack of organic juice. We have a four-minute window." "I do not like the window. It is too tight." "Tight is efficient. Now, did you bring the diversion?" "The bubble machine?" "Yes, the bubble machine. Once we hit the switch on the patio, the kids will swarm the yard. The presents will be left unguarded in the den." "I feel like a monster, Molly. A genuine monster." "You are a professional, Bill. Professionals do not have feelings about plastic bricks." "But they are building a Death Star! A scale model!" "Which is exactly why we are taking it. The resale value on the black market for collectors is astronomical. Now, put on the costume." "I am not wearing the giant panda suit." "The panda suit is the key to our anonymity." "I am six-foot-four. I am not a subtle panda." "You are a festive panda. Now get in there before the cake is served. Once the sugar hits their bloodstreams, it is anarchy." "What about the dog? The flyer said there is a Golden Retriever." "A Golden Retriever is not a threat, Bill. It is a fluffy distraction. You just give it a treat, and it will follow you anywhere." "I do not have any treats." "Check your left pocket. I put a bag of organic liver snaps in there. Now, what if someone recognizes you?" "That is exactly my point! I am a known figure in the neighborhood." 
"Who recognizes a six-foot-four man in a panda suit at a seven-year-old's party? Nobody. You are a ghost in faux-fur." "I hate this job. I really do." "You love the thrill, Bill. You love the risk. Now, go. For the LEGOs!" Bill sighed and pulled the heavy polyester head over his face. "I cannot see anything," he mumbled from inside the mask. "That is the price of greatness," Molly replied, checking her watch. "Now move. The bubble machine is primed." "If I get stuck in a balloon arch, I am quitting." "Just get the boxes, Bill!"
created_at
"2026-05-10T13:58:18.713Z"
5/10/2026, 1:58:18 PM
model
"google/gemma-4-31b-it-20260402"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1166
moderation_latency
(null)
generation_time
129158
tokens_prompt
77
tokens_completion
2875
native_tokens_prompt
82
native_tokens_completion
2852
native_tokens_completion_images
(null)
native_tokens_reasoning
2204
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
service_tier
(null)
usage
0.00109442
router
(null)
provider_responses
0
endpoint_id
"4149ab03-3b69-45a0-b94c-6c253b8641a1"
id
"chatcmpl-Rjl2S9s2rqkKKAHtJ9Khft0y"
is_byok
false
latency
45
model_permaslug
"google/gemma-4-31b-it-20260402"
provider_name
"DeepInfra"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778421498-CVRB4SO4sWT9E02OdHRY"
session_id
(null)
api_type
"completions"
id
"gen-1778421498-CMwfHRpoclEcPvOQ0giA"
upstream_id
"chatcmpl-Rjl2S9s2rqkKKAHtJ9Khft0y"
total_cost
0.00109442
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepInfra"
response_cache_source_id
(null)
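The timing and token fields above can be cross-checked against the run details. The sketch below assumes that generation_time is in milliseconds (129158 ms is consistent with the listed duration of 2m 9s) and that native_tokens_reasoning is counted inside native_tokens_completion. Neither assumption is documented on this page, and the function name run_stats is illustrative only.

```python
def run_stats(generation_time_ms, completion_tokens, reasoning_tokens):
    """Derive duration, throughput, and reasoning share from the meta fields.

    Assumes generation_time is milliseconds and that reasoning tokens
    are a subset of completion tokens (both are assumptions).
    """
    seconds = generation_time_ms / 1000
    minutes, rem = divmod(int(seconds), 60)
    return {
        "duration": f"{minutes}m {rem}s",
        "tokens_per_second": completion_tokens / seconds,
        "reasoning_share": reasoning_tokens / completion_tokens,
    }


# Values copied from generation_time, native_tokens_completion,
# and native_tokens_reasoning in the meta block.
stats = run_stats(129158, 2852, 2204)
print(stats)
```

Under these assumptions the run produced roughly 22 tokens per second, and about 77% of the completion tokens went to reasoning rather than to the visible scene.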
Evaluation details
Result   Evaluator                     Details                           Meta Data
0.00%    Matches word count            n/a                               neededClean: false; words: 440
0.00%    Dialogue to Total Word Ratio  Ratio: 92.26%, Deviation: 22.26%  neededClean: false; wordsTotal: 452; wordsDialogue: 417
0.0000%  Overall
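The reported ratio and deviation follow from the word counts: 417 dialogue words out of 452 total, measured against the 70% target in the prompt. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. It assumes that dialogue is whatever sits between straight double quotes and that words are whitespace-separated; the evaluator's actual tokenization and scoring tolerance are not shown on this page (the two evaluators report 440 and 452 total words, so their counting rules evidently differ). The function names are hypothetical.

```python
import re

TARGET_RATIO = 0.70  # from the prompt: "70% words of dialogue"


def count_dialogue(text):
    """Return (dialogue_words, total_words) for a scene.

    Assumption: dialogue is any text between straight double quotes,
    and words are split on whitespace. The real evaluator may differ.
    """
    total = len(text.split())
    spoken = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(s.split()) for s in spoken)
    return dialogue, total


def ratio_and_deviation(dialogue, total, target=TARGET_RATIO):
    """Return (ratio %, absolute deviation from target in % points)."""
    ratio = dialogue / total
    return round(ratio * 100, 2), round(abs(ratio - target) * 100, 2)


# Reproduce the reported figures from wordsDialogue and wordsTotal.
print(ratio_and_deviation(417, 452))  # (92.26, 22.26)

# Count words in a short excerpt from the model response.
sample = 'Bill sighed. "I cannot see anything," he mumbled.'
print(count_dialogue(sample))  # (4, 8)
```

The response overshoots the dialogue target by about 22 percentage points and, at 440 words by the first evaluator's count, falls short of the requested 500; both evaluators score it 0.00%.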