NC Bench
Run: 2026-05-10-01KR8ZYF8ABRDBGRZ14S3T48H0
Run Details
Model:
Gemma 4 31B (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
1m 23s
Cost:
$0.000964
Provider:
DeepInfra
Evaluation summary
Score | Evaluator
0.03% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0152% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly tapped the blueprint of the suburban living room with a manicured nail. "Okay, Bill, focus. The target is the mahogany sideboard in the sunroom." "Why are we stealing from six-year-olds, Molly? It feels wrong. My soul is itching," Bill said, leaning back in the van. "It is not stealing, Bill, it is redistributing joy. Besides, have you seen the gift lists for this zip code? There are miniature electric cars and high-end chemistry sets in there." "I just do not think I can pull off the costume. It is too much fabric." "It is a clown suit, Bill. It is the ultimate camouflage. Nobody looks at a clown's face; they only look at the balloon animals." "I cannot do balloon animals. I tried once and popped a balloon in my own ear." "Then just make circles. Call them abstract donuts or conceptual bubbles. Now, timing. The cake is served at 3:00 PM. That is our window." "The cake window. Got it." "Exactly. While the children are in a sugar-induced frenzy, we strike. I will be the Entertainment Coordinator distracting the parents with a choreographed dance, and you will be the Cleanup Crew clearing the table." "Wait, if I am the cleanup crew, why am I wearing a neon blue wig? Does that not scream I am stealing the presents?" "For the aesthetic, Bill! We need to blend in with the chaos. Now, the getaway. Is the ice cream truck ready?" "The engine makes a clicking sound. Also, it smells like old vanilla and desperation." "Perfect. Nothing is less suspicious than an ice cream truck idling in a driveway during a birthday party. We load the boxes, we hit the siren, and we vanish into the suburbs." "What if a kid wants a popsicle? I cannot just drive away from a hungry child, Molly. I have standards." "We are professional criminals, Bill, not confectioners! We do not stop for popsicles!" "I am just saying, a little bribery goes a long way with a witness who cannot even tie their own shoes." "Fine. Bring a bag of fudge pops. But no distractions. 
We enter at 3:05, clear the perimeter by 3:12, and we are out by 3:15. And remember, the dog is a Golden Retriever. Just throw a tennis ball in the opposite direction." "What if the dog likes me more than the ball?" "Then you use the dog as a shield! Now, do I look like a professional coordinator or a desperate party planner?" "You look like you are about to commit a felony." "Perfect. Now, start practicing your sad clown face." "I do not have a sad face," Bill groaned. "I have a confused face." "Same thing in this neighborhood," Molly sighed. "Now, put on the oversized shoes before I decide to do this with a more competent clown."
created_at
"2026-05-10T13:06:40.529Z"
5/10/2026, 1:06:40 PM
model
"google/gemma-4-31b-it-20260402"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
153
moderation_latency
(null)
generation_time
73080
tokens_prompt
77
tokens_completion
2416
native_tokens_prompt
82
native_tokens_completion
2509
native_tokens_completion_images
(null)
native_tokens_reasoning
1726
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
service_tier
(null)
usage
0.00096408
router
(null)
provider_responses
0
endpoint_id
"9d4644e9-a5e5-41f4-b810-d75f484d2c24"
is_byok
false
latency
19
model_permaslug
"google/gemma-4-31b-it-20260402"
provider_name
"Parasail"
status
429
1
endpoint_id
"576fca5b-f09c-4bda-b560-9989c39ac3ab"
is_byok
false
latency
10077
model_permaslug
"google/gemma-4-31b-it-20260402"
provider_name
"Together"
status
504
2
endpoint_id
"4149ab03-3b69-45a0-b94c-6c253b8641a1"
id
"chatcmpl-R7TfVqA0LBnhxHKEuwIxDfJm"
is_byok
false
latency
53
model_permaslug
"google/gemma-4-31b-it-20260402"
provider_name
"DeepInfra"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778418400-XStbfrJkmSWkinBp80Nx"
session_id
(null)
api_type
"completions"
id
"gen-1778418400-o23s1JvL6Uvo117UVCCa"
upstream_id
"chatcmpl-R7TfVqA0LBnhxHKEuwIxDfJm"
total_cost
0.00096408
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepInfra"
response_cache_source_id
(null)
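The metadata above records three provider attempts (statuses 429, 504, and 200) and separate counts for completion and reasoning tokens. A minimal sketch of how those fields might be read follows. It rests on two assumptions that the page does not state: that the `latency` values are in milliseconds, and that `native_tokens_reasoning` is included in `native_tokens_completion`. The `ProviderAttempt` class and `first_success` helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProviderAttempt:
    """One entry of provider_responses (hypothetical container type)."""
    provider_name: str
    status: int
    latency: int  # assumed to be milliseconds


# Values copied from the provider_responses entries above.
attempts = [
    ProviderAttempt("Parasail", 429, 19),
    ProviderAttempt("Together", 504, 10077),
    ProviderAttempt("DeepInfra", 200, 53),
]


def first_success(items: list[ProviderAttempt]) -> Optional[ProviderAttempt]:
    """Return the first attempt with a 2xx status, or None if all failed."""
    for attempt in items:
        if 200 <= attempt.status < 300:
            return attempt
    return None


served = first_success(attempts)

# Latency accumulated on attempts that did not serve the request.
failed_latency = sum(a.latency for a in attempts if a is not served)

# Assumption: reasoning tokens are a subset of completion tokens.
native_tokens_completion = 2509
native_tokens_reasoning = 1726
visible_tokens = native_tokens_completion - native_tokens_reasoning

print(served.provider_name if served else None)
print(failed_latency)
print(visible_tokens)
```

Under these assumptions the served provider matches the `provider_name` reported for the run, and roughly two thirds of the completion tokens were spent on reasoning rather than on the visible scene.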
Evaluation details
Result | Evaluator | Details | Meta Data
0.03% | Matches word count | n/a | neededClean: false, words: 470
0.00% | Dialogue to Total Word Ratio | Ratio: 94.94%, Deviation: 24.94% | neededClean: false, wordsTotal: 474, wordsDialogue: 450
0.0152% | Overall | |
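The reported ratio and deviation can be reproduced from the `wordsTotal` and `wordsDialogue` values, assuming the deviation is the absolute difference from the 70% target named in the scenario. This is a sketch of the arithmetic only: the function names are hypothetical, and how the evaluator maps the deviation to its 0.00% score is not shown on the page, so it is not modelled here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the whole text, as a percentage."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float = 70.0) -> float:
    """Absolute distance from the target ratio, in percentage points (assumed definition)."""
    return abs(ratio_pct - target_pct)


# Values taken from the evaluation details above.
ratio = dialogue_ratio(words_dialogue=450, words_total=474)
deviation = deviation_from_target(ratio)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 94.94%, Deviation: 24.94%"
```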