NC Bench
Run: 2026-05-10-01KR91KSXNBYTMS0E3X5W8W7VW
Run Details
Model: Gemma 4 31B (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 16.7s
Cost: $0.000953
Provider: Venice
Evaluation summary
Score      Evaluator
81.27%     Matches word count
67.66%     Dialogue to Total Word Ratio
74.4676%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Okay Bill, the target is a seven-year-old's birthday bash," Molly whispered. "Why are we stealing from children?" Bill asked. "Low security, high volume, Bill! Look at this blueprint." "Is that a drawing in crayon?" "It's a schematic. We enter through the sliding door at 2 PM, during the piñata distraction." "What if we're spotted?" "We'll be in disguise." "What kind of disguise?" "Giant inflatable dinosaurs." "I am not wearing a dinosaur suit, Molly." "It's a T-Rex! It provides tactical coverage and psychological intimidation!" "It's a toddler's party. They won't be intimidated; they'll want to hug me." "Exactly. While they're mesmerized by your tiny arms, I'll bag the LEGO sets and the remote-control cars." "What about the cake?" "Focus! We are professionals." "I just think a slice of cake is a fair commission for wearing a lizard suit." "Fine. One slice. Now, get in the costume." "Do I get to roar?" "Yes, Bill. You can roar. Just don't let the clown see you." "Is the clown an asset?" "The clown is our primary competition." "Does he have a dinosaur suit too?" "He has a balloon animal, Bill. He's dangerous."
created_at: "2026-05-10T13:35:48.156Z" (5/10/2026, 1:35:48 PM)
model: "google/gemma-4-31b-it-20260402"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 358
moderation_latency: (null)
generation_time: 15913
tokens_prompt: 77
tokens_completion: 1754
native_tokens_prompt: 82
native_tokens_completion: 1877
native_tokens_completion_images: (null)
native_tokens_reasoning: 1455
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.00095285
router: (null)
provider_responses
0
endpoint_id: "29b94af5-d8f8-419a-9cf1-ae1f2831142b"
id: "chatcmpl-a537998211e643fa"
is_byok: false
latency: 341
model_permaslug: "google/gemma-4-31b-it-20260402"
provider_name: "Venice"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778420148-9K9bzzb9JamiPf1fP5iI"
session_id: (null)
api_type: "completions"
id: "gen-1778420148-2eIvFuq5b7j5CGwE44n8"
upstream_id: "chatcmpl-a537998211e643fa"
total_cost: 0.00095285
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Venice"
response_cache_source_id: (null)
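The token and timing fields above can be combined into a few derived figures. The following is only a rough sketch: it assumes that generation_time is reported in milliseconds and that native_tokens_completion includes the reasoning tokens, neither of which is stated on this page. The variable names are illustrative, not part of any API.

```python
# Values copied from the run metadata above.
native_tokens_completion = 1877
native_tokens_reasoning = 1455
generation_time_ms = 15913  # assumption: the unit is milliseconds

# Assumption: the completion count includes reasoning tokens, so the
# remainder approximates the tokens spent on the visible answer.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Share of the completion spent on reasoning.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Average generation throughput over the whole completion.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under those assumptions, 422 of the 1877 completion tokens belong to the visible scene and roughly three quarters of the completion went to reasoning.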
Evaluation details
Evaluator: Matches word count
Result: 81.27%
Details: n/a
Meta Data: neededClean = false, words = 188
Evaluator: Dialogue to Total Word Ratio
Result: 67.66%
Details: Ratio: 97.91%, Deviation: 7.91%
Meta Data: neededClean = false, wordsTotal = 191, wordsDialogue = 187
Overall: 74.4676%
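The ratio and deviation reported for the dialogue evaluator can be reproduced from its metadata. The sketch below assumes the deviation is the absolute difference from the 90% target named in the scenario, and that the overall score is the plain mean of the two evaluator scores. Both assumptions match the displayed numbers but are not confirmed by the page, and the function names are hypothetical. How the 67.66% score is derived from the deviation is not shown here, so the sketch does not attempt it.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return the share of dialogue words as a percentage."""
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float = 90.0) -> float:
    """Assumed definition: absolute distance from the target percentage."""
    return abs(ratio_pct - target_pct)


# wordsDialogue and wordsTotal from the evaluator metadata.
ratio = dialogue_ratio(187, 191)
deviation = deviation_from_target(ratio)

# Assumption: the overall score is the unweighted mean of the two evaluator
# scores. The displayed scores are rounded, so the mean only approximates
# the reported 74.4676%.
overall = (81.27 + 67.66) / 2

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
print(f"Overall (approx.): {overall:.2f}%")
```

The response overshoots the target: almost all of its words are dialogue, where the prompt asked for 90%.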