NC Bench
Run: 2026-05-10-01KR91Y8NR96BHKR762BY5BCAM
Run Details
Model: Gemma 4 31B
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 9.7s
Cost: $0.000101
Provider: DeepInfra
Evaluation summary
Score      Evaluator
95.99%     Matches word count
84.63%     Dialogue to Total Word Ratio
90.3072%   Overall
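The Overall figure is consistent with an unweighted mean of the two evaluator scores, although the page does not state the aggregation rule. A minimal sketch under that assumption follows; the function name `overall_score` is hypothetical and not part of NC Bench.

```python
from statistics import mean


def overall_score(scores):
    """Unweighted mean of evaluator scores in percent.

    Assumption: the page does not document how Overall is computed;
    a plain average is simply the rule that fits the displayed numbers.
    """
    return mean(scores)


# Scores as displayed on this page (already rounded to two decimals).
displayed = [95.99, 84.63]
print(f"{overall_score(displayed):.2f}%")
```

This gives roughly 90.31%, close to the reported 90.3072%. The small gap would be explained if the displayed evaluator scores are rounded before being shown.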
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly sprawled across the blueprint of the suburban living room, her finger tracing the perilous route past the juice box station. Bill stood beside her, nervously adjusting a disguise that consisted entirely of an oversized neon yellow tutu and a glittery tiara. He looked less like a mastermind and more like a confused disco ball. The target was the pile of wrapped boxes currently guarded by a six-year-old with a sugar rush and a plastic sword. They had calculated the window of opportunity precisely between the cake slicing and the inevitable temper tantrum. Molly checked her watch, noting the exact moment the bouncy castle would reach maximum capacity, creating a perfect sonic distraction. Bill shifted uncomfortably, the tulle scratching his thighs as he contemplated the sheer indignity of their tactical approach. He leaned over the map, pointing toward the sliding glass door. "Is the tutu necessary?" Bill whispered. Molly didn't look up from her calculations, her eyes narrowing as she plotted the escape route through the hydrangea bushes. "Blend in, Bill," she hissed. He sighed, resigning himself to the glittery fate that awaited them in the depths of a themed birthday bash.
created_at: "2026-05-10T13:41:30.943Z" (5/10/2026, 1:41:30 PM)
model: "google/gemma-4-31b-it-20260402"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1194
moderation_latency: (null)
generation_time: 9367
tokens_prompt: 77
tokens_completion: 306
native_tokens_prompt: 79
native_tokens_completion: 238
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.00010071
router: (null)
provider_responses:
  0:
    endpoint_id: "4149ab03-3b69-45a0-b94c-6c253b8641a1"
    id: "chatcmpl-RkjD6VBftOTgoIJmirq7gwsx"
    is_byok: false
    latency: 1145
    model_permaslug: "google/gemma-4-31b-it-20260402"
    provider_name: "DeepInfra"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778420491-U5aXiK09IhwXkKDZF1iV"
session_id: (null)
api_type: "completions"
id: "gen-1778420491-0f6c5dfZ0q623AV6ehRX"
upstream_id: "chatcmpl-RkjD6VBftOTgoIJmirq7gwsx"
total_cost: 0.00010071
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
response_cache_source_id: (null)
Evaluation details
Matches word count
  Result: 95.99%
  Details: n/a
  Meta Data: neededClean: false, words: 192
Dialogue to Total Word Ratio
  Result: 84.63%
  Details: Ratio: 3.61%, Deviation: 6.39%
  Meta Data: neededClean: false, wordsTotal: 194, wordsDialogue: 7
Overall
  Result: 90.3072%
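The reported ratio and deviation can be reproduced from the word counts: 7 dialogue words out of 194 total is about 3.61%, which is 6.39 points away from the 10% target. The sketch below shows one way such counts might be derived. It assumes dialogue is whatever sits between straight double quotes and that words are whitespace-separated; the evaluator's actual counting rules are not shown on this page, and `dialogue_stats` is a hypothetical name.

```python
import re


def dialogue_stats(text, target_pct):
    """Return (total_words, dialogue_words, ratio_pct, deviation_pct).

    Assumptions: dialogue is any text between straight double quotes,
    and a word is a whitespace-separated token. The benchmark's real
    counting rules may differ.
    """
    total = len(text.split())
    quoted = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(q.split()) for q in quoted)
    ratio = 100.0 * dialogue / total if total else 0.0
    return total, dialogue, ratio, abs(target_pct - ratio)


# Short illustrative sample built from the two quoted lines in the response.
sample = ('He leaned in. "Is the tutu necessary?" Bill whispered. '
          '"Blend in, Bill," she hissed.')
print(dialogue_stats(sample, 10.0))

# Check against the counts reported for this run.
ratio = 100.0 * 7 / 194
print(f"Ratio: {ratio:.2f}%, Deviation: {abs(10.0 - ratio):.2f}%")
```

The last line prints `Ratio: 3.61%, Deviation: 6.39%`, matching the Details column. How the 6.39-point deviation maps to the 84.63% score is not stated here, so the sketch stops at the deviation. Note also that the two evaluators report different totals (192 and 194 words), which suggests they count words slightly differently.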