NC Bench
Run: 2026-05-10-01KR92K986ZGHEVS7ZH212VFWD
Run Details
Model: Gemma 4 31B (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 7m 54s
Cost: $0.004470
Provider: DeepInfra
Evaluation summary
Matches word count: 95.99%
Dialogue to Total Word Ratio: 100.00%
Overall: 97.9931%
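The page does not say how the Overall figure is derived. The numbers are consistent with an unweighted mean of the two evaluator scores, so the following Python sketch works backwards under that assumption. The averaging rule and the helper name `implied_score` are illustrative guesses, not NC Bench's documented method.

```python
def implied_score(overall: float, other: float) -> float:
    """Solve overall = (x + other) / 2 for x, assuming an unweighted mean."""
    return 2 * overall - other


# Values shown on this page, in percent.
overall = 97.9931
ratio_score = 100.00

# Unrounded word-count score implied by the assumed averaging rule.
word_count_score = implied_score(overall, ratio_score)
print(f"{word_count_score:.4f}")  # 95.9862, which displays as 95.99 at two decimals
```

If the assumption holds, the displayed 95.99% is a rounded form of roughly 95.9862%, which would explain why averaging the two displayed scores gives 97.995 rather than 97.9931.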
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crude blueprint of the suburban living room across the greasy dashboard of their getaway hatchback. She pointed a manicured finger at the designated Gift Zone, a strategic cluster of colorful boxes guarded only by a sleeping golden retriever and a single, exhausted parent. Bill chewed on a piece of stale licorice, his oversized tuxedo straining at the seams. He looked less like a professional infiltrator and more like a penguin suffering a mid-life crisis. The plan was simple: penetrate the perimeter during the cake-cutting chaos, swoop the loot, and exit via the sliding patio doors before the sugar crash hit. Molly checked her stopwatch, her eyes narrowing with predatory intensity. She viewed the upcoming chaos as a tactical battlefield. Poor Bill shifted nervously, accidentally knocking over a container of glitter that coated his shoes in shimmering pink dust. He looked down at his feet in horror, realizing he now sparkled like a disco ball. Molly sighed, rubbing her temples as she considered the sheer incompetence of her partner. She glared at him with cold, calculating, weary eyes. "Stop moving right now, Bill," she hissed. "But I am shimmering still," he replied. "It is a great distraction," she lied. "Do you really think so?" he asked.
created_at: "2026-05-10T13:52:59.661Z" (5/10/2026, 1:52:59 PM)
model: "google/gemma-4-31b-it-20260402"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 469
moderation_latency: (null)
generation_time: 473555
tokens_prompt: 77
tokens_completion: 9115
native_tokens_prompt: 82
native_tokens_completion: 11735
native_tokens_completion_images: (null)
native_tokens_reasoning: 8792
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.00446996
router: (null)
provider_responses:
  0:
    endpoint_id: "4149ab03-3b69-45a0-b94c-6c253b8641a1"
    id: "chatcmpl-RcUIIm8Uk7Ksc7YqMzVvNPRc"
    is_byok: false
    latency: 51
    model_permaslug: "google/gemma-4-31b-it-20260402"
    provider_name: "DeepInfra"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778421179-sX8PEPxLNv47m0iUEMuf"
session_id: (null)
api_type: "completions"
id: "gen-1778421179-Uf7u238eJPp7nhnCIm6N"
upstream_id: "chatcmpl-RcUIIm8Uk7Ksc7YqMzVvNPRc"
total_cost: 0.00446996
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
response_cache_source_id: (null)
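A few derived figures help when reading the metadata above. This Python sketch assumes that `generation_time` is in milliseconds and that `native_tokens_reasoning` is counted inside `native_tokens_completion`. Neither is stated on the page, although the first matches the 7m 54s duration in the run details. The helper `format_duration` is a hypothetical name used only for illustration.

```python
# Values copied from the run metadata.
generation_time_ms = 473_555       # assumed to be milliseconds
native_tokens_completion = 11_735
native_tokens_reasoning = 8_792    # assumed to be part of the completion count


def format_duration(ms: int) -> str:
    """Round to whole seconds and format as 'Xm Ys'."""
    minutes, seconds = divmod(round(ms / 1000), 60)
    return f"{minutes}m {seconds}s"


# Share of completion tokens spent on reasoning, and raw generation throughput.
reasoning_share = native_tokens_reasoning / native_tokens_completion
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(format_duration(generation_time_ms))   # 7m 54s
print(f"{reasoning_share:.1%}")              # 74.9%
print(f"{tokens_per_second:.1f} tokens/s")   # 24.8 tokens/s
```

Under these assumptions, about three quarters of the generated tokens went to reasoning rather than the 200-word scene itself.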
Evaluation details
Matches word count
Result: 95.99%
Details: n/a
Meta Data: neededClean: false, words: 208
Dialogue to Total Word Ratio
Result: 100.00%
Details: Ratio: 9.52%, Deviation: 0.48%
Meta Data: neededClean: false, wordsTotal: 210, wordsDialogue: 20
Overall: 97.9931%
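The ratio and deviation reported by the dialogue evaluator can be reproduced from its own word counts. The Python sketch below assumes that dialogue means text inside straight double quotes and that words are split on whitespace. The evaluator's real tokenization is not shown on this page, and the two evaluators report different totals (208 and 210), so they may count words differently. The function names and the 10% target parameter are illustrative.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a passage.

    Assumes dialogue is delimited by straight double quotes and that
    words are whitespace-separated; both are simplifications.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation_from_target(dialogue_words: int, total_words: int,
                          target: float = 0.10) -> float:
    """Absolute gap between the observed dialogue ratio and the target."""
    return abs(dialogue_words / total_words - target)


# Counts reported by the evaluator for this run.
ratio = 20 / 210
deviation = deviation_from_target(20, 210)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 9.52%, Deviation: 0.48%
```

For example, `dialogue_ratio('He ran. "Stop right now," she said.')` returns 3 dialogue words out of 7 total. How a deviation maps to the 100.00% score is not documented here, so the sketch stops at the deviation.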