NC Bench
Run: 2026-05-10-01KR93A52TPWCSRGEHRAYMQJYP
Run Details
Model: Gemma 4 31B
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 15.9s
Cost: $0.000112
Provider: DeepInfra
Evaluation summary
Score | Evaluator
0.00% | Matches word count
36.79% | Dialogue to Total Word Ratio
18.3940% | Overall
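The Overall value appears consistent with an unweighted mean of the two evaluator scores. The sketch below illustrates that reading; the function name `overall_score` is hypothetical, and the averaging rule is an assumption inferred from the numbers shown, not something the report states.

```python
def overall_score(scores):
    """Unweighted mean of per-evaluator scores, each expressed in [0, 1].

    Assumption: the report does not document how Overall is computed;
    a plain mean is simply consistent with the values displayed.
    """
    if not scores:
        raise ValueError("at least one evaluator score is required")
    return sum(scores) / len(scores)


# Inputs are the rounded scores as displayed: 0.00% and 36.79%.
mean = overall_score([0.0, 0.3679])
print(f"{mean:.4f}")
```

With the displayed (rounded) inputs the mean is about 0.18395, which sits within rounding distance of the reported 18.3940%. The small gap would be expected if the 36.79% figure is itself a rounded display value.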
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"The perimeter is secure, Bill. I've neutralized the perimeter guard." "The Golden Retriever?" "He's distracted by a slice of pepperoni pizza. Now, the target." "The pile of presents in the living room?" "Exactly. We strike during the piñata chaos." "Why the piñata?" "Because when thirty sugar-crazed seven-year-olds charge a cardboard donkey, nobody notices two adults in oversized dinosaur costumes sneaking toward the gift table." "I'm not wearing the T-Rex suit, Molly. It chafes." "You're the Stegosaurus, Bill. Just deal with it." "Is the getaway vehicle ready?" "The ice cream truck is idling in the driveway. We load the loot, hit the siren, and vanish into the suburbs." "What about the cake?" "Bill, we are professional thieves, not gluttons." "It's a triple-layer chocolate fudge cake." "Fine. We take the cake." "I knew I liked you." "Shut up and get in the costume." "Do I get a hat?" "No." "Typical." "Ready? Three, two, one... Rawr!"
created_at
"2026-05-10T14:05:36.6Z"
5/10/2026, 2:05:36 PM
model
"google/gemma-4-31b-it-20260402"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
231
moderation_latency
(null)
generation_time
8247
tokens_prompt
77
tokens_completion
274
native_tokens_prompt
79
native_tokens_completion
268
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
service_tier
(null)
usage
0.00011211
router
(null)
provider_responses
0
endpoint_id
"4149ab03-3b69-45a0-b94c-6c253b8641a1"
id
"chatcmpl-R2GYcSDQlO5MvsktDTJGwt6m"
is_byok
false
latency
68
model_permaslug
"google/gemma-4-31b-it-20260402"
provider_name
"DeepInfra"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778421936-nFFRtnezsqcjJS3N3TSI"
session_id
(null)
api_type
"completions"
id
"gen-1778421936-RfycQaBWiFsC9R4tTJBt"
upstream_id
"chatcmpl-R2GYcSDQlO5MvsktDTJGwt6m"
total_cost
0.00011211
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepInfra"
response_cache_source_id
(null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 153
36.79% | Dialogue to Total Word Ratio | Ratio: 100.00%, Deviation: 10.00% | neededClean: false, wordsTotal: 157, wordsDialogue: 157
18.3940% | Overall
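The ratio and deviation figures above can be reproduced in spirit with a short sketch. It assumes dialogue means text between straight double quotes and that words are whitespace-separated tokens; the evaluator's actual tokenization is not shown in the report (its two evaluators count 153 and 157 words for the same response), so the names `dialogue_ratio` and `deviation` and the counting rules are illustrative assumptions.

```python
import re


def dialogue_ratio(text):
    """Return (dialogue_words, total_words, ratio) for a passage.

    Assumptions: dialogue is anything between straight double quotes,
    and a word is any whitespace-separated token.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation(ratio, target):
    """Absolute distance between the measured and requested ratio."""
    return abs(ratio - target)


# Small hypothetical passage: two of its four words are dialogue.
d, t, r = dialogue_ratio('"Ready?" Bill asked. "Go."')
print(d, t, r)

# The run above: 157 of 157 words are dialogue against a 90% target.
print(f"{deviation(157 / 157, 0.90):.2%}")
```

Under these assumptions, a response made up entirely of quoted speech has a ratio of 100%, which lands 10 percentage points from the requested 90% and matches the reported "Ratio: 100.00%, Deviation: 10.00%". How that deviation is mapped to the 36.79% score is not stated in the report.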