NC Bench
Run: 2026-05-10-01KR8XE94AZ6PQV1M1VDR7037Y
Run Details
Model: Gemma 4 26B (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 2m 15s
Cost: $0.002194
Provider: DeepInfra
Evaluation summary
Score      Evaluator
95.99%     Matches word count
94.64%     Dialogue to Total Word Ratio
95.3120%   Overall
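The overall score appears consistent with an unweighted mean of the two evaluator scores, up to rounding of the displayed values. The page does not state how the overall score is aggregated, so the averaging rule in this sketch is an assumption, not the benchmark's documented method.

```python
# Per-evaluator scores as displayed in the summary (rounded to two decimals).
scores = {
    "Matches word count": 95.99,
    "Dialogue to Total Word Ratio": 94.64,
}

# Overall score as displayed on the page.
reported_overall = 95.3120

# Assumption: the overall score is a plain, unweighted mean of the evaluators.
mean_score = sum(scores.values()) / len(scores)

# Each displayed score may be off by up to 0.005 from its unrounded value,
# so the mean of the displayed scores may be off by up to 0.005 as well.
difference = abs(mean_score - reported_overall)

print(f"mean of displayed scores: {mean_score:.4f}%")
print(f"difference from reported overall: {difference:.4f}")
print("within rounding tolerance:", difference <= 0.005)
```

The small gap between the two figures is what one would expect if the overall score were computed from unrounded evaluator scores.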
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Listen closely, Bill, because the window of opportunity is incredibly narrow," Molly hissed. "Is it the diamond vault? The central bank?" Bill whispered back, his eyes wide with anticipation. "Worse. It is Timmy’s seventh birthday party." "The target is... stuffed dinosaurs and building blocks?" "Exactly. We hit the gift pile at two o'clock sharp. Avoid the bouncy castle; it is a high-mobility zone with zero visibility." "What about the juice box perimeter? They are incredibly sticky, Molly." "Distractions, Bill! Stay focused. We grab the wrapped goods and vanish before the cake cutting commences." "And the clown? Is he an undercover mercenary?" "He is a wildcard, Bill. If he spots us, deploy the glitter bomb immediately." "I love the glitter bomb. Can I use the confetti cannons too?" "No! This is a surgical strike, not a parade. We enter via the sandbox, bypass the nap-time security, and exit through the hedge." "Sand in my boots, loot in my bag. I am ready." "Check your tactical diaper bag. We move at noon." "Copy that, Boss. Let's get that LEGO." "And Bill?" "Yes, Molly?" "No finger painting. It leaves evidence." "Copy that. No finger painting."
created_at
"2026-05-10T12:22:52.819Z"
5/10/2026, 12:22:52 PM
model
"google/gemma-4-26b-a4b-it-20260403"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
250
moderation_latency
(null)
generation_time
135044
tokens_prompt
77
tokens_completion
5197
native_tokens_prompt
82
native_tokens_completion
6435
native_tokens_completion_images
(null)
native_tokens_reasoning
4891
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
service_tier
(null)
usage
0.00219364
router
(null)
provider_responses
0
endpoint_id
"d71c6981-347d-4622-a86b-2c9e5d0f2c6c"
id
"chatcmpl-RX3iYFESfRdbMkR6Eb6Q8kQh"
is_byok
false
latency
49
model_permaslug
"google/gemma-4-26b-a4b-it-20260403"
provider_name
"DeepInfra"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778415772-8jNrrfhNXbPlPegimCJW"
session_id
(null)
api_type
"completions"
id
"gen-1778415772-zYrF2znRsaNBSSPaE1yz"
upstream_id
"chatcmpl-RX3iYFESfRdbMkR6Eb6Q8kQh"
total_cost
0.00219364
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepInfra"
response_cache_source_id
(null)
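The metadata above can be cross-checked against the run header with a little arithmetic. This sketch assumes that generation_time is reported in milliseconds and that native_tokens_reasoning is counted as part of native_tokens_completion. Neither assumption is stated on the page.

```python
# Values copied from the metadata above.
generation_time_ms = 135044          # assumed to be milliseconds
native_tokens_completion = 6435
native_tokens_reasoning = 4891       # assumed to be a subset of completion tokens

# Convert the generation time into the "Xm Ys" form used in the run header.
total_seconds = round(generation_time_ms / 1000)
minutes, seconds = divmod(total_seconds, 60)
duration = f"{minutes}m {seconds}s"

# Split completion tokens into reasoning and non-reasoning parts.
non_reasoning_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

print("duration:", duration)
print("non-reasoning completion tokens:", non_reasoning_tokens)
print(f"reasoning share of completion tokens: {reasoning_share:.1%}")
```

Under these assumptions the computed duration matches the 2m 15s in the run header, and roughly three quarters of the completion tokens were spent on reasoning rather than on the visible scene.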
Evaluation details
Result     Evaluator                      Details                           Meta Data
95.99%     Matches word count             n/a                               neededClean: false, words: 192
94.64%     Dialogue to Total Word Ratio   Ratio: 94.85%, Deviation: 4.85%   neededClean: false, wordsTotal: 194, wordsDialogue: 184
95.3120%   Overall
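The reported ratio and deviation for the dialogue evaluator can be reproduced from the word counts in its metadata and the 90% target in the scenario. This is a minimal sketch of that arithmetic only. How the evaluator turns the deviation into the 94.64% score is not shown on the page and is not modelled here.

```python
# Target dialogue share from the scenario ("Write 200 words with 90% dialogue").
TARGET_RATIO = 0.90

# Word counts reported in the evaluator metadata.
words_total = 194
words_dialogue = 184

# Share of words that are dialogue, and its distance from the target.
ratio = words_dialogue / words_total
deviation = abs(ratio - TARGET_RATIO)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

Both figures agree with the Details column: the response overshoots the 90% dialogue target by a little under five percentage points.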