NC Bench
Run: 2026-05-10-01KR8Z2ZN9Y560BJZ5ADJ6HB1B
Run Details
Model: Gemma 4 26B
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 7.7s
Cost: $0.000108
Provider: SiliconFlow
Evaluation summary
Score      Evaluator
35.00%     Matches word count
92.17%     Dialogue to Total Word Ratio
63.5853%   Overall
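The page does not state how the Overall figure is aggregated. The displayed numbers are consistent with an unweighted mean of the two evaluator scores, which the minimal sketch below checks. The averaging rule is an assumption, not something this run report confirms.

```python
from statistics import mean

# Scores as displayed in the evaluation summary (percent).
scores = {
    "Matches word count": 35.00,
    "Dialogue to Total Word Ratio": 92.17,
}

# Assumption: Overall is the unweighted mean of the evaluator scores.
overall = mean(scores.values())
print(f"{overall:.3f}%")  # 63.585%
```

The small gap to the reported 63.5853% would be explained by the evaluator scores being rounded to two decimals for display.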
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly squinted through the high-powered binoculars, her brow furrowed at the sheer amount of primary-colored chaos unfolding in the backyard. A battalion of toddlers, fueled by pure glucose and questionable parenting, swarmed the bouncy castle like tiny, sticky locusts. Beside her, Bill adjusted his tactical turtleneck, which was unfortunately too tight and made him look like a very nervous grape. He clutched a blueprint of the suburban bungalow as if it were the Magna Carta. The target was clear: the mountain of brightly wrapped boxes sitting precariously near the snack table. This wasn't just a heist; it was an extraction of high-value plastic goods. "Target confirmed," Molly whispered, her voice taut with professional intensity. "The glitter-bomb zone is heavily guarded by a three-year-old in a dinosaur suit." "Do we have the extraction vehicle ready?" Bill asked, sweating profusely. "The minivan is idling," she replied. They checked their gear one last time, ensuring their suction cups and silent step-boots were primed for the mission. One wrong move and they’d be neutralized by a sudden tantrum or, worse, a handful of sticky finger paints.
created_at: "2026-05-10T12:51:39.825Z" (5/10/2026, 12:51:39 PM)
model: "google/gemma-4-26b-a4b-it-20260403"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 5368
moderation_latency: (null)
generation_time: 7057
tokens_prompt: 77
tokens_completion: 299
native_tokens_prompt: 79
native_tokens_completion: 246
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.00010788
router: (null)
provider_responses:
  0:
    endpoint_id: "eb382d64-dce1-42d5-b23d-6fefb91948a3"
    id: "chatcmpl-b33e57c3e799a9d8"
    is_byok: false
    latency: 5366
    model_permaslug: "google/gemma-4-26b-a4b-it-20260403"
    provider_name: "SiliconFlow"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778417499-hMocB697cRj2FfnAj36m"
session_id: (null)
api_type: "completions"
id: "gen-1778417499-ba9g1f0TVJ8hCHUQoCrJ"
upstream_id: "chatcmpl-b33e57c3e799a9d8"
total_cost: 0.00010788
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "SiliconFlow"
response_cache_source_id: (null)
Evaluation details
Matches word count
  Result: 35.00%
  Details: n/a
  Meta Data: neededClean: false; words: 182
Dialogue to Total Word Ratio
  Result: 92.17%
  Details: Ratio: 15.34%, Deviation: 5.34%
  Meta Data: neededClean: false; wordsTotal: 189; wordsDialogue: 29
Overall
  Result: 63.5853%
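The reported Ratio and Deviation can be reproduced from the wordsDialogue and wordsTotal values. The sketch below assumes the deviation is the absolute difference between the measured ratio and the 10% target named in the scenario. The function name and constant are illustrative, not taken from the benchmark's code.

```python
# Assumption: the target comes from the scenario "Write 200 words with 10% dialogue".
TARGET_PCT = 10.0


def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


# Values from the evaluator's Meta Data: wordsDialogue = 29, wordsTotal = 189.
ratio = dialogue_ratio_pct(29, 189)
# Assumption: deviation is the absolute distance from the target, in percentage points.
deviation = abs(ratio - TARGET_PCT)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# Ratio: 15.34%, Deviation: 5.34%
```

How the 5.34-point deviation maps to the 92.17% score, and how 182 counted words map to 35.00%, is not stated on this page, so neither is reproduced here. The two evaluators also report different totals (182 versus 189 words), which suggests they tokenize the response differently.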