NC Bench
Run: 2026-05-10-01KR8ZBQ81FRE6ZPD0M60QH2YW
Run Details
Model: Gemma 4 26B (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 3m 35s
Cost: $0.004768
Provider: SiliconFlow
Evaluation summary
Score     Evaluator                      Details Link
100.00%   Matches word count             (details)
99.08%    Dialogue to Total Word Ratio   (details)
99.5383%  Overall
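This page does not show how the Dialogue to Total Word Ratio evaluator counts words. As a rough sketch only, one way such a ratio could be computed is to treat every word inside straight double quotes as dialogue. The function name `dialogue_ratio` and the whitespace tokenization are assumptions made for illustration, not the benchmark's implementation.

```python
import re

def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue words, total words, ratio) for a prose scene.

    Assumption: dialogue is anything between straight double quotes,
    and a word is any whitespace-separated token.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    # Drop the quote marks so they never glue two words together.
    words_total = len(text.replace('"', " ").split())
    ratio = words_dialogue / words_total if words_total else 0.0
    return words_dialogue, words_total, ratio

# Hypothetical mini-scene, not the model response from this run.
sample = 'Molly tapped the map. "Listen, Bill." "What?"'
result = dialogue_ratio(sample)
print(result)
```

The two evaluators on this page report different totals for the same response (200 words and 203 words), so their tokenization evidently differs, and a simple counter like this one should not be expected to reproduce either figure exactly.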
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly tapped the crinkled map on the table. "Listen, Bill. The target is the Miller estate. Saturday, two p.m. High density of brightly wrapped contraband." "Is it the diamonds again, Molly? My nerves can't take another high-security vault job." "Better. It's a seventh birthday party. The loot is concentrated in the living room, tucked under the weeping willow tree." "Presents? You want us to hit a bouncy castle?" "Don't underestimate the prize, Bill. We're talking LEGO Technic sets, remote-controlled drones, and a potentially mint-condition Nintendo Switch." "What's the extraction plan? A getaway car?" "A minivan. We enter via the sliding glass door. I'll provide cover by pretending to be a disgruntled birthday clown. You slip in, bag the goods, and exit through the sandbox." "A clown? I don't do face paint, Molly. It's terrible for my complexion." "It's a disguise! Now, what about the security?" "The security?" "The toddlers. They're unpredictable. They have sticky hands and zero impulse control." "I'll bring the juice boxes as a distraction." "Perfect. Now, go get the oversized shoes and the red nose." Bill stared at her, completely skeptical. "Do we need the glitter cannon?" "Absolutely. It's the only way to ensure a clean getaway."
created_at
"2026-05-10T12:56:26.12Z"
5/10/2026, 12:56:26 PM
model
"google/gemma-4-26b-a4b-it-20260403"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
2582
moderation_latency
(null)
generation_time
214961
tokens_prompt
77
tokens_completion
9401
native_tokens_prompt
82
native_tokens_completion
11895
native_tokens_completion_images
(null)
native_tokens_reasoning
9076
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
service_tier
(null)
usage
0.00476784
router
(null)
provider_responses
0
endpoint_id
"eb382d64-dce1-42d5-b23d-6fefb91948a3"
id
"chatcmpl-908cc26a60e78da2"
is_byok
false
latency
2553
model_permaslug
"google/gemma-4-26b-a4b-it-20260403"
provider_name
"SiliconFlow"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778417786-tfGisqhxqDrv6GxGHx3u"
session_id
(null)
api_type
"completions"
id
"gen-1778417786-a4EmUJFZPC3GOzvKSGJ0"
upstream_id
"chatcmpl-908cc26a60e78da2"
total_cost
0.00476784
cache_discount
(null)
upstream_inference_cost
0
provider_name
"SiliconFlow"
response_cache_source_id
(null)
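The headline Duration and Cost in Run Details appear to be rounded forms of the raw `generation_time` and `usage` fields. The sketch below checks that reading, assuming that `generation_time` is in milliseconds, that `usage` is in US dollars, and that the page rounds to the nearest second and to six decimal places. None of these units or rounding rules are stated on the page, and the helper names are hypothetical.

```python
# Raw values copied from the Meta fields above.
generation_time = 214961   # assumed to be milliseconds
usage = 0.00476784         # assumed to be US dollars

def format_duration(ms: int) -> str:
    """Render milliseconds as 'Xm Ys', rounding to the nearest second."""
    minutes, seconds = divmod(round(ms / 1000), 60)
    return f"{minutes}m {seconds}s"

def format_cost(dollars: float) -> str:
    """Render a dollar amount with six decimal places."""
    return f"${dollars:.6f}"

duration_text = format_duration(generation_time)
cost_text = format_cost(usage)
print(duration_text, cost_text)
```

Under those assumptions the output matches the Duration and Cost shown in Run Details.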
Evaluation details
Result    Evaluator                      Details                           Meta Data
100.00%   Matches word count             n/a                               neededClean: false, words: 200
99.08%    Dialogue to Total Word Ratio   Ratio: 93.10%, Deviation: 3.10%   neededClean: false, wordsTotal: 203, wordsDialogue: 189
99.5383%  Overall
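The Ratio and Deviation figures can be reproduced from the reported word counts. The sketch below assumes the target is the 90% named in the scenario and that Deviation is the absolute difference in percentage points. The page does not give the formula that turns a 3.10% deviation into a 99.08% score, so that step is left out. Treating Overall as an unweighted mean of the two evaluator scores is also an assumption, though it is consistent with the numbers shown.

```python
# Values copied from the evaluation details above.
target_ratio = 90.0      # percent, from "Write 200 words with 90% dialogue"
words_total = 203
words_dialogue = 189
scores = [100.00, 99.08]  # per-evaluator scores as displayed (rounded)

ratio = words_dialogue / words_total * 100   # percent of words that are dialogue
deviation = abs(ratio - target_ratio)        # percentage points away from target
overall = sum(scores) / len(scores)          # assumed: unweighted mean

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
print(f"Overall (from rounded scores): {overall:.2f}%")
```

The mean of the rounded scores is 99.54%, which agrees with the displayed 99.5383% if the page averages the unrounded evaluator scores.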