NC Bench
Run: 2026-05-10-01KR90S2BWC0M4W63CMGBEYH1R
Run Details
Model: Gemma 4 31B (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 30.2s
Cost: $0.001107
Provider: Together
Evaluation summary
Score      Evaluator
0.00%      Matches word count
0.00%      Dialogue to Total Word Ratio
0.0000%    Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread a piece of paper across the grease-stained table. Look at the blueprint, Bill. It's a crayon drawing on a napkin, Molly. It's a tactical sketch. Now, the entry point is the patio. We slip in during the Pin the Tail on the Donkey phase. Total chaos. Zero visibility. Why do we have to be so stealthy? It's a bunch of first-graders. Amateurs. First-graders are the perfect informants. One missing toy and they'll scream loud enough to alert the National Guard. We need surgical precision. I'm still not wearing the tutu. You're the Fairy Godmother, Bill. It's a masterclass in psychological camouflage. I'm six-foot-four and I have a beard that could house a family of squirrels. That’s the brilliance of it! You’re a subversive, avant-garde fairy. You'll blend right in with the modern aesthetic of suburban parenting. I'll blend in like a rhinoceros in a porcelain shop. Stop whining and look at the target. The Gift Table. It's the epicenter. We've got a high probability of LEGO Star Wars sets and at least two tablets. Maybe a luxury dollhouse. What's the extraction plan? The getaway tricycle is parked by the hydrangea bush. We load the loot, pedal like hell, and meet the van at the end of the cul-de-sac. A tricycle? I'm not fitting on a tricycle. Then you'll run! Just keep the tutu fluttering in the wind to distract the pursuers. Pursuers? You mean the parents? Exactly. While they're arguing over who forgot the napkins, we'll be halfway to the fence. What if there's cake? Bill, focus. I'm just saying, if it's a three-tier red velvet, the risk-to-reward ratio changes. I might be tempted to stay for a slice. The reward is five thousand dollars in resale value, you glutton! Fine. But I want a piece of cake. Put it in the contract. There is no contract! This is a heist, not a corporate merger. I just feel like my professional boundaries aren't being respected. Your boundary is the edge of that tutu. Now, practice your sparkle-dust throw. I can't. 
Glitter gets in my eyes. That's why we're using industrial-grade shimmer. It's more intimidating. Intimidating? We're robbing six-year-olds! Have you ever seen a toddler on a sugar rush, Bill? It's like dealing with a pack of hyenas in pajamas. Now, put on the wings.
created_at: "2026-05-10T13:21:12.068Z" (5/10/2026, 1:21:12 PM)
model: "google/gemma-4-31b-it-20260402"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 332
moderation_latency: (null)
generation_time: 29987
tokens_prompt: 77
tokens_completion: 2027
native_tokens_prompt: 81
native_tokens_completion: 2181
native_tokens_completion_images: (null)
native_tokens_reasoning: 1415
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.0011067
router: (null)
provider_responses:
  0:
    endpoint_id: "576fca5b-f09c-4bda-b560-9989c39ac3ab"
    id: "oiXpdSt-zfFMX-9f993ee290e8dbe1"
    is_byok: false
    latency: 301
    model_permaslug: "google/gemma-4-31b-it-20260402"
    provider_name: "Together"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778419272-kOEBB6qFmEKapQ0pznOL"
session_id: (null)
api_type: "completions"
id: "gen-1778419272-kwfxez30Ef5HWJKFHDEN"
upstream_id: "oiXpdSt-zfFMX-9f993ee290e8dbe1"
total_cost: 0.0011067
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Together"
response_cache_source_id: (null)
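The token and timing fields in the Meta panel can be combined into a few derived figures. The snippet below is a minimal sketch, not part of NC Bench: the variable names are illustrative, and it assumes that native_tokens_completion already includes the reasoning tokens and that generation_time is reported in milliseconds. Neither assumption is stated on this page.

```python
# Values copied from the Meta panel of this run.
native_tokens_completion = 2181
native_tokens_reasoning = 1415
generation_time_ms = 29987

# Assumption: reasoning tokens are counted inside native_tokens_completion,
# so the visible answer is whatever remains after subtracting them.
visible_tokens = native_tokens_completion - native_tokens_reasoning  # 766

# Share of the completion budget spent on reasoning rather than the scene.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Assumption: generation_time is in milliseconds.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under those assumptions, roughly two thirds of the generated tokens went to reasoning rather than to the scene itself.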
Evaluation details
Result: 0.00%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false; words: 386

Result: 0.00%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 0.00%, Deviation: 70.00%
Meta Data: neededClean: false; wordsTotal: 401; wordsDialogue: 0

Overall: 0.0000%
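The wordsDialogue value of 0 is consistent with an evaluator that counts only words enclosed in quotation marks: the model response contains no quotation marks, so none of its speech would register as dialogue. The evaluator's real implementation is not shown on this page. The sketch below is one plausible way such a check could work, with hypothetical function names and a simple whitespace word split, so it is not expected to reproduce the exact counts reported above.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a passage.

    Assumption: dialogue is any span enclosed in straight or curly
    double quotation marks. The real evaluator may use other rules.
    """
    quoted_spans = re.findall(r'"([^"]*)"|“([^”]*)”', text)
    # Each match is a pair of groups; only one of the two is non-empty.
    dialogue_words = sum(len((a or b).split()) for a, b in quoted_spans)
    total_words = len(text.split())
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio: float, target: float = 0.70) -> float:
    """Absolute distance between the measured ratio and the target."""
    return abs(target - ratio)


# Speech without quotation marks, as in the model response for this run.
unquoted = "Look at the blueprint, Bill. It's a crayon drawing on a napkin, Molly."
# The same kind of line with conventional quotation marks.
quoted = 'Molly sighed. "Look at the blueprint, Bill."'

print(dialogue_ratio(unquoted))
print(dialogue_ratio(quoted))
print(deviation_from_target(dialogue_ratio(unquoted)[2]))
```

With a 70% target, a measured ratio of 0 gives a deviation of 70 percentage points, which matches the "Deviation: 70.00%" figure reported for this run.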