NC Bench
Run: 2026-04-24-01KQ023XTNXS4SEA1YE1V5DXR0
Run Details
Model:
Z.AI GLM 5.1
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
2m 57s
Cost:
$0.023918
Provider:
Parasail
Evaluation summary
Score | Evaluator
0.21% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.1070% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Bill tapped the crayon-drawn map. "The entry point is the bouncy castle." Molly rolled her eyes. "The bouncy castle. Of course. And how exactly do you propose we breach the inflatable fortress without waking the entire neighborhood?" "We go in barefoot. Sneakers squeak on the vinyl." "Barefoot? Bill, that sounds like a static electricity nightmare. We'll stick to the plastic like Velcro." "Not if we coat our feet in butter." "Butter. You want to butter our feet and hop into a bouncy castle." "It eliminates friction, Molly. It's basic physics." "It's basic insanity. What about the kids already in there? They're jumping around like caffeinated chimpanzees. One rogue kick to the jaw and you're out cold." "Which is exactly why we wait for the cake. When the cake comes out, every single kid evacuates the castle. It creates a temporary vacuum. We slip in the back entrance, bounce over the interior wall, and land cleanly in the hedges beside the gift table." "Slip being the operative word." "Trust me. Once we're past the castle perimeter, we hit the primary obstacle. The clown." Molly shuddered visibly. "Clowns are wildly unpredictable. Does he make balloon animals?" "He makes swords. Very aggressive swords." "Balloon swords, Bill. They are literal balloons." "Have you ever been whacked with a twisted balloon sword? It stings. I still have a welt on my shin from the Henderson birthday job." "Fine. I'll handle the clown. I'll bring a pair of sewing scissors. Pop his sword, pop his fragile ego." "Brutal. I like it. While you neutralize the clown threat, I'll be at the gift table. We need a systematic approach to the presents. We sort them immediately by weight and sound." "Sound?" "If it rattles, it's Lego. We immediately dump the Lego. Too much heat, impossible to fence on the secondary market. If it's heavy, it's probably a book. Dump the books." "Kids don't want books anyway. What about the soft, light packages?" "Those are the wildcards. Could be cash in a card. 
Could be socks. We take the gamble." "What about the piñata?" "What about it?" "It's a blindfolded child swinging a wooden baseball bat at a suspended paper donkey. That's a severe hazard to our exit route." "I'll wear a helmet." "You're going to wear a helmet to a five-year-old's birthday party." "Concealed under an oversized party hat, obviously. Listen, Molly, this isn't my first rodeo. We get in, we grab the Polly Pockets, we get out before the sugar crash hits." "If we get trapped in the ball pit, I'm leaving you behind." "Nobody gets trapped in a ball pit. You just swim through it." "You can't swim in a ball pit, Bill. You sink. It's colorful quicksand." "We'll use the butter." "We are not buttering our entire bodies!" "Just the torso! Look, do you want the loot or not? I hear the birthday boy is getting a motorized Power Wheels Jeep." Molly paused, her fingers hovering over the blueprints. "Power Wheels?" "Four wheels, battery powered, chrome accents." "Fine. But I'm bringing a flare gun for the ball pit." "Understood. Now, let's talk about our camouflage. Do you prefer pink taffeta or purple?"
created_at
"2026-04-24T15:35:44.734Z"
4/24/2026, 3:35:44 PM
model
"z-ai/glm-5.1-20260406"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
392
moderation_latency
(null)
generation_time
177312
tokens_prompt
77
tokens_completion
5829
native_tokens_prompt
66
native_tokens_completion
5415
native_tokens_completion_images
(null)
native_tokens_reasoning
5023
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0239184
router
(null)
provider_responses
0
endpoint_id
"9b3b28f6-5691-483e-946a-c99728f61737"
id
"852d2d6973d94b29bae0d93a93f450d6"
is_byok
false
latency
391
model_permaslug
"z-ai/glm-5.1-20260406"
provider_name
"Parasail"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777044944-HBAaguYbdbwfB0w4gKpd"
session_id
(null)
api_type
"completions"
id
"gen-1777044944-NR7qJY58mc29G6w3V9U3"
upstream_id
"852d2d6973d94b29bae0d93a93f450d6"
total_cost
0.0239184
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Parasail"
response_cache_source_id
(null)
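The usage fields above can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming that generation_time is in milliseconds (177312 ms agrees with the reported 2m 57s duration) and that native_tokens_reasoning is counted inside native_tokens_completion. Neither assumption is confirmed by this page, and summarize_usage is a hypothetical helper, not part of NC Bench.

```python
def summarize_usage(meta: dict) -> dict:
    """Derive a few summary figures from the run's usage metadata.

    Assumptions (not confirmed by the source page):
      - generation_time is expressed in milliseconds
      - reasoning tokens are a subset of native completion tokens
    """
    completion = meta["native_tokens_completion"]
    reasoning = meta["native_tokens_reasoning"]
    seconds = meta["generation_time"] / 1000.0

    return {
        # Tokens left over for the visible answer after reasoning.
        "visible_tokens": completion - reasoning,
        # Fraction of the completion budget spent on reasoning.
        "reasoning_share": reasoning / completion,
        # Overall generation throughput, reasoning included.
        "tokens_per_second": completion / seconds,
    }


# Values copied from the metadata listed above.
run_meta = {
    "native_tokens_completion": 5415,
    "native_tokens_reasoning": 5023,
    "generation_time": 177312,
}

summary = summarize_usage(run_meta)
print(summary["visible_tokens"])             # 392
print(f"{summary['reasoning_share']:.2%}")   # 92.76%
```

Under these assumptions, most of the completion tokens in this run went to reasoning rather than to the scene itself.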
Evaluation details
Result | Evaluator | Details | Meta Data
0.21% | Matches word count | n/a | neededClean: false, words: 528
0.00% | Dialogue to Total Word Ratio | Ratio: 96.05%, Deviation: 26.05% | neededClean: false, wordsTotal: 531, wordsDialogue: 510
0.1070% | Overall
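The reported ratio and deviation follow directly from the word counts in the Meta Data column. Below is a minimal sketch, assuming that dialogue means text between straight double quotes and that words are whitespace-separated tokens. The evaluator's real tokenization is not shown on this page (its two word counts, 528 and 531, already differ), and dialogue_ratio and deviation_from_target are hypothetical names.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: 70% dialogue


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for straight-quoted dialogue."""
    total_words = len(text.split())
    # Everything between a pair of straight double quotes counts as dialogue.
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio: float, target: float = TARGET_RATIO) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Small illustration on the first two sentences of the response.
sample = 'Bill tapped the crayon-drawn map. "The entry point is the bouncy castle."'
print(dialogue_ratio(sample)[:2])  # (7, 12)

# Reproducing the reported figures from the evaluator's own counts.
reported_ratio = 510 / 531
print(f"{reported_ratio:.2%}")                         # 96.05%
print(f"{deviation_from_target(reported_ratio):.2%}")  # 26.05%
```

The page does not show how the deviation is converted into the 0.00% score, so that step is left out of the sketch.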