NC Bench
Model: Z.AI GLM 4.6
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Run ID: 2026-02-08-01KGYZ585G49H06RRB2YBK38C2
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
The van smelled of stale coffee and regret. Molly peered through binoculars at the two-story house across the street, where a brightly coloured bouncy castle shuddered like a giant, wounded jellyfish. The sound of distant, shrieking children leaked through the closed windows.
“The target is a soft touch,” she said, lowering the binoculars. “Security is minimal—one distracted mother and a dad manning the barbecue. The primary asset, a mountain of presents, is located in the living room, clearly visible through the bay window.”
Bill, who was attempting to balance a sugar packet on its edge, looked up. “The big one with the dinosaur wrapping paper? I bet that’s a model rocket.”
“Focus, Bill,” Molly snapped, snatching the binoculars back. “We’re not window-shopping, we’re executing. Phase one: infiltration. I’ve procured disguises.”
“Disguises? For a kids’ party? What are we going as, the weird, childless neighbours?”
“Better,” Molly said, a grim smile playing on her lips. “Clowns.”
Bill’s face fell. “Clowns? Molly, no. Absolutely not. I have a thing about clowns. And small children. They’re sticky and they ask too many questions.”
“It’s perfect camouflage,” she insisted. “We blend in. We bring a giant, fake gift, say we’re ‘Bippo and Zippo,’ and entertain the little blighters for five minutes. Then, during the chaos of the ‘Pin the Tail on the Donkey’ game, we execute the grab.”
“My nose will honk if I breathe too hard,” Bill whined. “People will try to pull on my oversized tie. I saw a documentary about the psychological toll of clowning.”
“It’s a calculated risk for a major score. Think of the resale value on a pristine, unopened ‘Giga-Rex T-Rex.’ We’re talking triple-digit profits, Bill.”
He sighed, slumping in his seat. “Fine. But do we at least get any cake?”
Molly ignored him, her gaze fixed on the house. “Phase two begins at 3:15 p.m. That’s when the ice cream truck arrives. Maximum distraction.”
created_at: "2026-02-08T15:51:40.345262+00:00" (Sunday, February 8th, 2026 at 3:51:40 PM GMT+00:00)
model: "z-ai/glm-4.6"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 429
moderation_latency: (null)
generation_time: 62648
tokens_prompt: 61
tokens_completion: 1914
native_tokens_prompt: 66
native_tokens_completion: 1924
native_tokens_completion_images: (null)
native_tokens_reasoning: 1491
native_tokens_cached: 64
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0033537399999552
router: (null)
provider_responses: 0
id: "chatcmpl-RAWO3SETudbedk8YqVnhchbw"
status: 200
is_byok: false
latency: 128
endpoint_id: "c8607edc-8577-4199-b03d-1213da4743dc"
provider_name: "DeepInfra"
model_permaslug: "z-ai/glm-4.6"
api_type: "completions"
id: "gen-1770565836-AjQBPge1dm6pdVIP6Bb8"
upstream_id: "chatcmpl-RAWO3SETudbedk8YqVnhchbw"
total_cost: 0.0033537399999552
cache_discount: 0.0000224000000448
upstream_inference_cost: 0
provider_name: "DeepInfra"
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 321
0.0000% | Dialogue to Total Word Ratio | Ratio: 69.16%, Deviation: 19.16% | neededClean: false; wordsTotal: 321; wordsDialogue: 222
0.0000%
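The reported ratio and deviation follow directly from the word counts: 222 dialogue words out of 321 total is 69.16%, which is 19.16 percentage points above the 50% target. The sketch below reproduces that arithmetic. It is not the NC Bench evaluator: the function names are hypothetical, and the counting rules (whitespace-separated words, dialogue taken as text between curly double quotes) are assumptions that may differ from what the benchmark actually does.

```python
import re


def count_words(text):
    """Count whitespace-separated words (assumed counting rule)."""
    return len(text.split())


def dialogue_words(text):
    """Count words inside curly double quotes (assumed dialogue rule)."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(len(span.split()) for span in spans)


def ratio_report(words_dialogue, words_total, target=0.5):
    """Return (ratio %, absolute deviation from target in percentage points)."""
    ratio = words_dialogue / words_total
    return round(ratio * 100, 2), round(abs(ratio - target) * 100, 2)


# Word counts reported in the evaluation metadata for this run
ratio_pct, deviation_pct = ratio_report(words_dialogue=222, words_total=321)
print(ratio_pct, deviation_pct)  # 69.16 19.16

# Shortfall against the requested 500 words
shortfall_pct = round((500 - 321) / 500 * 100, 2)
print(shortfall_pct)  # 35.8
```

The counting helpers can be tried on a short excerpt, for example `dialogue_words("“Focus, Bill,” Molly snapped.")`, but they are not guaranteed to reproduce the 321 and 222 figures, since the evaluator's tokenization is not documented here.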