NC Bench
Model: Z.AI GLM 4.7 Flash
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Run ID: 2026-02-08-01KGYXNGK5VARS2Z51A74XXSJG
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Have you got the map?" Molly asked, shaking a crumpled sheet of glossy cardstock at him. "I need to know exactly where to plant the explosives." Bill squinted. "The only thing on this is a picture of a cat. Is the cat the interior point?" "It's a roadmap, Bill. A masterpiece of evasion. Look here, the main gate is guarded by the terrifying beast known as a nanny. Do we have the shark repellent?" "I put it in the glove box. Don't ask about the glove box, just trust me. What about the loot?" "The loot is heavy. The Bennetts got a second table just for wrapping paper. It’s almost as big as the Christmas one." "That’s wasteful. That’s what we’re here to correct. We aren’t just stealing presents, Bill; we are correcting fiscal negligence." "Exactly. Listen, the getaway plan. Do we go through the back?" "No. Too much hedge. I mowed that hedge yesterday and I’m still finding ticks in my ears. We need a vehicle that can handle the terrain." "A van?" "Too conspicuous. A van is just a moving crime scene. We need something... airy. Something that blends." "A hot air balloon?" "Too much wind. Besides, I get carsick. What about a bicycle?" "With two people? You’ll snap the frame." "No. A tricycle. The children’s tricycle. Think about it! We strap the bags to the seat, we pedal into the sunset, whispering sinister lullabies to the passing parents. They’ll think it’s a diaper service strike." "You’re a menace." "I’m an artist of chaos. Now put on the spiderman pajamas. The blue ones. You look like a bruised plum." "These are the red ones. And I refuse to wear a cape. It’ll snag on the briars." "Just zip your jacket over it, you fur-ball. The wig goes on last. We need you to look like a magic clown who ate a balloon animal." "I’d rather die. That wig itches." "Just like a real clown, darling. Now, stand by the window. I’m checking for security cameras. If you see that blue-haired lady from the school board, duck. We don’t need a hostage situation." "What’s the signal?" "Whistle. 
The tune from Jaws." "Are you serious? I’ll attract the cops immediately." "I’m serious. Now, do you have the burlap sacks?" "Check." "The bandanas?" "Check." "The bolt cutters?" "Check." "Excellent. Once the 'magician' starts the balloon animals, we strike. I want you to focus on the sparkling stuff. The make-up, the iPhone, the gold coins." "What about the Lego?" "The Lego stays. It’s cubic and unappealing. We only want the shine." "You’re a sick man, Molly." "I’m a thief with a vision. Let’s go turn a birthday party into a crime wave."
created_at: "2026-02-08T15:25:42.785972+00:00" (Sunday, February 8th, 2026 at 3:25:42 PM GMT+00:00)
model: "z-ai/glm-4.7-flash-20260119"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 181
moderation_latency: (null)
generation_time: 69168
tokens_prompt: 61
tokens_completion: 2384
native_tokens_prompt: 66
native_tokens_completion: 2393
native_tokens_completion_images: (null)
native_tokens_reasoning: 1763
native_tokens_cached: 64
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0009579600000128
router: (null)
provider_responses:
  0:
    id: "chatcmpl-R7D3cHYpivW8HGz4IzteMBmT"
    status: 200
    is_byok: false
    latency: 60
    endpoint_id: "6efb5994-8fcb-4300-9c64-ba393193f5f6"
    provider_name: "DeepInfra"
    model_permaslug: "z-ai/glm-4.7-flash-20260119"
api_type: "completions"
id: "gen-1770564272-cY3BDU4Vy2BeDxHjcBGO"
upstream_id: "chatcmpl-R7D3cHYpivW8HGz4IzteMBmT"
total_cost: 0.0009579600000128
cache_discount: 0.0000031999999872
upstream_inference_cost: 0
provider_name: "DeepInfra"
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 453
0.0000% | Dialogue to Total Word Ratio | Ratio: 97.13%, Deviation: 27.13% | neededClean: false; wordsTotal: 453; wordsDialogue: 440
0.0000%
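The reported ratio and deviation can be checked by hand: 440 dialogue words out of 453 total is about 97.13%, which is 27.13 percentage points away from the 70% target. The sketch below shows one way such a check might be computed. It is not NC Bench's actual evaluator: the function names, the quote-matching rule (text between straight or curly double quotes counts as dialogue), and the whitespace-based word count are all assumptions made for illustration.

```python
import re
from typing import Tuple

# Assumption: dialogue is any text enclosed in straight or curly double quotes.
# The real evaluator may delimit dialogue differently.
DIALOGUE_RE = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (a simple, assumed word definition)."""
    return len(text.split())


def dialogue_ratio(text: str) -> Tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio_percent) for a scene."""
    # findall yields (straight_match, curly_match) pairs; one of them is empty.
    spans = (a or b for a, b in DIALOGUE_RE.findall(text))
    dialogue = sum(count_words(span) for span in spans)
    total = count_words(text)
    ratio = 100.0 * dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation(ratio_percent: float, target_percent: float = 70.0) -> float:
    """Absolute distance, in percentage points, from the target ratio."""
    return abs(ratio_percent - target_percent)


if __name__ == "__main__":
    # Reproduce the figures reported in the evaluation table from its counts.
    reported_ratio = 100.0 * 440 / 453
    print(f"Ratio: {reported_ratio:.2f}%, "
          f"Deviation: {deviation(reported_ratio):.2f}%")
```

Because the quote rule and word definition are guesses, running `dialogue_ratio` on the response text above will not necessarily return exactly 440 and 453; only the arithmetic from the reported counts is certain to match.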