NC Bench
Run: 2026-04-24-01KQ0490AZF4GZ8FVJ1AJDBVDM
Run Details
Model: Z.AI GLM 5.1
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 27.6s
Cost: $0.002674
Provider: DeepInfra
Evaluation summary
Score | Evaluator
81.27% | Matches word count
0.07% | Dialogue to Total Word Ratio
40.6694% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly leaned back in the vinyl booth and slid a crumpled flyer across the table. Bill picked it up, squinting at the sparkly font. "Madison's Magical Unicorn Spectacular?" Bill read. "What is this?" "That, my friend, is our next score." "You want to steal a unicorn?" "No, you absolute walnut. The presents. It's a children's party. Rich kids, rich parents, rich gifts." Bill scratched his chin. "So we're robbing children now?" "We're reallocating surplus from people who still believe in the tooth fairy." "That's dark, Molly." "That's practical. You see the guest list? The Hendersons alone bought a two-hundred-dollar Lego set. The Chens got that remote control dinosaur. We're talking thousands in unwrapped merchandise." "And how exactly do we get in? We don't exactly look like we belong at a unicorn spectacular." Molly pulled two headbands from her bag, each topped with a sparkly horn. "We go as entertainers. The princess and the wizard. I booked us this morning." "You booked us? As what?" "As performers. You're doing balloon animals. I'm doing face painting." "I don't know how to make balloon animals." "You're a grown man with two hands. How hard can it be?" "Extremely hard! Have you seen those things? They pop. They squeak. Children cry." "Children cry anyway. That's kind of their whole deal." Bill stared at the flyer again. "What's the extraction plan?" "Simple. We perform, we mingle, we identify the gift table, and during the grand finale, which I'm told involves a real live pony, every parent will be distracted taking photos. That's when we load the presents through the kitchen into my van." "Your van has a unicorn mural on the side." "Exactly. Nobody questions a van covered in rainbows at a children's party." Bill drummed his fingers on the table. "What if the pony situation goes sideways?" "Then we improvise." "I hate improv." "You'll love this. Trust me. 
The parents are going to be so busy arguing about whose kid got more attention from the princess that we could walk out with the entire gift table and the birthday cake." "What about the kids? Won't they notice?" "Molly, they're five. They think the party is happening because they exist. They're not tracking inventory." "What about Madison?" "What about her?" "Isn't it wrong to make a child cry on her birthday?" Molly paused. For a moment, something flickered behind her eyes. Guilt, maybe. Or just gas from the diner coffee. "We'll leave one present. The biggest one. From the Hendersons." "The Lego set?" "The Lego set. Madison gets the Lego set, we get everything else. Everyone wins." "Except every other child at the party." "Collateral damage, Bill. Now put on your horn and start practicing your poodle." Bill placed the headband on his head. The sparkly horn listed sideways like a small, defeated ship. "I look ridiculous." "You look committed. That's what separates the amateurs from the pros." "I don't think pros rob birthday parties." "Then we're innovators. Now drink your coffee. Madison's party starts in four hours and you still think a balloon giraffe is advanced."
created_at: "2026-04-24T16:13:28.297Z" (4/24/2026, 4:13:28 PM)
model: "z-ai/glm-5.1-20260406"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 579
moderation_latency: (null)
generation_time: 27539
tokens_prompt: 77
tokens_completion: 904
native_tokens_prompt: 66
native_tokens_completion: 752
native_tokens_completion_images: (null)
native_tokens_reasoning: 64
native_tokens_cached: 32
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00267426
router: (null)
provider_responses:
  0:
    endpoint_id: "e888c44c-6a07-4a9e-bc56-cb26511865ec"
    id: "chatcmpl-Re8PvEssezsiwlL1k3nqmMoT"
    is_byok: false
    latency: 100
    model_permaslug: "z-ai/glm-5.1-20260406"
    provider_name: "DeepInfra"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777047208-Fj1QshlbAzDIOD7vLAt0"
session_id: (null)
api_type: "completions"
id: "gen-1777047208-s7Em10d60NxgYAsNhXgj"
upstream_id: "chatcmpl-Re8PvEssezsiwlL1k3nqmMoT"
total_cost: 0.00267426
cache_discount: 0.000027039
upstream_inference_cost: 0
provider_name: "DeepInfra"
response_cache_source_id: (null)
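As a rough reading of the timing and token fields above, completion throughput can be estimated by dividing native_tokens_completion by generation_time. The sketch below assumes generation_time is reported in milliseconds, which is consistent with the 27.6s Duration shown for this run but is not stated on the page. The function name is illustrative and is not part of NC Bench or any provider API.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Estimate completion throughput.

    Assumption: generation_time_ms is in milliseconds (not confirmed by the page).
    """
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from this run's metadata:
# native_tokens_completion = 752, generation_time = 27539
tps = tokens_per_second(752, 27539)
print(f"{tps:.1f} tokens/s")
```

Using native_tokens_completion rather than tokens_completion is a choice made here for illustration; the two counts differ (752 vs. 904) and the page does not say which one the provider bills or measures against.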
Evaluation details
Result | Evaluator | Details | Meta Data
81.27% | Matches word count | n/a | neededClean: false, words: 512
0.07% | Dialogue to Total Word Ratio | Ratio: 82.10%, Deviation: 12.10% | neededClean: false, wordsTotal: 514, wordsDialogue: 422
40.6694% | Overall
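The ratio and deviation figures above can be reproduced from the reported word counts. The sketch below is a minimal arithmetic check, not the evaluator's actual implementation: it assumes the ratio is wordsDialogue divided by wordsTotal, that the deviation is the absolute difference from the 70% target in the scenario, and that the overall score is the plain mean of the two evaluator scores. How the 0.07% score is derived from the deviation is not shown on the page and is not modeled here.

```python
TARGET_DIALOGUE_PCT = 70.0  # from the scenario: "70% dialogue"


def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_pct(ratio_pct: float, target_pct: float = TARGET_DIALOGUE_PCT) -> float:
    """Absolute distance from the target ratio, in percentage points."""
    return abs(ratio_pct - target_pct)


# Values copied from the evaluation details.
ratio = dialogue_ratio_pct(words_dialogue=422, words_total=514)
deviation = deviation_pct(ratio)

# Assumption: the overall score is the unweighted mean of the two evaluators.
overall = (81.27 + 0.07) / 2

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%, Overall: {overall:.2f}%")
```

With the rounded scores shown on the page, the mean comes out close to the reported 40.6694%; the small difference is consistent with the page averaging unrounded values.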