NC Bench
Run: 2026-02-08-01KGYY5PAC7YYH71J0RN5PFKR2
Run Details
Model: Z.AI GLM 4.6
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 56.5s
Cost: $0.004172
Provider: DeepInfra
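The duration and cost above can be related to the token counts recorded in the Meta section of this page (tokens_completion, native_tokens_reasoning, generation_time). The following is a minimal sketch, not the benchmark's own code. It assumes that generation_time is in milliseconds and that the reasoning tokens are included in the completion total; the page states neither.

```python
# Values copied from the run metadata on this page.
tokens_completion = 2394
tokens_reasoning = 2193
generation_time = 56364  # assumption: milliseconds

# Assumption: reasoning tokens are a subset of the completion tokens,
# so the remainder is what reaches the visible response.
visible_tokens = tokens_completion - tokens_reasoning

# Share of the completion budget spent on reasoning.
reasoning_share = tokens_reasoning / tokens_completion

# Overall generation throughput in tokens per second.
tokens_per_second = tokens_completion / (generation_time / 1000)

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumptions only 201 of the 2394 completion tokens belong to the visible scene, which would explain why a roughly 185-word answer took almost a minute to generate.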
Evaluation summary
Matches word count: 43.38%
Dialogue to Total Word Ratio: 93.28%
Overall: 68.3305%
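The overall score is consistent with an unweighted mean of the two evaluator scores. The sketch below checks that arithmetic. The averaging rule is an assumption inferred from the numbers on this page, not a documented NC Bench formula, and the inputs are the rounded values as displayed.

```python
from statistics import mean

# Per-evaluator scores as displayed on the page (rounded to two decimals).
scores = {
    "Matches word count": 43.38,
    "Dialogue to Total Word Ratio": 93.28,
}

# Assumption: the overall score is the unweighted mean of the evaluator scores.
overall = mean(list(scores.values()))

print(f"Overall (assumed mean): {overall:.2f}%")
```

The small gap to the displayed 68.3305% would be expected if the page averages unrounded evaluator scores.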
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
They huddled behind the minivan, the scent of ozone and stale goldfish crackers thick in the air. “Security is tight,” Bill whispered, adjusting his ski mask. “One parent on cake patrol, another guarding the present table like it’s Fort Knox.” “Bill, the ‘patrol’ is a dad wearing a cone hat. Our biggest threat is a toddler with a juice box and no regard for personal space.” “The piñata is our window. When it breaks, chaos ensues. That’s our move.” “So you’re suggesting I create a diversion by getting whacked with a stick by a seven-year-old named Timmy?” “It’s a tactical sacrifice for the loot. I need the big box, the one with the drone.” “You can’t just grab the biggest present! It’s the first thing they’ll notice. We grab a few gift bags, blend in, and walk out calmly.” “Amateur hour. We need a signature. A calling card.” “A calling card? What are we going to do, leave a ransom note written in crayon?” “No,” Bill said, his eyes gleaming. “We go in loud. We sing ‘Happy Birthday.’ It’s the ultimate auditory smokescreen.”
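To illustrate how a dialogue-to-total word ratio could be measured on a response like this one, here is a minimal sketch. It is not the evaluator's implementation: the function names are hypothetical, words are split on whitespace, and dialogue is taken to be anything between a pair of double quotation marks (curly or straight). The evaluator's real tokenization is not shown on this page, and its two evaluators report slightly different totals (183 and 185 words), so counts from this sketch may not match either.

```python
import re

# Assumption: dialogue is any text between paired curly or straight double quotes.
DIALOGUE = re.compile(r'“([^”]*)”|"([^"]*)"')


def count_words(text):
    """Count whitespace-separated tokens."""
    return len(text.split())


def dialogue_stats(text):
    """Return (dialogue_words, total_words) for a passage of prose."""
    spoken = " ".join(curly or straight for curly, straight in DIALOGUE.findall(text))
    return count_words(spoken), count_words(text)


sample = "They hid. “Security is tight,” Bill whispered. “Go.”"
dialogue_words, total_words = dialogue_stats(sample)
print(dialogue_words, total_words)  # prints "4 8"
```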
created_at
"2026-02-08T15:34:19.928426+00:00"
2/8/2026, 3:34:19 PM
model
"z-ai/glm-4.6"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
241
moderation_latency
(null)
generation_time
56364
tokens_prompt
61
tokens_completion
2394
native_tokens_prompt
66
native_tokens_completion
2394
native_tokens_completion_images
(null)
native_tokens_reasoning
2193
native_tokens_cached
64
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0041715399999552
router
(null)
provider_responses
0
id
"chatcmpl-RfN3IjLbUB7B14l4hGKDaqYr"
status
200
is_byok
false
latency
72
endpoint_id
"c8607edc-8577-4199-b03d-1213da4743dc"
provider_name
"DeepInfra"
model_permaslug
"z-ai/glm-4.6"
api_type
"completions"
id
"gen-1770564802-v0yJ9XuRsqEgwzFrvY7u"
upstream_id
"chatcmpl-RfN3IjLbUB7B14l4hGKDaqYr"
total_cost
0.0041715399999552
cache_discount
0.0000224000000448
upstream_inference_cost
0
provider_name
"DeepInfra"
Evaluation details
Matches word count: 43.38%
  Details: n/a
  Meta Data: neededClean = false, words = 183
Dialogue to Total Word Ratio: 93.28%
  Details: Ratio: 84.86%, Deviation: 5.14%
  Meta Data: neededClean = false, wordsTotal = 185, wordsDialogue = 157
Overall: 68.3305%
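The ratio and deviation reported for the dialogue evaluator can be reproduced from its word counts and the 90% target in the scenario. This is a minimal sketch with a hypothetical helper name. It assumes the deviation is the absolute difference from the target in percentage points. How the evaluator converts that deviation into the 93.28% score is not shown on this page, so it is left out.

```python
def dialogue_ratio(words_dialogue, words_total):
    """Percentage of words that are dialogue; 0.0 for empty text."""
    if words_total == 0:
        return 0.0
    return 100 * words_dialogue / words_total


target = 90.0  # from the scenario: 90% dialogue
ratio = dialogue_ratio(157, 185)  # wordsDialogue, wordsTotal from the meta data

# Assumption: deviation is the absolute gap to the target, in percentage points.
deviation = abs(target - ratio)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 84.86%, Deviation: 5.14%"
```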