NC Bench
Run: 2026-03-12-01KKHZMH0T8RZBDME7HP3HRXP3
Run Details
Model: Grok 4.20 (Beta)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 3.9s
Cost: $0.003866
Provider: xAI
Evaluation summary
Score | Evaluator
99.38% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.6885% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her oversized clown wig and peered through the chain-link fence at the backyard chaos. A dozen sugar-high kids darted around a bouncy castle while a piñata shaped like a cartoon dinosaur swung lazily from a tree. "This is the dumbest job we've ever taken, Bill. We're about to ruin a six-year-old's birthday for a pile of plastic crap and gift cards." Bill crouched beside her, adjusting his fake mustache that refused to stay stuck. "Hey, the birthday boy's uncle is a tech billionaire. Those presents are loaded with the latest gadgets. We're talking unopened drones, VR headsets, limited-edition robot dogs. We fence that junk and we're set for months. Besides, you said you wanted something low-risk after the museum fiasco." "Low-risk? There are twenty screaming witnesses under four feet tall," Molly hissed, checking her watch. The party was in full swing. A mother in a unicorn onesie was leading a conga line. "And don't think I didn't notice you volunteered us for the clown disguises. You just wanted an excuse to wear those stupid shoes." "These shoes are professional," Bill said, kicking one giant red floppy shoe against the fence. It made a sad squeaking sound. "Plus, kids love clowns. We'll blend right in. I go in as Giggles the Clown, you come in as Bubbles. We do the magic trick with the disappearing handkerchiefs like we practiced, then while they're all distracted by the cake, we stuff the presents into the giant gift sack. Easy peasy." Molly rolled her eyes so hard it looked painful. "Your magic trick is literally just dropping the handkerchief on the ground and saying 'ta-da.' And what about the parents? One soccer mom with a phone and we're viral before we reach the getaway van." Bill grinned, pulling two colorful balloons from his pocket. "That's why we release the balloon animals at the exact moment. Chaos theory, babe. Kids go feral over balloon swords. We'll be in and out in seven minutes. Remember the signal? 
You yell 'Who wants ice cream?' and I hit the piñata with the hidden switchblade I taped to my wrist. Candy everywhere, total pandemonium." "You're going to pop a piñata with a switchblade at a children's party," Molly said flatly. "I'm starting to think the museum security guards were less insane than you." "Come on, Mol. Think of the look on that rich kid's face when he realizes his brand new gaming laptop is gone. We'll be legends in the thieving community. The crew that stole Christmas in July." Bill checked the fence gate. It was unlocked, as their inside source had promised. Molly sighed, adjusting her polka-dot bowtie. "Fine. But if we get recognized by some kindergartener with a good memory, I'm selling you out and starting a legitimate clown college. Let's go ruin some childhoods." They slipped through the gate, giant shoes squeaking in unison as they approached the oblivious party guests with matching criminal smiles plastered across their faces.
created_at: "2026-03-12T21:35:55.427Z" (3/12/2026, 9:35:55 PM)
model: "x-ai/grok-4.20-beta-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 324
moderation_latency: (null)
generation_time: 3477
tokens_prompt: 77
tokens_completion: 753
native_tokens_prompt: 182
native_tokens_completion: 622
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 128
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0038656
router: (null)
provider_responses:
  0:
    endpoint_id: "33f49f24-a0de-4dcf-872d-ca514e83fbae"
    id: "d8e96108-271f-d1c8-5b08-ee14590e4604"
    is_byok: false
    latency: 292
    model_permaslug: "x-ai/grok-4.20-beta-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773351355-KoKcigWP7gwdsD43AvVf"
upstream_id: "d8e96108-271f-d1c8-5b08-ee14590e4604"
total_cost: 0.0038656
cache_discount: 0.0002304
upstream_inference_cost: 0
provider_name: "xAI"
Evaluation details
Result | Evaluator | Details | Meta Data
99.38% | Matches word count | n/a | neededClean: false, words: 495
0.00% | Dialogue to Total Word Ratio | Ratio: 68.45%, Deviation: 38.45% | neededClean: false, wordsTotal: 504, wordsDialogue: 345
49.6885% | Overall
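The reported ratio and deviation can be reproduced from the word counts in the evaluation details. The sketch below is illustrative only and is not the benchmark's evaluator. The function names, the 30% target constant (taken from the scenario text), and the quote-based counting rule in count_dialogue_words are assumptions. Treating the overall score as the mean of the two evaluator scores is an inference from the displayed numbers, not something the page states. The page also does not show how a 38.45-point deviation maps to a 0.00% score, so that step is not modelled.

```python
import re

# Assumed target, taken from the scenario "Write 500 words with 30% dialogue".
TARGET_DIALOGUE_PCT = 30.0


def count_dialogue_words(text: str) -> int:
    """Count words inside straight double quotes.

    Hypothetical counting rule: the real evaluator's tokenization is not
    shown on the page and may differ.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in quoted_spans)


def dialogue_pct(words_dialogue: int, words_total: int) -> float:
    """Dialogue words as a percentage of all words."""
    return 100.0 * words_dialogue / words_total


def deviation_pct(ratio_pct: float, target_pct: float = TARGET_DIALOGUE_PCT) -> float:
    """Absolute distance, in percentage points, from the target ratio."""
    return abs(ratio_pct - target_pct)


# Counts reported in the evaluation details: wordsDialogue=345, wordsTotal=504.
ratio = dialogue_pct(345, 504)
deviation = deviation_pct(ratio)

# Assumption: the overall score is the plain mean of the two evaluator scores.
# Using the rounded 99.38% gives 49.69%; the page shows 49.6885%, presumably
# computed from an unrounded word-count score.
overall = (99.38 + 0.00) / 2

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
print(f"Overall (mean of rounded scores): {overall:.2f}%")
```

The two evaluators report different totals for the same response (495 words versus 504), which suggests they split words differently. A single splitting rule such as the one in count_dialogue_words should not be expected to match both.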