NC Bench
Run: 2026-03-12-01KKHYRKW7379BZ4EW3K2CX07N
Run Details
Model: Grok 4.20 (Beta)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 4.0s
Cost: $0.003753
Provider: xAI
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her oversized sunglasses and peered through the bakery window at the colorful chaos inside the community hall. "This is the dumbest idea you've ever had, Bill. And you've had some real winners." Bill grinned, rubbing his hands together like a cartoon villain with arthritis. "Come on, Mol. Think about it. Twenty screaming kids, one distracted mom in a paper crown, and a mountain of brightly wrapped presents just sitting there like they're begging to be liberated. It's practically charity work." "Charity? For who, exactly? The Bank of Bill's Gambling Debts?" Molly hissed, pulling him behind a parked minivan covered in soccer ball stickers. "We're not stealing from children. Even I have standards." "They're not using the presents yet," Bill whispered, his voice taking on that wheedling tone that always meant trouble. "The party's just starting. We slip in during musical chairs, create a diversion with the piñata, and poof. In and out. Like ninjas. Wealthy ninjas." Molly stared at him. "Ninjas who steal Barbie dreamhouses and plastic dinosaurs. We're hitting rock bottom here." "Rock bottom has excellent parking," Bill countered. "Look, that clown they hired looks about two minutes from a nervous breakdown. We wait for him to start his balloon animal routine, then I accidentally knock over the punch bowl. Total pandemonium. You grab the gift table while everyone's slipping on fruit punch and crying." "You're suggesting we ruin some poor kid's birthday for a pile of plastic junk and gift cards to Build-A-Bear." Molly pinched the bridge of her nose. "My mother would be so proud." "Your mother once stole the collection plate at church," Bill reminded her. "This is practically respectable by comparison. Besides, we're not keeping everything. We'll donate the boring educational toys to charity. I'm not a monster." Molly looked at him for a long moment, then sighed the sigh of someone who knew she was going to regret this until her dying day. "Fine.
But if we get recognized by a bunch of six-year-olds, I'm throwing you under the bus. Literally. I'll tell them you hate ponies." Bill's face lit up like he'd just won the lottery. "That's the spirit! Okay, phase one: acquire disguises. I was thinking parent clothes. You can be the stressed mom with the wine headache, I'll be the dad who's pretending to care about this crap." "Bill." "Yeah?" "We're burning in hell for this." "Probably," he admitted cheerfully, linking his arm through hers as they headed toward the side entrance. "But at least we'll have a new Xbox. Happy early Christmas to us." Molly muttered something that sounded suspiciously like a prayer for forgiveness as they slipped through the door, the distant sounds of "Happy Birthday" already echoing from inside like a warning they had no intention of heeding.
created_at
"2026-03-12T21:20:41.125Z"
3/12/2026, 9:20:41 PM
model
"x-ai/grok-4.20-beta-20260309"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
484
moderation_latency
(null)
generation_time
3133
tokens_prompt
77
tokens_completion
731
native_tokens_prompt
182
native_tokens_completion
584
native_tokens_completion_images
0
native_tokens_reasoning
0
native_tokens_cached
64
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0037528
router
(null)
provider_responses
0
endpoint_id
"33f49f24-a0de-4dcf-872d-ca514e83fbae"
id
"9ed079e4-ee49-5de2-923b-00fb477abf8f"
is_byok
false
latency
479
model_permaslug
"x-ai/grok-4.20-beta-20260309"
provider_name
"xAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773350441-TDSe1oPp7VJDUB0vfEdT"
upstream_id
"9ed079e4-ee49-5de2-923b-00fb477abf8f"
total_cost
0.0037528
cache_discount
0.0001152
upstream_inference_cost
0
provider_name
"xAI"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 462
0.00% | Dialogue to Total Word Ratio | Ratio: 66.09%, Deviation: 36.09% | neededClean: false, wordsTotal: 466, wordsDialogue: 308
0.0000% | Overall
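The reported ratio and deviation can be reproduced from the word counts in the evaluation details. The sketch below is illustrative only: it assumes that a "word" is a whitespace-separated token, that dialogue is any text between straight double quotes, and that the deviation is the absolute distance in percentage points from the 30% target in the scenario. The function names are hypothetical, and the evaluator's actual counting and scoring rules are not shown on this page.

```python
import re

# Target taken from the scenario: "Write 500 words with 30% dialogue".
TARGET_RATIO = 0.30


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (an assumed definition of dialogue)."""
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in quoted_spans)


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a fraction of total words."""
    if words_total == 0:
        return 0.0
    return words_dialogue / words_total


def deviation_points(ratio: float, target: float = TARGET_RATIO) -> float:
    """Absolute distance from the target ratio, in percentage points."""
    return abs(ratio - target) * 100


# Counts as reported in the evaluation details: wordsDialogue and wordsTotal.
ratio = dialogue_ratio(308, 466)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation_points(ratio):.2f}%")
```

With the reported counts, 308 / 466 is about 0.6609, which matches the 66.09% ratio and the 36.09-point deviation shown above. How that deviation maps to the 0.00% score is not stated on this page, so the sketch stops at the deviation.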