NC Bench
Run: 2026-03-12-01KKHZ312XBGT7EY2RZ74XFTK2
Run Details
Model: Grok 4.20 (Beta)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 4.6s
Cost: $0.004178
Provider: xAI
Evaluation summary
Score      Evaluator
2.01%      Matches word count
0.00%      Dialogue to Total Word Ratio
1.0058%    Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her ridiculous curly wig in the rearview mirror of the beat-up van, peering at Bill who was trying to wedge himself into a too-small clown suit. "This is the dumbest idea you've ever had, and you've had some winners, Bill. Breaking into a children's party? We're not stealing candy from babies, we're stealing from the actual babies' birthday party." Bill grunted, yanking the polka-dot fabric over his belly. "It's genius, Mol. Think about it. Twenty screaming kids, one distracted mom, and a mountain of brightly wrapped presents just sitting there like idiots. No security, no alarms, just juice boxes and regret. We go in as entertainment, cause a little chaos, and walk out with enough toys to fence for six months." "You're suggesting we ruin a six-year-old's birthday," Molly said, poking him in the chest. "I steal from banks, not from Timmy's Hot Wheels collection. What's next, we mug the tooth fairy?" "Timmy's getting a Nintendo Switch in that pile, I saw the gift table through the window. You know how much those go for on the street? And don't give me that moral high ground crap. Last month you swiped the donation box from a puppy shelter." "That was different," Molly snapped. "Those puppies had suspicious eyes. Besides, I donated half the money to an actual animal rescue. What are you going to do with a bunch of plastic dinosaurs and a Frozen doll? Start a black market for kindergarteners?" Bill zipped up the clown suit, the seams screaming in protest. "We create a diversion. I'll juggle some balls, maybe accidentally set off the piñata early. You slip in as the face-painting lady, start drawing moustaches on all the kids while I bag the goods. Simple. Elegant. Criminal." "Elegant? You look like a circus exploded on a garbage truck. And what if one of the kids recognizes us? 'Mommy, that clown looks like the man from America's Most Wanted!'" "Then we give them extra candy and tell them it's magic disappearing powder. Kids are gullible. 
Remember when we stole that diamond from the museum by pretending it was part of the interactive exhibit? This is the same principle, just smaller and with more snot." Molly sighed, checking her watch. "The party's starting in twenty minutes. If we're doing this, we do it right. No clown noses on during the actual heist, you'll leave DNA everywhere. And I'm not singing Happy Birthday. I have standards." "Standards," Bill chuckled, adjusting his red rubber nose. "Said the woman who once hid in a laundry cart for three hours to steal a CEO's underwear." "That was for blackmail purposes, completely different. Fine. We hit the party, grab the presents, and if anyone asks, we're from 'Magical Moments Entertainment' and the real clowns are stuck in traffic. But if we get recognized by a bunch of sticky-fingered six-year-olds, I'm throwing you under the bus. Or the bouncy castle. Whichever comes first." Bill grinned, offering her a high-five that she reluctantly returned. "That's my girl. Now let's go ruin some childhoods and make some serious cash." "God, I hate you sometimes." "You love me. Pass the balloon animals."
created_at
"2026-03-12T21:26:22.05Z"
3/12/2026, 9:26:22 PM
model
"x-ai/grok-4.20-beta-20260309"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
338
moderation_latency
(null)
generation_time
4399
tokens_prompt
77
tokens_completion
802
native_tokens_prompt
182
native_tokens_completion
674
native_tokens_completion_images
0
native_tokens_reasoning
0
native_tokens_cached
128
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0041776
router
(null)
provider_responses
0
endpoint_id
"33f49f24-a0de-4dcf-872d-ca514e83fbae"
id
"edfb9436-0ffc-b091-f5a0-e8c4b886b761"
is_byok
false
latency
299
model_permaslug
"x-ai/grok-4.20-beta-20260309"
provider_name
"xAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773350782-mbd5Qo6Quugh9dOr3yzy"
upstream_id
"edfb9436-0ffc-b091-f5a0-e8c4b886b761"
total_cost
0.0041776
cache_discount
0.0002304
upstream_inference_cost
0
provider_name
"xAI"
Evaluation details
Result     Evaluator                       Details                             Meta Data
2.01%      Matches word count              n/a                                 neededClean: false; words: 525
0.00%      Dialogue to Total Word Ratio    Ratio: 84.49%, Deviation: 34.49%    neededClean: false; wordsTotal: 535; wordsDialogue: 452
1.0058%    Overall
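The reported ratio and deviation for the dialogue evaluator are consistent with the counts in its metadata (452 dialogue words out of 535 total, against a 50% target). The sketch below shows one way such a check could be computed. It is not NC Bench's actual evaluator: the function names `dialogue_ratio` and `deviation_from_target`, the whitespace-based word split, and the rule that dialogue is any text between straight double quotes are all assumptions made for illustration.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a passage.

    Assumption: words are whitespace-separated tokens, and dialogue is
    whatever sits between pairs of straight double quotes.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation_from_target(ratio: float, target: float = 0.5) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Recompute the figures from the counts reported in the metadata.
reported_ratio = 452 / 535
print(f"Ratio: {reported_ratio:.2%}")                             # Ratio: 84.49%
print(f"Deviation: {deviation_from_target(reported_ratio):.2%}")  # Deviation: 34.49%
```

Under these assumptions, a deviation of roughly 34 percentage points from the 50% target lines up with the 0.00% score for this evaluator, although the scoring curve itself is not shown on this page.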