NC Bench
Run: 2026-03-12-01KKHYSY42GWKT5JWXX1NDCTEM
Run Details
Model: Grok 4.20 (Beta)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 4.1s
Cost: $0.004310
Provider: xAI
Evaluation summary
Matches word count: 43.38%
Dialogue to Total Word Ratio: 0.00%
Overall: 21.6892%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the hedge, peering at the colorful balloons bobbing above the backyard fence. “This is the dumbest idea you’ve ever had, Bill. We’re robbing a children’s party. There’s going to be clowns.” Bill adjusted his ski mask, which had a bright red pom-pom on the nose. “Exactly. Clowns are terrifying. Nobody’ll look twice at two more weirdos in costumes. We blend right in.” “We look like rejects from a bank heist,” Molly hissed. “Kids are going to scream. Parents are going to scream. Then the cops scream. I hate screaming.” “Relax. We go in during cake time. All the little monsters gather around the table, eyes bigger than their heads. We snag the gift table while they’re singing Happy Birthday. Easy peasy, lemon squeezy.” Molly stared at him. “Did you just say easy peasy, lemon squeezy? You’re forty-three years old.” “Forty-two and three-quarters, thank you. And statistically kids are distracted by sugar. I read it in a book.” “You read a book? Since when?” “Since I stole it from the library. Point is, we wait for ‘Happy Birthday,’ you create a diversion with the piñata, I stuff the presents in the duffel. Then we vanish like classy ninja turtles.” Molly pinched the bridge of her nose. “A diversion with the piñata. You want me to assault a children’s toy in front of thirty screaming six-year-olds?” “Gently assault. Like a polite shove. Maybe whisper ‘sorry, little buddy’ as you swing.” “I’m not whispering to a piñata, Bill.” “Fine, then you do the grabbing and I’ll do the polite shoving. But I get to keep the giant teddy bear. I’ve always wanted one.” “You can’t even fit in your apartment. Where are you going to put a six-foot teddy bear?” “Next to the stolen microwave. They’ll be roommates. It’ll be cute.” Molly sighed so deeply her fake mustache fluttered. “We’re going to prison for this. I can already see the headline: ‘Middle-Aged Morons Mug Moppets.’ My mother will disown me.” “Your mother disowned you when you stole her garden gnome, remember? 
This is just escalation.” “That gnome was looking at me funny.” Bill grinned under his ridiculous mask. “See? You’ve got prior experience with whimsical theft. This is your destiny.” “My destiny was supposed to be a nice quiet life of stealing from rich jerks who deserve it, not destroying a five-year-old’s hopes and dreams.” “Think of it as borrowing happiness. We return the joy next week. Slightly used.” “You’re an idiot.” “An idiot with a plan. Cake starts in four minutes. You in or out, partner?” Molly looked at the grinning cartoon characters on the fence, then at Bill’s ridiculous pom-pom nose. She groaned. “Fine. But if a kid cries, I’m blaming you. And I get the talking robot dog. That thing’s worth at least two hundred bucks on eBay.” Bill offered her a fist bump. “That’s my girl. Let’s go ruin some birthdays.” They slipped through the gate just as the first notes of Happy Birthday drifted across the lawn, two thieves in mismatched costumes ready to commit the most pathetic crime of their careers.
created_at: "2026-03-12T21:21:24.103Z" (3/12/2026, 9:21:24 PM)
model: "x-ai/grok-4.20-beta-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 298
moderation_latency: (null)
generation_time: 3675
tokens_prompt: 77
tokens_completion: 861
native_tokens_prompt: 182
native_tokens_completion: 696
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 128
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0043096
router: (null)
provider_responses:
  0:
    endpoint_id: "33f49f24-a0de-4dcf-872d-ca514e83fbae"
    id: "6b132321-1c6b-b768-c7f8-08f6733be840"
    is_byok: false
    latency: 254
    model_permaslug: "x-ai/grok-4.20-beta-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773350484-pmA1PWBCXQdtvz9LxCUv"
upstream_id: "6b132321-1c6b-b768-c7f8-08f6733be840"
total_cost: 0.0043096
cache_discount: 0.0002304
upstream_inference_cost: 0
provider_name: "xAI"
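The cost and timing fields in the metadata above can be cross-checked with a little arithmetic. The sketch below is only a consistency check under stated assumptions: the per-token rates are not given anywhere on this page and are hypothetical values chosen because they happen to reproduce total_cost and cache_discount, and generation_time is assumed to be in milliseconds. It should not be read as the provider's actual pricing or billing logic.

```python
# Values copied from the run metadata above.
NATIVE_TOKENS_PROMPT = 182
NATIVE_TOKENS_COMPLETION = 696
NATIVE_TOKENS_CACHED = 128
GENERATION_TIME_MS = 3675  # ASSUMPTION: the unit is milliseconds

# ASSUMED rates in USD per token. These are NOT stated in the source;
# they are hypothetical values that reproduce the reported figures.
PROMPT_RATE = 2.00 / 1_000_000
COMPLETION_RATE = 6.00 / 1_000_000
CACHED_RATE = 0.20 / 1_000_000


def cache_discount(cached_tokens: int) -> float:
    """Saving from billing cached prompt tokens at the cached rate."""
    return cached_tokens * (PROMPT_RATE - CACHED_RATE)


def total_cost(prompt_tokens: int, completion_tokens: int, cached_tokens: int) -> float:
    """Undiscounted token cost minus the cache discount."""
    undiscounted = prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE
    return undiscounted - cache_discount(cached_tokens)


def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Completion throughput over the generation window."""
    return completion_tokens / (generation_time_ms / 1000)


discount = cache_discount(NATIVE_TOKENS_CACHED)
cost = total_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION, NATIVE_TOKENS_CACHED)
throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)

print(f"cache_discount: {discount:.7f}")
print(f"total_cost:     {cost:.7f}")
print(f"throughput:     {throughput:.1f} tokens/s")
```

If the assumed rates are right, the computed values should line up with the reported cache_discount (0.0002304) and total_cost (0.0043096). A mismatch would simply mean the assumed rates or units are wrong.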
Evaluation details
Matches word count
Result: 43.38%
Details: n/a
Meta Data: neededClean: false, words: 517
Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 78.33%, Deviation: 48.33%
Meta Data: neededClean: false, wordsTotal: 526, wordsDialogue: 412
Overall: 21.6892%
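The ratio, deviation, and overall figures in the evaluation details above can be reproduced from the reported word counts. The following is a minimal sketch, not the benchmark's actual evaluator: the helper names are hypothetical, the dialogue counter assumes dialogue is whatever sits between curly double quotes, the deviation is assumed to be the absolute difference from the 30% target, and the overall score is assumed to be the plain mean of the two evaluator scores. How a deviation is converted into a score (0.00% here) is not stated on this page and is not modeled.

```python
import re
from statistics import mean

TARGET_RATIO = 0.30  # from the scenario: 30% dialogue


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (ASSUMED counting rule)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside curly double quotes (ASSUMED definition of dialogue)."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_RATIO) -> tuple[float, float]:
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Illustration of the counting helpers on a short excerpt of the response.
sample = "Molly sighed. “You’re an idiot.” Bill grinned. “An idiot with a plan.”"
sample_dialogue = count_dialogue_words(sample)
sample_total = count_words(sample)

# Arithmetic on the counts reported in the evaluation details.
ratio, deviation = ratio_and_deviation(412, 526)
overall = mean([43.38, 0.00])  # ASSUMPTION: overall is the mean of both scores

print(f"sample: {sample_dialogue} of {sample_total} words are dialogue")
print(f"ratio: {ratio:.2%}, deviation: {deviation:.2%}")
print(f"overall: {overall:.2f}%")
```

The two evaluators report different totals for the same response (517 and 526 words), which suggests they count words differently, so a simple whitespace split like the one here may not match either figure exactly on the full text.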