NC Bench
Run: 2026-03-12-01KKHZK662XK0R3T3T8VD073GA
Run Details
Model: Grok 4.20 (Beta)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 4.0s
Cost: $0.004004
Provider: xAI
Evaluation summary
| Score | Evaluator | Details Link |
| --- | --- | --- |
| 2.01% | Matches word count | (details) |
| 0.00% | Dialogue to Total Word Ratio | (details) |
| 1.0058% | Overall | |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her oversized sunglasses and peered over the hedge at the bounce house where a dozen sugar-crazed kids screamed like tiny banshees. “This is the dumbest idea you’ve ever had, Bill. And that’s saying something after the Great Supermarket Cart Heist of 2022.”
Bill crouched beside her in a ridiculous flower-print shirt two sizes too small, clutching a clipboard like a man who actually had a plan. “Look, the birthday girl’s getting a mountain of presents, Mol. We go in as the ‘surprise entertainers,’ swap the real gifts with the decoy bags, and we’re out before the cake hits the table. Easy.”
“Easy?” Molly hissed, whacking him with a plastic trumpet. “We’re two grown adults crashing a children’s party. There will be face painting. There will be a piñata. There will be judgment in the eyes of soccer moms, Bill.”
He grinned, utterly unbothered. “That’s why we brought the fake mustaches. I’m Professor Bouncy, world-famous balloon artist. You’re my lovely assistant, Madame Sparkles.”
“I hate you so much right now.”
“Focus. We enter at 3:15, right after the magician leaves. Kids’ll be distracted by sugar and disappointment. I distract the parents with my award-winning balloon poodle routine—”
“Which is just a wobbly sausage dog with anxiety, let’s be honest.”
“—while you slip the real presents into the black Hefty bags disguised as trash. Then we walk out the side gate like we’re taking out the recycling. Genius.”
Molly rubbed her temples. “What if the birthday girl cries? I can’t steal from a crying child, Bill. I have standards. Low ones, but they exist.”
“We leave her one present. The cheap one. The kazoo. She’ll never know the difference.”
“You’re going to hell.”
“Only if they serve lukewarm coffee. I have a thermos.”
A high-pitched shriek erupted from the yard as a clown appeared with terrifying face paint. Molly shuddered. “That clown looks like he’s seen war crimes. We’re going to blend right in.”
“Exactly.” Bill checked his novelty watch shaped like a rubber duck. “Ten minutes. You ready, Madame Sparkles?”
“I’m ready to reconsider my life choices.”
Bill patted her shoulder. “Think of it this way: every spoiled rich kid who opens an empty box tonight will learn a valuable lesson about materialism.”
“Or they’ll learn that two idiots in mustaches ruined their birthday.”
“Same thing.” He stood, cracking his knuckles. “Let’s go steal some joy, partner.”
Molly sighed, pulling on her glittery bowtie. “If we get caught, I’m telling the cops you kidnapped me.”
“Fair enough.” Bill offered her a fist bump. “For the greater bad.”
She bumped it reluctantly. “For the greater bad.”
Together they stepped through the hedge, fake smiles plastered on, ready to ruin the most wholesome event in suburbia. The bounce house squeaked ominously as they approached, like it knew exactly what kind of terrible adults had just arrived.
created_at: "2026-03-12T21:35:11.559Z" (3/12/2026, 9:35:11 PM)
model: "x-ai/grok-4.20-beta-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 347
moderation_latency: (null)
generation_time: 3830
tokens_prompt: 77
tokens_completion: 812
native_tokens_prompt: 182
native_tokens_completion: 645
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 128
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0040036
router: (null)
provider_responses:
  0:
    endpoint_id: "33f49f24-a0de-4dcf-872d-ca514e83fbae"
    id: "6f6e3a38-3dbf-702d-5679-78fb931748a4"
    is_byok: false
    latency: 336
    model_permaslug: "x-ai/grok-4.20-beta-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773351311-VWcIUfZP2zDhXaPOex0g"
upstream_id: "6f6e3a38-3dbf-702d-5679-78fb931748a4"
total_cost: 0.0040036
cache_discount: 0.0002304
upstream_inference_cost: 0
provider_name: "xAI"
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 2.01% | Matches word count | n/a | neededClean: false, words: 475 |
| 0.00% | Dialogue to Total Word Ratio | Ratio: 67.08%, Deviation: 37.08% | neededClean: false, wordsTotal: 480, wordsDialogue: 322 |
| 1.0058% | Overall | | |
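The ratio and deviation reported for the "Dialogue to Total Word Ratio" evaluator can be reproduced from the listed counts (wordsDialogue 322, wordsTotal 480) and the 30% target in the scenario. The sketch below is illustrative only: the evaluator's actual implementation is not shown on this page, the function names are hypothetical, and treating text inside curly double quotes as dialogue is an assumption about how the words might be counted.

```python
import re

TARGET_RATIO = 0.30  # the scenario asks for 30% dialogue


def ratio_and_deviation(words_dialogue, words_total, target=TARGET_RATIO):
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


def count_dialogue_words(text):
    """Return (dialogue_words, total_words) for a passage.

    Assumption: dialogue is whatever sits between curly double quotes,
    and words are whitespace-separated tokens. The real evaluator may
    tokenize differently.
    """
    total = len(text.split())
    spoken = re.findall(r"“([^”]*)”", text)
    dialogue = sum(len(segment.split()) for segment in spoken)
    return dialogue, total


# Counts taken from the Meta Data column above.
ratio, deviation = ratio_and_deviation(322, 480)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

With the reported counts, 322 / 480 is about 0.6708, which is 37.08 percentage points above the 30% target and matches the Details column. How that deviation is mapped to the 0.00% score is not stated on the page, so the sketch stops at the deviation.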