NC Bench
Run: 2026-05-01-01KQHS174938BCY54NN4DY6253
Run Details
Model: Grok 4.20
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 11.5s
Cost: $0.001898
Provider: xAI
Evaluation summary
Score      Evaluator
51.93%     Matches word count
0.00%      Dialogue to Total Word Ratio
25.9628%   Overall
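The Overall figure looks consistent with an unweighted mean of the two evaluator scores. That is an inference from the numbers on this page, not documented NC Bench behavior, and the helper name `overall_score` is hypothetical. A minimal sketch of the check:

```python
def overall_score(scores):
    """Unweighted mean of per-evaluator scores, in percent.

    Assumption: NC Bench averages evaluator scores with equal weight.
    The page does not state the aggregation rule.
    """
    if not scores:
        raise ValueError("at least one evaluator score is required")
    return sum(scores) / len(scores)


# Scores as displayed on the page (already rounded to two decimals).
displayed_scores = [51.93, 0.00]
estimate = overall_score(displayed_scores)

# The estimate lands within rounding distance of the reported 25.9628%.
print(f"estimated overall: {estimate:.4f}%")
```

The small gap between the estimate and the reported 25.9628% would be explained if the word-count score is stored with more precision than the two decimals shown.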
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her fake clown nose and glared across the greasy diner table. “Bill, if you say ‘it’ll be a piece of cake’ one more time I’m shoving this wig down your throat. We’re robbing a children’s birthday party, not the First National Bank.” Bill grinned around a mouthful of pancake. “Relax, Mol. Kids are easy. They’re distracted by sugar and a guy in a dinosaur costume. We slip in during musical chairs, grab the gift table, and waddle out before the piñata drops.” “Waddle? You’re wearing the giant duck suit, remember? I’m the depressed mime. I already practiced my silent crying in the mirror. It’s Oscar-worthy.” “Fine, fine. Duck it is. But we need a signal if the parents get suspicious. How about I quack twice?” Molly pinched the bridge of her nose. “You quack at all and I’m leaving you handcuffed to the bounce house. Look, the birthday girl is turning seven. Her name’s Madison. She’s obsessed with unicorns. We tell the mom we’re from ‘Magical Moments Entertainment’ and that the real performers are stuck in traffic. Classic.” Bill slurped his coffee. “What if Madison wants us to do magic tricks?” “You do the magic. I’ll stand behind you making it rain confetti and regret. Just don’t pull a rabbit out of your duck tail like last time. That thing was traumatized.” “Hey, the rabbit got a modeling gig after that. Positive PR. Alright, timeline: we hit the party at two-fifteen. Cake’s been served, sugar rush is peaking, adults are half-dead on grocery-store wine. I create a diversion by ‘accidentally’ releasing two dozen balloons into the ceiling fan.” “Balloons into a ceiling fan? Bill, that’s how we get sued.” “Exactly. Chaos is our friend. While they’re chasing squeaky latex carnage, you wheel the gift table out the back door pretending it’s part of a magic act. I follow once I’ve ditched the duck head. We meet at the van behind the bouncy castle.” Molly tapped her fork against her plate. 
“And the getaway vehicle is definitely not the one with the giant ‘Clown Car Rental’ sticker this time, right? I still have nightmares about that soccer mom chasing us down the highway.” “Completely unmarked. I borrowed my cousin’s plumbing van. Smells like regret and drain cleaner, but it’s clean. Mostly.” She sighed. “We split the haul sixty-forty because I’m the one who has to look heartbroken when the mime makeup runs. Also, if we accidentally steal the wrong unicorn plushie and some kid starts crying, I’m giving it back. I have limits.” Bill leaned forward, eyes sparkling. “Molly, we’re talking iPads, designer sneakers, and whatever rich parents wrap in two-hundred-dollar paper these days. This isn’t stealing. This is… redistribution of birthday wealth.” “Call it whatever helps you sleep, duck-boy. Just remember: if we get caught, I’m selling you out faster than you can say ‘honest mistake.’ I’ll tell the cops you kidnapped me and forced me into grease paint.” He laughed. “Fair. Ready to go ruin a seven-year-old’s party?” Molly slid her sunglasses on and gave a grim nod. “Let’s make some terrible memories.”
created_at: "2026-05-01T12:43:18.543Z" (5/1/2026, 12:43:18 PM)
model: "x-ai/grok-4.20-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 344
moderation_latency: (null)
generation_time: 11363
tokens_prompt: 77
tokens_completion: 849
native_tokens_prompt: 182
native_tokens_completion: 695
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 64
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0018978
router: (null)
provider_responses:
  0:
    endpoint_id: "3ecee37f-b217-4093-87fb-aaf0afe307af"
    id: "711d3072-fcf9-a869-d3db-fb7029a71914"
    is_byok: false
    latency: 63
    model_permaslug: "x-ai/grok-4.20-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777639398-sAB9NtA1VFr0jqA86Ee3"
session_id: (null)
api_type: "completions"
id: "gen-1777639398-xbEBxrKObTi1QbhHIul0"
upstream_id: "711d3072-fcf9-a869-d3db-fb7029a71914"
total_cost: 0.0018978
cache_discount: 0.0000672
upstream_inference_cost: 0
provider_name: "xAI"
response_cache_source_id: (null)
Evaluation details
Result: 51.93%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false; words: 516

Result: 0.00%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 89.16%, Deviation: 19.16%
Meta Data: neededClean: false; wordsTotal: 526; wordsDialogue: 469

Overall: 25.9628%
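The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator can be reproduced from its metadata (wordsDialogue 469, wordsTotal 526) and the 70% target in the scenario. The sketch below assumes the deviation is the absolute difference in percentage points between the measured ratio and the target. The function names are hypothetical, and the page does not show how the deviation is mapped to the 0.00% score, so that step is left out.

```python
def dialogue_ratio(words_dialogue, words_total):
    """Share of dialogue words in the whole text, in percent."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct, target_pct):
    """Absolute gap to the target, in percentage points (assumed definition)."""
    return abs(ratio_pct - target_pct)


# Values taken from the evaluator metadata and the scenario description.
ratio = dialogue_ratio(words_dialogue=469, words_total=526)
deviation = deviation_from_target(ratio, target_pct=70.0)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 89.16%, Deviation: 19.16%"
```

The response overshot the requested dialogue share by about 19 percentage points, which matches the Details field for this evaluator.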