NC Bench
Run: 2026-05-01-01KQHRAPP3KCXSHKFBYC2WDX81
Run Details
Model: Grok 4.20
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 11.1s
Cost: $0.001896
Provider: xAI
Evaluation summary
Score | Evaluator | Details Link
0.21% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
0.1070% | Overall |
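The page does not say how the overall score is aggregated. The displayed figures are consistent with an unweighted mean of the two evaluator scores, and the sketch below checks that arithmetic under that assumption only. The function names are hypothetical and not part of NC Bench.

```python
# Assumption: the overall score is the unweighted mean of the evaluator
# scores. The page does not confirm this; it is only a consistency check.

def overall_score(scores):
    """Unweighted mean of evaluator scores (all in percent)."""
    return sum(scores) / len(scores)


def implied_score(overall, other_scores):
    """Solve the mean for the one score that is not listed in other_scores."""
    n = len(other_scores) + 1
    return overall * n - sum(other_scores)


# Scores as displayed (rounded to two decimals), in percent.
print(overall_score([0.21, 0.00]))

# Working backwards from the displayed overall of 0.1070%: the unrounded
# word count score this assumption would imply.
print(implied_score(0.1070, [0.00]))
```

Under this assumption the rounded scores give a mean of about 0.105%, and the displayed overall of 0.1070% would imply an unrounded word count score of about 0.214%, which rounds to the displayed 0.21%.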
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her oversized sunglasses and leaned across the sticky diner booth, her voice a conspiratorial whisper. “Okay, Bill, listen. We hit the party at exactly three-fifteen, right when the clown starts his balloon-animal routine. Those kids will be hypnotized. That’s our window.” Bill slurped his coffee, nearly choking. “A clown? Mol, I’m not fighting a clown. Last time I saw one of those painted freaks I had nightmares for a week. Can’t we just wait till they do cake?” “No cake,” Molly said, jabbing a fry in his direction. “The cake is after presents. We need the gifts before they get opened and suddenly every kid on the block owns a limited-edition Princess Sparkle Wand. You know how fast those things resell on the dark web?” Bill rubbed his temples. “I still don’t understand why we’re robbing a kid’s birthday party. There’s a poker game in the warehouse district. Grown men. Cash. No face paint.” “Because,” Molly hissed, “this is the mayor’s kid. Every parent in the city will send the most expensive, ridiculous presents imaginable. We’re talking drone racers, diamond-encrusted building blocks, a working miniature Ferrari. One haul and we retire to Boca.” “Retire to Boca,” Bill echoed doubtfully. “With what, Molly? Tiny tuxedos and a lifetime supply of juice boxes?” She rolled her eyes. “We fence the toys, genius. There’s an underground market for this stuff. Rich collectors, overseas oligarchs who want their own kids to have the latest garbage. Trust me, I did the research.” Bill leaned in, voice dropping. “Fine. What’s the plan? We can’t exactly stroll in wearing ski masks. Those kids would rat us out before we reached the goodie bags.” Molly grinned like a fox. “Simple. You’re the new hired magician. I’m your lovely assistant, Trixie the Magnificent. We roll up in the van painted like a circus wagon. 
While you distract them with card tricks that inevitably go wrong, I slip into the gift room, bag everything, and we vanish during the piñata stampede.” “Piñata stampede,” Bill repeated, already exhausted. “You’ve actually thought this through.” “Every detail. I even bought you a top hat with a secret compartment for the silverware we’re definitely not stealing because that would be wrong.” Bill stared at her. “We’re stealing children’s toys but silverware is where you draw the moral line?” “Priorities, Bill. Now, about the clown. If he gets in our way, you hit him with the squirting flower. I’ll handle the parents. They’ll all be half-drunk on rosé and trying to outdo each other with their organic gluten-free contributions. Easy pickings.” Bill sighed, pushing his plate away. “If we get caught, I’m telling the cops you kidnapped me.” Molly smirked, dropping a twenty on the table. “Relax. Worst case scenario, we end up on the evening news as the world’s most pathetic criminals. Best case? We’re sipping mai tais while some hedge-fund brat cries over his missing robot dinosaur. Now finish your coffee. We’ve got costumes to steal and a very important toddler to disappoint.” Bill stood, muttering, “I miss when we robbed banks.” Molly looped her arm through his. “Banks don’t have bouncy castles, sweetheart. Let’s go ruin a childhood.”
created_at: "2026-05-01T12:31:00.808Z" (5/1/2026, 12:31:00 PM)
model: "x-ai/grok-4.20-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 333
moderation_latency: (null)
generation_time: 11016
tokens_prompt: 77
tokens_completion: 881
native_tokens_prompt: 182
native_tokens_completion: 721
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 128
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0018956
router: (null)
provider_responses:
  0:
    endpoint_id: "3ecee37f-b217-4093-87fb-aaf0afe307af"
    id: "db16445a-c06f-9ff8-873d-1bb5ed92041b"
    is_byok: false
    latency: 67
    model_permaslug: "x-ai/grok-4.20-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777638660-eKgSoFj0VqEqTTPTGEFK"
session_id: (null)
api_type: "completions"
id: "gen-1777638660-dZ29lkJhvjQjmknsftVE"
upstream_id: "db16445a-c06f-9ff8-873d-1bb5ed92041b"
total_cost: 0.0018956
cache_discount: 0.0001344
upstream_inference_cost: 0
provider_name: "xAI"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.21% | Matches word count | n/a | neededClean: false; words: 528
0.00% | Dialogue to Total Word Ratio | Ratio: 84.11%, Deviation: 34.11% | neededClean: false; wordsTotal: 535; wordsDialogue: 450
0.1070% | Overall | |
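The ratio and deviation reported for the dialogue evaluator follow directly from wordsDialogue and wordsTotal, measured against the 50% target in the scenario. The sketch below reproduces that arithmetic. The word-counting helpers are assumptions for illustration (whitespace-split words, dialogue taken as text inside curly quotes); the evaluator's actual tokenization is not shown on the page, and all function names are hypothetical.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated words (assumed counting rule)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside curly double quotes (assumed dialogue rule)."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(len(span.split()) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return the dialogue share and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Values taken from the evaluation metadata of this run.
ratio, deviation = ratio_and_deviation(words_dialogue=450, words_total=535)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 84.11%, Deviation: 34.11%"
```

The response is about 84% dialogue against a 50% target, which is in line with the 0.00% score this evaluator assigned.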