NC Bench
Run: 2026-05-01-01KQHSXCV58BA8ZJADXW0G6J5S
Run Details
Model: Grok 4.20
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 10.9s
Cost: $0.001658
Provider: xAI
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
0.0000% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly clutched a half-eaten balloon animal like it was a map to El Dorado. “Bill, I’m telling you, the clown is the weak link. One good squirt of seltzer in his pants and he’s done. We grab the gift table in the chaos.” Bill peeked through the hedge at the bouncy castle currently throbbing with sugar-crazed six-year-olds. “Mol, we’re not waterboarding Ronald McDonald at a toddler’s birthday party. That’s how you get on a list. We go in as the replacement entertainers. I’m Professor Balloonbeard, world-famous pirate of parties. You’re my lovely assistant, Mistress Snickerdoodle.” “Snickerdoodle is a cookie, you walnut. I’m not being named after baked goods.” Molly adjusted her fake mustache, which immediately slid into her mouth. “Fine. But I’m doing the sleight-of-hand magic. Last time you palmed a deck of cards you dropped them into the punch bowl and started a mosh pit of angry moms.” “That was character-building,” Bill muttered. “Look, we roll up at three-fifteen, right after cake but before the sugar crash. You distract with card tricks. I liberate the presents. There’s a piñata shaped like a dinosaur. We use that as the getaway vehicle.” Molly stared at him. “You want to ride a cardboard T-Rex down Maple Street with thirty wrapped boxes duct-taped to its back?” “Roller skates underneath. I’ve thought this through.” “You’ve thought this through the way a raccoon thinks through garbage day. What about the parents? They’re not exactly blind.” “Parents at these things are zombies. One sip of lukewarm coffee and they enter a catatonic state. We’ll be ghosts. Stealthy, hilarious ghosts.” Molly sighed, rubbing her temples. “Bill, this is a new low. We used to steal diamonds from billionaires. Now we’re plotting to swipe a plastic tea set and whatever hideous Roblox merchandise the birthday boy got. I feel like I should be wearing a tiny hat and a sign that says ‘Will Steal for Juice Boxes.’” “Hey, that Roblox crap sells for bank on the dark web. 
Plus there’s a limited-edition Lightsaber bubble wand in there. I saw the registry. We fence that bad boy and we eat steak for a month.” “Or we could, I don’t know, get real jobs.” “Real jobs are for people without imagination. Now shut up and help me rehearse my entrance. Arr matey, I be here to swab yer deck—with fun!” Molly watched him strike a ridiculous pirate pose, plastic sword already in hand. Against her better judgment, a grin cracked through the mustache. “If we get caught, I’m selling you out faster than a five-year-old sells out his friends for extra cake.” “Fair,” Bill said, offering her a high-five. “Snickerdoodle.” “I will murder you in your sleep.” “Love you too. Now let’s go ruin a childhood so we can afford rent.”
created_at: "2026-05-01T12:58:41.897Z" (5/1/2026, 12:58:41 PM)
model: "x-ai/grok-4.20-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 402
moderation_latency: (null)
generation_time: 10830
tokens_prompt: 77
tokens_completion: 752
native_tokens_prompt: 182
native_tokens_completion: 626
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 128
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0016581
router: (null)
provider_responses:
  0:
    endpoint_id: "3ecee37f-b217-4093-87fb-aaf0afe307af"
    id: "31dc00ca-9837-9da4-a541-ea9151c34fac"
    is_byok: false
    latency: 116
    model_permaslug: "x-ai/grok-4.20-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777640321-ZSSJTJl6hBD6c6qKjqsE"
session_id: (null)
api_type: "completions"
id: "gen-1777640321-YTZI9yMGUfuBzjlJXaON"
upstream_id: "31dc00ca-9837-9da4-a541-ea9151c34fac"
total_cost: 0.0016581
cache_discount: 0.0001344
upstream_inference_cost: 0
provider_name: "xAI"
response_cache_source_id: (null)
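The timing and token fields above can be combined into a rough throughput figure. The sketch below is illustrative only: it assumes that generation_time is reported in milliseconds and that native_tokens_completion is the provider's own count of generated tokens. Neither unit is stated on this page, and the function name is hypothetical.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Rough decode throughput: generated tokens divided by generation time.

    Assumes generation_time_ms is in milliseconds (not confirmed by the page).
    """
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from the metadata of this run.
native_tokens_completion = 626
generation_time = 10830

rate = tokens_per_second(native_tokens_completion, generation_time)
print(f"{rate:.1f} tokens/s")
```

Under those assumptions the run works out to a little under 58 tokens per second. Using tokens_completion (752) instead would give a different figure, so the choice of token count should be stated whenever such a number is quoted.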
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 462
0.00% | Dialogue to Total Word Ratio | Ratio: 82.35%, Deviation: 32.35% | neededClean: false, wordsTotal: 476, wordsDialogue: 392
0.0000%
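The ratio and deviation reported for the dialogue evaluator follow directly from the two word counts in its metadata. The sketch below shows that arithmetic and one plausible way such counts could be gathered. It is an assumption-laden illustration, not the benchmark's actual evaluator: the function names are hypothetical, dialogue is taken to be any text between curly double quotes, words are split on whitespace, and the deviation is taken as the absolute difference from the 50% target in the scenario. How the deviation is mapped to the 0.00% score is not shown on this page and is not modelled here.

```python
import re

TARGET_RATIO = 0.50  # from the scenario: "Write 500 words with 50% dialogue"


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def dialogue_word_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue words, total words, ratio).

    Assumption: dialogue is whatever sits between curly double quotes.
    """
    spoken = re.findall(r"“([^”]*)”", text)
    words_dialogue = sum(count_words(s) for s in spoken)
    words_total = count_words(text)
    ratio = words_dialogue / words_total if words_total else 0.0
    return words_dialogue, words_total, ratio


def deviation_from_target(ratio: float, target: float = TARGET_RATIO) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Reproduce the reported figures from the counts in the evaluator metadata.
reported_ratio = 392 / 476  # wordsDialogue / wordsTotal
print(f"Ratio: {reported_ratio:.2%}")                             # Ratio: 82.35%
print(f"Deviation: {deviation_from_target(reported_ratio):.2%}")  # Deviation: 32.35%
```

A different tokenizer or quote convention would give different counts. The two evaluators on this page already disagree on the total (462 words versus 476), so the sketch should not be expected to reproduce either number exactly when run on the response text.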