NC Bench
Run: 2026-02-20-01KHY9Z94SJ0BX5SEK5E8WT1WB
Run Details
Model: Grok 4
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 36.6s
Cost: $0.015965
Provider: xAI
Evaluation summary
| Score | Evaluator |
| --- | --- |
| 81.27% | Matches word count |
| 0.00% | Dialogue to Total Word Ratio |
| 40.6364% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly paced the dimly lit motel room, a crumpled blueprint of the McAllister mansion spread across the bed like a treasure map drawn by a drunken pirate. Bill lounged on a creaky chair, munching on a stale doughnut, his eyes wide with skepticism. "Alright, Bill, this is the big one," Molly said, jabbing a finger at the diagram. "The party's tomorrow afternoon. Rich kids, spoiled brats—think of the presents! iPads, bikes, those fancy drone things. We'll slip in, grab 'em all, and vanish like ghosts at a birthday bash." Bill snorted, crumbs flying. "Ghosts? Molly, we're not exactly Casper material. I'm six-foot-five and you're allergic to subtlety. How do we even get in? Crash the clown car?" She grinned, her eyes sparkling with mischief. "Exactly! I've got us disguises. You'll be Bozo the Clown—big shoes, red nose, the works. I'll be the magician's assistant. We'll blend right in with the chaos." "Bozo? Oh, hell no," Bill groaned, tossing the doughnut aside. "Last time I wore a costume, I tripped over my own feet and face-planted into a wedding cake. Remember Aunt Gladys's reception? We were supposed to lift the silverware, not become the entertainment." Molly laughed, rolling her eyes. "That was hilarious! But this is different. The party's in the backyard—tents, games, a bouncy castle. Parents are clueless, sipping mimosas. We sneak to the gift table, load up the sacks—I've got Santa-sized ones—and bolt through the side gate." Bill leaned forward, squinting at the blueprint. "What about security? Rich folks like McAllister probably have cameras, alarms, maybe a guard dog named Killer." "Pfft, Killer's a poodle," Molly shot back. "And the cameras? We'll hack 'em with that gadget you stole last month. As for alarms, we cut the power to the tent. Boom—darkness, confusion, kids screaming. Perfect diversion." Bill rubbed his chin, a sly smile creeping in. "Screaming kids? That's our soundtrack? You're a twisted genius, Mol. But what if one of those rugrats spots us? 
'Mommy, the clown's stealing my unicorn scooter!'" "Then we improvise!" Molly exclaimed, punching his arm. "You juggle some balls, I pull a rabbit out of a hat—distract 'em. If things go south, we blame it on the piñata bandit. Everyone knows piñatas are shady." Bill chuckled, shaking his head. "Piñata bandit? You're killing me. Fine, I'm in. But if I end up with cake in my wig again, you're buying me new shoes." "Deal," Molly said, high-fiving him. "Now, let's rehearse. You be the kid: 'Hey mister, that's my present!'" Bill puffed out his cheeks in mock innocence. "Hey mister, that's my present! Gimme back my robot!" Molly mimed snatching an invisible box. "Sorry, kiddo, but this robot's got a date with eBay. Now scram before I turn you into a frog!" They both burst out laughing, collapsing onto the bed amid the blueprints. The plan was absurd, risky, and utterly ridiculous—but that's what made it fun. Tomorrow, the McAllister party would never know what hit it. Or rather, who Bozo-ed it.
created_at
"2026-02-20T19:56:03.107Z"
2/20/2026, 7:56:03 PM
model
"x-ai/grok-4-07-09"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
8841
moderation_latency
(null)
generation_time
36527
tokens_prompt
77
tokens_completion
783
native_tokens_prompt
744
native_tokens_completion
1027
native_tokens_completion_images
0
native_tokens_reasoning
304
native_tokens_cached
743
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.01596525
router
(null)
provider_responses
0
endpoint_id
"d40bef9d-6936-4c57-9749-bf7399de4a77"
id
"cf392d5d-37d9-0449-e4e1-1ccdb60ffb29"
is_byok
false
latency
217
model_permaslug
"x-ai/grok-4-07-09"
provider_name
"xAI"
status
200
api_type
"completions"
id
"gen-1771617363-JNsREUvIHEMfAUWMAEdA"
upstream_id
"cf392d5d-37d9-0449-e4e1-1ccdb60ffb29"
total_cost
0.01596525
cache_discount
0.00167175
upstream_inference_cost
0
provider_name
"xAI"
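The `total_cost` and `cache_discount` values above can be cross-checked against the native token counts. The following is a minimal sketch, not the billing logic of any provider. The per-million-token rates are assumptions that do not appear on this page; they were chosen because they reproduce the reported figures exactly.

```python
from decimal import Decimal

# Token counts copied from the run metadata above.
NATIVE_TOKENS_PROMPT = 744
NATIVE_TOKENS_CACHED = 743
NATIVE_TOKENS_COMPLETION = 1027

# ASSUMED rates in USD per million tokens (not stated on this page).
RATE_INPUT = Decimal("3.00")
RATE_CACHED_INPUT = Decimal("0.75")
RATE_OUTPUT = Decimal("15.00")
MILLION = Decimal(1_000_000)


def estimate_cost(prompt, cached, completion):
    """Return (total cost, cache discount) in USD under the assumed rates."""
    uncached = prompt - cached
    total = (
        uncached * RATE_INPUT
        + cached * RATE_CACHED_INPUT
        + completion * RATE_OUTPUT
    ) / MILLION
    # Discount: what the cached tokens would have cost extra at the full input rate.
    discount = cached * (RATE_INPUT - RATE_CACHED_INPUT) / MILLION
    return total, discount


total_cost, cache_discount = estimate_cost(
    NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_CACHED, NATIVE_TOKENS_COMPLETION
)
print(total_cost)      # compare with total_cost in the metadata
print(cache_discount)  # compare with cache_discount in the metadata
```

`Decimal` is used instead of floats so that the comparison with the eight-decimal figures in the metadata is exact.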
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 81.27% | Matches word count | n/a | neededClean: false, words: 512 |
| 0.00% | Dialogue to Total Word Ratio | Ratio: 68.16%, Deviation: 38.16% | neededClean: false, wordsTotal: 512, wordsDialogue: 349 |
| 40.6364% | Overall | | |
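The ratio and deviation reported for the dialogue evaluator follow directly from the word counts in its metadata. The sketch below recomputes them; the 30% target comes from the scenario. The final step is an inference only: the overall score looks consistent with an unweighted mean of the two evaluator scores, but this page does not state how it is aggregated, nor how a 38.16% deviation maps to a 0.00% score.

```python
# Values copied from the evaluation details above.
WORDS_TOTAL = 512
WORDS_DIALOGUE = 349
TARGET_DIALOGUE_RATIO = 0.30  # from the scenario: "30% dialogue"


def dialogue_ratio(words_dialogue, words_total):
    """Fraction of all words that fall inside dialogue."""
    return words_dialogue / words_total


ratio = dialogue_ratio(WORDS_DIALOGUE, WORDS_TOTAL)
deviation = abs(ratio - TARGET_DIALOGUE_RATIO)
summary = f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}"
print(summary)

# ASSUMPTION: overall = unweighted mean of the displayed evaluator scores.
# The displayed 81.27% is rounded, so this only approximates 40.6364%.
evaluator_scores = [81.27, 0.00]
overall = sum(evaluator_scores) / len(evaluator_scores)
print(f"Overall (approx.): {overall:.2f}%")
```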