NC Bench
Run: 2026-02-20-01KHYAWBS5PCQF580CGA6EE1S8
Run Details
Model: Grok 4
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 33.7s
Cost: $0.013708
Provider: xAI
Evaluation summary
Score      Evaluator
86.38%     Matches word count
0.00%      Dialogue to Total Word Ratio
43.1902%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly paced the dingy motel room, her wild red curls bouncing like over-caffeinated squirrels. Bill lounged on the bed, picking at a half-eaten pizza, his lanky frame sprawled out like he'd melted into the comforter. "Alright, Bill, listen up," Molly said, slamming a crumpled blueprint onto the table. "This kids' party at the McMansion tomorrow is our ticket to retirement. Presents stacked higher than my ex's lies. We're talking bikes, toys, maybe even that new gaming console everyone's raving about." Bill sat up, wiping grease from his chin. "Retirement? Molly, we're stealing from kids. That's like robbing the tooth fairy. What if we get caught? Those little monsters could turn on us with water balloons or something." Molly rolled her eyes. "Oh, please. Kids are easy. Distract 'em with candy and they're putty. The plan: We go in disguised as clowns. You know, big shoes, red noses—the works. I'll juggle to keep the rugrats occupied while you sneak into the gift room." "Clowns? Molly, I hate clowns. Remember that birthday party in '09? I tripped over my own feet and face-planted into the cake. The client thought it was part of the act, but I still taste frosting in my nightmares." She smirked. "That's why you're perfect. Authentic clumsiness sells it. We'll park the van out back, load up the presents, and poof—gone before the piñata breaks. Estimated haul: fifty gifts, easy ten grand on the black market." Bill scratched his head. "Black market for Barbies? Who's buying that? Some creepy collector?" "Details, details," Molly waved him off. "Focus. Entry point: The bouncy house. We deflate it quietly, slip through the side door. You handle the deflation— you're good with hot air." "Hey, that's not fair! Last heist, I was the one who hot-wired the getaway car while you flirted with the guard." Molly laughed. "Flirted? I was distracting him with my 'lost puppy' routine. Worked like a charm. Now, contingencies: If the parents spot us, we improvise. 
You start a conga line, I'll grab the loot." "A conga line? In clown suits? We'll look like escaped circus rejects." "Exactly! Blends right in. And if things go south, we abort with the smoke bombs disguised as party poppers." Bill grinned despite himself. "Smoke bombs? Where'd you get those?" "Online. 'Magic tricks for beginners.' Ironclad." He leaned forward. "Okay, I'm in. But if a kid asks for a balloon animal, you're on your own. I can only make snakes, and they look like limp noodles." Molly high-fived him. "Deal. Tomorrow, we make those presents disappear faster than cake at a diet convention." As they pored over the blueprint, giggling over potential mishaps, the room filled with the absurd energy of two crooks plotting the heist of the century—or at least the birthday bash. Little did they know, the real twist awaited in the form of a hyperactive birthday boy with a slingshot.
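A minimal sketch of how a dialogue-to-total word ratio could be measured on a response like the one above. It assumes dialogue is whatever sits between straight double quotes and that words are whitespace-separated tokens; the function name `dialogue_word_stats` is hypothetical, and the evaluator used for this run may tokenize or clean the text differently.

```python
import re


def dialogue_word_stats(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words) for prose using straight double quotes.

    Assumption: a "word" is any whitespace-separated token, and dialogue is
    the text enclosed in a pair of straight double quotes.
    """
    total_words = len(text.split())
    # Capture the contents of each "..." span (non-greedy via negated class).
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    return dialogue_words, total_words


sample = 'Molly rolled her eyes. "Oh, please. Kids are easy."'
dialogue, total = dialogue_word_stats(sample)
ratio = dialogue / total
print(f"{dialogue} of {total} words are dialogue ({ratio:.2%})")
```

Apostrophes and single-quoted phrases inside dialogue (such as 'lost puppy') do not affect this approach, since only double quotes delimit spans.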
created_at
"2026-02-20T20:11:56.08Z"
2/20/2026, 8:11:56 PM
model
"x-ai/grok-4-07-09"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
6919
moderation_latency
(null)
generation_time
33612
tokens_prompt
77
tokens_completion
751
native_tokens_prompt
744
native_tokens_completion
869
native_tokens_completion_images
0
native_tokens_reasoning
201
native_tokens_cached
693
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.01370775
router
(null)
provider_responses
0
endpoint_id
"d40bef9d-6936-4c57-9749-bf7399de4a77"
id
"acdff434-4ca7-fac0-8bc3-c0d57b18f2a6"
is_byok
false
latency
92
model_permaslug
"x-ai/grok-4-07-09"
provider_name
"xAI"
status
200
api_type
"completions"
id
"gen-1771618316-9SzK2WxW05a57v8kpsdH"
upstream_id
"acdff434-4ca7-fac0-8bc3-c0d57b18f2a6"
total_cost
0.01370775
cache_discount
0.00155925
upstream_inference_cost
0
provider_name
"xAI"
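The reported usage and cache_discount values can be reproduced from the native token counts under assumed per-million-token prices. The rates in this sketch (3.00 USD for input, 0.75 USD for cached input, 15.00 USD for output) are assumptions that do not appear on this page; they are used here because they match the reported figures. The sketch also assumes that reasoning tokens are already included in native_tokens_completion.

```python
from decimal import Decimal

# Token counts copied from the metadata above.
native_tokens_prompt = 744
native_tokens_cached = 693
native_tokens_completion = 869

# ASSUMED prices in USD per one million tokens (not stated in the run data).
PRICE_INPUT = Decimal("3.00")
PRICE_CACHED_INPUT = Decimal("0.75")
PRICE_OUTPUT = Decimal("15.00")
PER_MILLION = Decimal(1_000_000)

uncached_prompt_tokens = native_tokens_prompt - native_tokens_cached

# Total cost: uncached input + cached input + output.
total_cost = (
    uncached_prompt_tokens * PRICE_INPUT
    + native_tokens_cached * PRICE_CACHED_INPUT
    + native_tokens_completion * PRICE_OUTPUT
) / PER_MILLION

# Discount: what the cached tokens saved relative to the full input price.
cache_discount = (
    native_tokens_cached * (PRICE_INPUT - PRICE_CACHED_INPUT) / PER_MILLION
)

print(total_cost)      # compare with usage / total_cost above
print(cache_discount)  # compare with cache_discount above
```

Decimal is used instead of float so the comparison with the eight-decimal values in the metadata is exact.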
Evaluation details
Result     Evaluator                      Details                            Meta Data
86.38%     Matches word count             n/a                                neededClean: false, words: 489
0.00%      Dialogue to Total Word Ratio   Ratio: 73.01%, Deviation: 23.01%   neededClean: false, wordsTotal: 489, wordsDialogue: 357
43.1902%   Overall
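The ratio and deviation figures follow directly from the reported word counts, and the overall score is consistent with an unweighted mean of the two evaluator scores. The sketch below checks that arithmetic; the averaging rule is an inference from the numbers, not something this page states, and the formula that maps 489 words to 86.38% is not shown here, so it is not reproduced.

```python
# Values copied from the evaluation details.
words_total = 489
words_dialogue = 357
target_ratio = 0.50  # from the scenario: "50% dialogue"

ratio = words_dialogue / words_total
deviation = abs(ratio - target_ratio)

# Evaluator scores as displayed (already rounded on the page).
score_word_count = 86.38
score_dialogue_ratio = 0.00

# ASSUMPTION: the overall score is the plain mean of the evaluator scores.
overall = (score_word_count + score_dialogue_ratio) / 2

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
print(f"Overall (mean of displayed scores): {overall:.2f}%")
```

The displayed overall of 43.1902% differs from the mean of the rounded scores only in the last digits, which suggests the word-count score is stored with more precision than the page shows.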