NC Bench
Run: 2026-02-20-01KHY9CXK91DQ38X8T3KNMM4XF
Run Details
Model:
Grok 4
Test:
Dialogue tags
Scenario:
Write 500 words with 50% dialogue
Duration:
39.6s
Cost:
$0.014300
Provider:
xAI
Evaluation summary
Score | Evaluator
43.38% | Matches word count
0.00% | Dialogue to Total Word Ratio
21.6894% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly leaned over the rickety table in their dimly lit motel room, a crumpled blueprint of the suburban McMansion spread out before her. Empty coffee cups and half-eaten donuts littered the space, and Bill paced back and forth, his wiry frame twitching with excitement. The air smelled of stale cigarettes and desperation—the perfect ambiance for plotting a heist at little Timmy's eighth birthday bash. "Alright, Bill, listen up," Molly said, jabbing a finger at the map. "The party's at noon tomorrow. Clowns, balloons, the works. Presents piled high in the living room like a dragon's hoard. We go in disguised as caterers—apron, chef hat, the whole nine yards." Bill stopped pacing and grinned, his gold tooth glinting under the fluorescent light. "Caterers? Brilliant! But what if they ask for credentials? I ain't got no food handler's card. Last time I cooked, I burned water." Molly rolled her eyes. "Fake it, dummy. We'll whip up some bogus business cards. 'Party Grub Pros'—sounds legit. You distract the parents with bad jokes while I swipe the gifts into our duffel bags hidden in the catering cart." Bill chuckled, rubbing his hands together. "Bad jokes? I'm a natural. 'Why did the scarecrow win an award? He was outstanding in his field!' Boom, they're laughing, I'm slipping presents out the back door." She smirked. "Just don't overdo it. Remember the bank job? You told that knock-knock joke and the teller hit the alarm mid-punchline." "That was a setup! Knock-knock jokes are gold!" Bill protested, feigning offense. "Anyway, what's the exit strategy? Kids are unpredictable. What if one spots us and starts yelling?" Molly nodded thoughtfully. "Good point. We bribe 'em with candy from the piñata. Kids love sugar. 'Hey, kiddo, want a lollipop? Keep quiet about the nice lady with the bag of toys.' Easy peasy." Bill leaned in, eyes wide. "Toys? We're stealing presents, Moll. What if there's something good in there? Like a new video game console? Can we keep one?" 
"Focus, Bill! This is about the cash resale value. eBay those Barbies and Legos, we're golden. No keeping souvenirs—that's how amateurs get caught." He sighed dramatically. "Fine, Miss Professional. But imagine the haul: bikes, dolls, maybe even a pony if it's a rich kid's party. We'd be set for months!" Molly folded the blueprint, her voice dropping to a whisper. "Exactly. In and out in twenty minutes. Park the van two blocks away, engine running. If things go south, we bolt with what we got." Bill clapped her on the back. "You're a genius, Moll. This'll be the heist of the century—stealing from kiddos. Who needs Ocean's Eleven when you've got Birthday Bash Bandits?" She laughed, shaking her head. "Just don't trip over any balloons. Last thing we need is you popping one and starting a stampede." As they high-fived, the motel clock ticked toward dawn, their absurd plan taking shape amid the chaos of crumpled wrappers and caffeine-fueled dreams. Little did they know, the real challenge awaited: outsmarting a room full of sugar-high eight-year-olds.
created_at
"2026-02-20T19:46:01.461Z"
2/20/2026, 7:46:01 PM
model
"x-ai/grok-4-07-09"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
8111
moderation_latency
(null)
generation_time
39518
tokens_prompt
77
tokens_completion
792
native_tokens_prompt
744
native_tokens_completion
916
native_tokens_completion_images
0
native_tokens_reasoning
222
native_tokens_cached
743
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.01430025
router
(null)
provider_responses
0
endpoint_id
"d40bef9d-6936-4c57-9749-bf7399de4a77"
id
"bbf72767-bd01-0d12-a350-5eddc3b13113"
is_byok
false
latency
159
model_permaslug
"x-ai/grok-4-07-09"
provider_name
"xAI"
status
200
api_type
"completions"
id
"gen-1771616761-zcdXoFHw7chiZyo9GB4s"
upstream_id
"bbf72767-bd01-0d12-a350-5eddc3b13113"
total_cost
0.01430025
cache_discount
0.00167175
upstream_inference_cost
0
provider_name
"xAI"
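The cost fields above can be cross-checked against the token counts. The sketch below is one plausible reconstruction, not the router's documented billing logic: the per-million-token rates (3.00 USD uncached input, 0.75 USD cached input, 15.00 USD output) are assumptions that do not appear anywhere on this page, and the function name `estimate_cost` is hypothetical. Under those assumed rates the arithmetic happens to reproduce both `total_cost` and `cache_discount`.

```python
PER_MILLION = 1_000_000


def estimate_cost(prompt_tokens, cached_tokens, completion_tokens,
                  input_rate, cached_rate, output_rate):
    """Return (cost, cache_discount) in USD.

    Rates are USD per million tokens. All three rates are ASSUMED values
    supplied by the caller; the page itself does not state pricing.
    """
    uncached_tokens = prompt_tokens - cached_tokens
    cost = (
        uncached_tokens * input_rate
        + cached_tokens * cached_rate
        + completion_tokens * output_rate
    ) / PER_MILLION
    # What the cached tokens would have cost extra at the full input rate.
    discount = cached_tokens * (input_rate - cached_rate) / PER_MILLION
    return cost, discount


# Token counts come from the metadata above:
# native_tokens_prompt=744, native_tokens_cached=743, native_tokens_completion=916.
# The three rates are assumptions, not values from this page.
cost, discount = estimate_cost(744, 743, 916,
                               input_rate=3.00, cached_rate=0.75, output_rate=15.00)
print(f"cost={cost:.8f} discount={discount:.8f}")
```

If the assumed rates are wrong, the match with the reported figures would be coincidental, so treat this as a sanity check rather than a statement of how the provider bills.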
Evaluation details
Result | Evaluator | Details | Meta Data
43.38% | Matches word count | n/a | neededClean: false, words: 517
0.00% | Dialogue to Total Word Ratio | Ratio: 65.57%, Deviation: 15.57% | neededClean: false, wordsTotal: 517, wordsDialogue: 339
21.6894% | Overall
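The reported ratio and deviation follow directly from the word counts in the meta data. The sketch below recomputes them, assuming the 50% target from the scenario and assuming the deviation is the absolute difference from that target. The helper names are hypothetical, and the evaluator's own method for counting dialogue words is not shown on this page. The overall figure is consistent with an unweighted mean of the two evaluator scores if the unrounded word count score is about 43.3788%, but that is an inference from the displayed numbers.

```python
def ratio_and_deviation(words_dialogue, words_total, target=0.5):
    """Return (dialogue ratio, absolute deviation from target) as fractions."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


def unweighted_mean(scores):
    """Average the evaluator scores with equal weight (an assumed aggregation)."""
    return sum(scores) / len(scores)


# Counts taken from the meta data: wordsDialogue=339, wordsTotal=517.
ratio, deviation = ratio_and_deviation(339, 517)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# 43.3788 is the word count score implied by the overall figure, not a value
# printed on the page (the page shows it rounded to 43.38%).
overall = unweighted_mean([43.3788, 0.0])
print(f"Overall: {overall:.4f}%")
```

How a 15.57% deviation maps to a 0.00% score, and how 517 words against a 500-word target maps to 43.38%, is not stated here, so the sketch leaves both scoring curves out.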