NC Bench
Run: 2026-02-20-01KHY7VM0HXYBZ9XECNM22FZ57
Run Details
Model: Grok 4
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 36.9s
Cost: $0.015599
Provider: xAI
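The Cost shown above is the rounded `total_cost` (0.01559925) recorded in the response metadata further down this page. That figure and the recorded `cache_discount` (0.00157275) can be reproduced from the native token counts under a simple per-token pricing model. The following is a minimal Python sketch, assuming rates of $3.00 per million uncached input tokens, $0.75 per million cached input tokens, and $15.00 per million output tokens. These rates are not stated on this page; they are an assumption chosen because they reproduce the recorded values. The function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

# Token counts copied from the run metadata on this page.
NATIVE_TOKENS_PROMPT = 744
NATIVE_TOKENS_CACHED = 699
NATIVE_TOKENS_COMPLETION = 996  # assumed to already include the 273 reasoning tokens

# ASSUMED rates in USD per token (not stated in the source).
INPUT_RATE = Decimal("3.00") / 1_000_000
CACHED_INPUT_RATE = Decimal("0.75") / 1_000_000
OUTPUT_RATE = Decimal("15.00") / 1_000_000


def estimate_cost(prompt_tokens, cached_tokens, completion_tokens):
    """Return (total_cost, cache_discount) in USD under the assumed rates."""
    uncached_tokens = prompt_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_RATE
        + cached_tokens * CACHED_INPUT_RATE
        + completion_tokens * OUTPUT_RATE
    )
    # Discount = what the cached tokens would have cost at the full input rate,
    # minus what they cost at the cached rate.
    discount = cached_tokens * (INPUT_RATE - CACHED_INPUT_RATE)
    return total, discount


total, discount = estimate_cost(
    NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_CACHED, NATIVE_TOKENS_COMPLETION
)
# Compare against the values recorded in the metadata.
print(total == Decimal("0.01559925"), discount == Decimal("0.00157275"))
```

`Decimal` is used instead of floats so the comparison against the recorded eight-decimal values is exact.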
Evaluation summary
Score | Evaluator
43.38% | Matches word count
0.00% | Dialogue to Total Word Ratio
21.6892% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly paced the cluttered living room of their rundown apartment, a blueprint of the suburban McMansion spread out on the coffee table like a battle map. Bill lounged on the threadbare couch, munching on stale popcorn, his eyes gleaming with mischief. "Alright, Bill, listen up," Molly said, jabbing a finger at the blueprint. "The party's at noon tomorrow. Rich kids, richer parents. We're talking piles of presents – bikes, gadgets, maybe even that new gaming console everyone's after. We hit 'em fast, grab the loot, and vanish like ghosts." Bill grinned, kernels flying as he spoke. "Ghosts? Nah, we're going in as clowns. Big shoes, red noses, the works. Kids love clowns. We'll blend right in." Molly rolled her eyes. "Clowns? You're kidding. Last time you wore a wig, it caught fire from that faulty smoke bomb. No clowns. We go as caterers. Fake uniforms, trays of cupcakes. Slip in through the kitchen, load up the gifts in laundry bags disguised as trash." "Trash? That's your big idea?" Bill snorted, sitting up. "Molly, you're thinking small. Picture this: I rappel from the roof, superhero style, while you distract the parents with a fake raffle. 'Congratulations, you've won a cruise!' Boom, they're all outside, and I sweep the presents into a Santa sack." Molly crossed her arms. "Rappel? With your beer gut? You'd crash through the skylight and land in the punch bowl. Stick to the plan. We park the van as 'Party Planners Extraordinaire.' I charm the mom – 'Oh, we're here to set up the piñata!' – while you sneak upstairs to the gift room." Bill leaned forward, eyes wide. "Upstairs? What if there's a nanny cam? Or worse, one of those yappy dogs? I say we use the kid chaos. Start a game of musical chairs, crank the volume, and in the mayhem, we nab the boxes. 'Hey, kiddos, who wants candy?' Distract 'em with lollipops laced with... wait, no, just regular lollipops. We're not monsters." She laughed despite herself. "Lollipops? Bill, you're a genius at being an idiot. 
Fine, but no candy – allergies could turn this into a hospital heist. We go low-tech: I trip the birthday boy into the cake, cause a diversion. You wheel out the presents in a stroller. 'Just borrowing these for the photo booth!'" Bill slapped his knee. "Photo booth! Now you're talking. I'll pose as the photographer. 'Smile for the birdie!' Snap a few, then stuff the lens with wrapping paper and bolt. But what about the alarms? That house has more security than Fort Knox." Molly smirked. "Hack the Wi-Fi? Nah, too fancy. We bribe the pizza delivery guy to 'accidentally' cut the power line. Lights out, party panic, we grab and go." "Brilliant! But Molly, one problem: what if the kids cry? Stealing presents from toddlers? That's low, even for us." She paused, then shrugged. "Comedy gold. We'll leave 'em IOUs from the Grinch. 'Better luck next year!' Now, let's rehearse. You be the whiny kid." Bill whined in a high pitch. "But I wanted the pony!" Molly chuckled. "Pony? Dream on, kid. We're taking that too."
created_at: "2026-02-20T19:19:06.011Z" (2/20/2026, 7:19:06 PM)
model: "x-ai/grok-4-07-09"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 8995
moderation_latency: (null)
generation_time: 36781
tokens_prompt: 77
tokens_completion: 780
native_tokens_prompt: 744
native_tokens_completion: 996
native_tokens_completion_images: 0
native_tokens_reasoning: 273
native_tokens_cached: 699
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.01559925
router: (null)
provider_responses:
  0:
    endpoint_id: "d40bef9d-6936-4c57-9749-bf7399de4a77"
    id: "4864afae-70ce-6d24-16b8-7ff1065f3a78"
    is_byok: false
    latency: 56
    model_permaslug: "x-ai/grok-4-07-09"
    provider_name: "xAI"
    status: 200
api_type: "completions"
id: "gen-1771615146-tZufATgv5wnNLsrFJ32y"
upstream_id: "4864afae-70ce-6d24-16b8-7ff1065f3a78"
total_cost: 0.01559925
cache_discount: 0.00157275
upstream_inference_cost: 0
provider_name: "xAI"
Evaluation details
Result | Evaluator | Details | Meta Data
43.38% | Matches word count | n/a | neededClean: false, words: 517
0.00% | Dialogue to Total Word Ratio | Ratio: 81.62%, Deviation: 51.62% | neededClean: false, wordsTotal: 517, wordsDialogue: 422
21.6892% | Overall | |
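The dialogue-ratio figures above follow directly from the recorded word counts: 422 dialogue words out of 517 total is about 81.62%, which is 51.62 percentage points away from the 30% target in the prompt. The overall score of 21.6892% is also consistent with an unweighted mean of the two evaluator scores. The Python sketch below reproduces that arithmetic. The function names are hypothetical, the use of an absolute difference for the deviation and of a plain mean for the overall score are assumptions, and the rule each evaluator uses to turn a deviation into a score is not shown on this page, so it is not modeled here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the response, as a percentage."""
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float = 30.0) -> float:
    """Distance from the target ratio in percentage points (ASSUMED absolute)."""
    return abs(ratio_pct - target_pct)


def overall_score(scores: list[float]) -> float:
    """ASSUMPTION: the overall score is the unweighted mean of evaluator scores."""
    return sum(scores) / len(scores)


# Word counts copied from the evaluation details above.
ratio = dialogue_ratio(words_dialogue=422, words_total=517)
deviation = deviation_from_target(ratio)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 81.62%, Deviation: 51.62%"

# Evaluator scores as displayed (43.38% is itself rounded on the page).
print(f"Overall: {overall_score([43.38, 0.0]):.2f}%")
# prints "Overall: 21.69%"
```

With the rounded 43.38% as input the mean comes out at 21.69%, so the page's 21.6892% implies the unrounded word-count score is close to 43.3784%.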