NC Bench
Run: 2026-02-20-01KHYA8FGFCZVB49MF2PSFSM3F
Run Details
Model: Grok 4
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 36.6s
Cost: $0.013955
Provider: xAI
Evaluation summary
Matches word count: 60.28%
Dialogue to Total Word Ratio: 0.00%
Overall: 30.1376%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her fake mustache in the cracked mirror of their rundown van, parked discreetly behind the suburban cul-de-sac. The air smelled of stale coffee and desperation, but tonight's gig promised a haul of shiny, gift-wrapped loot. Bill, her partner in crime and eternal optimist, hunched over a crumpled blueprint of the McAllister house, where little Timmy's birthday bash was in full swing. Balloons bobbed outside, and the distant squeals of sugar-high kids echoed like a siren song for petty thieves. "Alright, Bill, spill it," Molly said, twirling a lockpick like a baton. "How do we crash this kiddie apocalypse without ending up in clown makeup?" Bill grinned, his gold tooth catching the dashboard light. "Simple, Mol. We go in as the entertainment. I've got these knockoff clown suits from that flea market heist last month. You be Bubbles the Clown, I'll be Wacky Wally. We juggle, we distract, and boom—presents vanish into our oversized pockets." Molly snorted, eyeing the blueprint. "Juggling? You can barely tie your shoes without tripping. And what about the parents? They're not idiots. One wrong move and we're dodging soccer moms with pepper spray." "Relax," Bill replied, waving a hand. "I've scoped it. The dad's grilling burgers out back, mom's glued to her phone. Kids are hyped on cake—pure chaos. We slip in during the piñata smash. You grab the gift table while I create a diversion. Maybe fake a balloon animal explosion." She raised an eyebrow. "Balloon animal explosion? That's your big idea? Last time you 'diverted,' we almost got nabbed by a schnauzer." Bill leaned back, crossing his arms. "Hey, that dog was vicious. But think about it: presents galore—toys, gadgets, maybe even cash envelopes. We fence 'em downtown, split the profits. Easy peasy." Molly paced the van's narrow aisle, her boots scuffing old pizza boxes. The plan was absurd, but that's what made their partnership work—Bill's wild schemes balanced by her sharp caution. 
Outside, party lights flickered, casting rainbow shadows. She imagined the stack of brightly wrapped boxes, ripe for the taking. A remote-control car here, a dollhouse there. Jackpot. "Fine, but we need an exit strategy," she said. "No lingering for cake. In and out in ten minutes. And if things go south?" Bill chuckled. "We abort and blame it on the ice cream truck. Yell 'Free cones!' and watch the stampede." Molly laughed despite herself. "You're an idiot, but okay. Let's do this. For the presents." They high-fived, the sound echoing in the van. Bill revved the engine softly, pulling closer to the house. As they donned their clown wigs, Molly felt the familiar thrill of the heist bubbling up. Kids' parties were unpredictable, but so were they. With a final nod, they stepped out into the night, ready to turn birthday cheer into their payday. The McAllister gift pile wouldn't know what hit it.
created_at
"2026-02-20T20:01:04.538Z"
2/20/2026, 8:01:04 PM
model
"x-ai/grok-4-07-09"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
8823
moderation_latency
(null)
generation_time
36502
tokens_prompt
77
tokens_completion
745
native_tokens_prompt
744
native_tokens_completion
893
native_tokens_completion_images
0
native_tokens_reasoning
240
native_tokens_cached
743
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.01395525
router
(null)
provider_responses
0
endpoint_id
"d40bef9d-6936-4c57-9749-bf7399de4a77"
id
"204867d4-fd99-ca5b-b9fb-ac8c2ad97bb1"
is_byok
false
latency
631
model_permaslug
"x-ai/grok-4-07-09"
provider_name
"xAI"
status
200
api_type
"completions"
id
"gen-1771617664-VuXBryxFqJZeAB3yQ5u1"
upstream_id
"204867d4-fd99-ca5b-b9fb-ac8c2ad97bb1"
total_cost
0.01395525
cache_discount
0.00167175
upstream_inference_cost
0
provider_name
"xAI"
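The raw metadata fields can be turned into a few derived figures. The sketch below is illustrative only: it assumes that generation_time is reported in milliseconds (which would agree with the 36.6s duration shown under Run Details), and that native_tokens_reasoning is counted inside native_tokens_completion. Neither assumption is confirmed by the page, and the function name derived_metrics is hypothetical.

```python
def derived_metrics(meta: dict) -> dict:
    """Compute rough throughput and token shares from run metadata.

    Assumes generation_time is in milliseconds and that reasoning tokens
    are part of the completion token count (both are assumptions).
    """
    seconds = meta["generation_time"] / 1000
    return {
        # Completion tokens produced per second of generation time.
        "tokens_per_second": meta["native_tokens_completion"] / seconds,
        # Share of prompt tokens served from the provider's cache.
        "cached_prompt_share": meta["native_tokens_cached"] / meta["native_tokens_prompt"],
        # Share of completion tokens spent on reasoning.
        "reasoning_share": meta["native_tokens_reasoning"] / meta["native_tokens_completion"],
    }


# Values copied from the metadata fields of this run.
run_meta = {
    "generation_time": 36502,
    "native_tokens_prompt": 744,
    "native_tokens_completion": 893,
    "native_tokens_reasoning": 240,
    "native_tokens_cached": 743,
}

for name, value in derived_metrics(run_meta).items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions, nearly the whole prompt was served from cache, which would be consistent with the non-zero cache_discount field.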
Evaluation details
Matches word count
Result: 60.28%
Details: n/a
Meta Data: neededClean = false, words = 485
Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 48.25%, Deviation: 18.25%
Meta Data: neededClean = false, wordsTotal = 485, wordsDialogue = 234
Overall: 30.1376%
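The reported ratio and deviation follow directly from the word counts. The sketch below reproduces them, taking the 30% target from the scenario text. The helper dialogue_ratio is a hypothetical illustration of one way to count dialogue words (text between straight double quotes); the evaluator's actual counting rules and the way it maps deviation to a score are not shown on this page. The overall figure is consistent with a plain average of the two evaluator scores, but that too is an inference from the numbers.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a prose scene.

    Hypothetical counting rule: words are whitespace-separated tokens, and
    dialogue is everything between a pair of straight double quotes.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# Reproduce the reported figures from the evaluator metadata.
TARGET_RATIO = 0.30          # "30% dialogue" from the scenario
words_dialogue, words_total = 234, 485

ratio = words_dialogue / words_total
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 48.25%, Deviation: 18.25%"

# If the overall score is the mean of the two evaluator scores (an assumption),
# the unrounded word-count score implied by 30.1376% is:
implied_word_count_score = 30.1376 * 2 - 0.00
print(f"Implied word-count score: {implied_word_count_score:.2f}%")
```

The implied value rounds to the 60.28% shown for the word-count evaluator, which supports the averaging assumption without confirming it.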