NC Bench
Run: 2026-02-20-01KHY9JYFTMNRTA3VRR6TWFPCJ
Run Details
Model:
Grok 4
Test:
Dialogue tags
Scenario:
Write 500 words with 50% dialogue
Duration:
33.0s
Cost:
$0.014885
Provider:
xAI
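The reported cost can be cross-checked against the native token counts recorded in this run's metadata (744 prompt tokens, 743 of them cached, and 955 completion tokens). The sketch below is only a sanity check under assumed prices: the per-million-token rates are not stated anywhere on this page, and `estimate_cost` is a hypothetical helper name. With these assumed rates the arithmetic matches the reported `total_cost` and `cache_discount`, which also suggests the reasoning tokens are billed as part of the completion count.

```python
from decimal import Decimal

# ASSUMED USD prices per million tokens; they do not appear in the run metadata.
INPUT_RATE = Decimal("3.00")          # uncached prompt tokens
CACHED_INPUT_RATE = Decimal("0.75")   # cached prompt tokens
OUTPUT_RATE = Decimal("15.00")        # completion tokens (incl. reasoning)
MILLION = Decimal(1_000_000)


def estimate_cost(prompt_tokens: int, cached_tokens: int, completion_tokens: int):
    """Return (cost, cache_discount) in USD under the assumed rates above."""
    uncached_tokens = prompt_tokens - cached_tokens
    cost = (
        uncached_tokens * INPUT_RATE
        + cached_tokens * CACHED_INPUT_RATE
        + completion_tokens * OUTPUT_RATE
    ) / MILLION
    # What caching saved relative to paying the full input rate.
    cache_discount = cached_tokens * (INPUT_RATE - CACHED_INPUT_RATE) / MILLION
    return cost, cache_discount


# Values taken from native_tokens_prompt, native_tokens_cached,
# and native_tokens_completion in this run's metadata.
total_cost, cache_discount = estimate_cost(744, 743, 955)
print(f"estimated cost: ${total_cost}, cache discount: ${cache_discount}")
```

Decimal arithmetic is used so the comparison with the reported eight-decimal figures is exact rather than subject to floating-point rounding.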
Evaluation summary
Score | Evaluator
0.00% | Matches word count
3.35% | Dialogue to Total Word Ratio
1.6748% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly slouched on the threadbare couch in their cramped apartment, a blueprint of the suburban McMansion sprawled across the coffee table like a treasure map. Bill paced back and forth, his lanky frame casting erratic shadows under the flickering fluorescent light. The air smelled of stale pizza and desperation, but their eyes gleamed with the thrill of the score: a children's birthday party at the richest kid in town's house, loaded with presents wrapped in shiny paper and topped with bows bigger than their egos.

"Alright, Molly, listen up," Bill said, cracking his knuckles like he was about to arm-wrestle fate. "The party's at noon tomorrow. Little Timmy's turning eight, and his parents are loaded. We're talking bikes, video games, maybe even that new drone everyone's raving about. We swoop in, grab the gift pile, and vanish like ghosts at a séance."

Molly snorted, twirling a lockpick between her fingers. "Ghosts? Bill, we're more like clowns in this caper. Breaking into a kids' party? What if they have a bouncy castle? You gonna bounce your way out with an armload of Barbies?"

Bill stopped pacing and grinned, his teeth flashing like a shark spotting sushi. "That's the beauty of it! Distraction city. While the rugrats are screaming over cake and piñatas, we sneak in through the back. I've got the layout: kitchen door's got a flimsy lock. You pick it in ten seconds flat."

She leaned forward, eyeing the blueprint. "Ten seconds? Make it five, or we're toast. And what about the parents? Soccer moms with eyes like hawks. One wrong move, and we're explaining ourselves to a horde of PTA enforcers."

"Easy peasy," Bill replied, jabbing at the map. "We go in disguised. You as a caterer—apron, tray of bogus cupcakes. Me as the entertainment. Clown suit, red nose, the works. Who suspects a clown of grand larceny?"

Molly burst out laughing, nearly knocking over her coffee. "A clown? Bill, you in polka dots? That's not a disguise; that's a felony fashion statement. Last time you tried funny, you tripped over your own feet and face-planted into a wedding cake."

He feigned offense, clutching his chest. "Hey, that cake was asking for it! But seriously, Mol, think about the haul. Presents stacked like a Jenga tower of joy. We load 'em into the van parked out back, then poof—gone before the candles are blown out."

She nodded, her smirk fading into focus. "Alright, fine. But we need an exit strategy. What if the kids spot us? Little monsters could rat us out faster than you can say 'happy birthday'."

Bill waved her off. "Kids? Bribe 'em with candy. Or juggle. I've been practicing. Watch this!" He grabbed three empty beer cans and tossed them up, only for one to clatter to the floor, rolling under the couch.

Molly rolled her eyes. "Impressive. Just don't drop the presents. Okay, timeline: In at 12:30, out by 1:00. Split the loot fifty-fifty. No hogging the good stuff."

"Deal," Bill said, extending his hand. They shook on it, the blueprint crinkling beneath their palms. As they huddled closer, plotting contingencies, the absurdity of it all hung in the air like balloon strings—two small-time crooks dreaming big, one kiddie party at a time.
created_at
"2026-02-20T19:49:18.982Z"
2/20/2026, 7:49:18 PM
model
"x-ai/grok-4-07-09"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
6707
moderation_latency
(null)
generation_time
32863
tokens_prompt
77
tokens_completion
818
native_tokens_prompt
744
native_tokens_completion
955
native_tokens_completion_images
0
native_tokens_reasoning
217
native_tokens_cached
743
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.01488525
router
(null)
provider_responses
0
endpoint_id
"d40bef9d-6936-4c57-9749-bf7399de4a77"
id
"d384fe8f-ea55-c46d-857a-4d0593c057ff"
is_byok
false
latency
119
model_permaslug
"x-ai/grok-4-07-09"
provider_name
"xAI"
status
200
api_type
"completions"
id
"gen-1771616958-t8x00Yglfi6wM0wNjWoj"
upstream_id
"d384fe8f-ea55-c46d-857a-4d0593c057ff"
total_cost
0.01488525
cache_discount
0.00167175
upstream_inference_cost
0
provider_name
"xAI"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 546
3.35% | Dialogue to Total Word Ratio | Ratio: 58.24%, Deviation: 8.24% | neededClean: false, wordsTotal: 546, wordsDialogue: 318
1.6748% | Overall
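The ratio and deviation figures above follow directly from the recorded word counts, which the short sketch below reproduces. The function names are hypothetical, and the 50% target is taken from the scenario description. How the evaluator maps an 8.24-point deviation to a 3.35% score is not documented on this page, so that step is deliberately left out. The last lines only observe that the overall figure is close to a plain average of the two evaluator scores, which is an inference rather than a documented rule.

```python
def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Share of words that are dialogue, as a percentage."""
    return 100.0 * words_dialogue / words_total


def deviation_pct(ratio_pct: float, target_pct: float = 50.0) -> float:
    """Absolute distance from the target ratio, in percentage points."""
    return abs(ratio_pct - target_pct)


# Counts from the evaluator metadata: wordsDialogue and wordsTotal.
ratio = dialogue_ratio_pct(318, 546)
deviation = deviation_pct(ratio)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# The response ran 46 words over the requested 500.
word_overshoot = 546 - 500

# ASSUMPTION: the overall score looks like a simple mean of the two
# evaluator scores; the page does not state the aggregation rule.
mean_of_scores = (0.00 + 3.35) / 2
```

The small gap between this mean and the reported 1.6748% would be explained if the ratio score were averaged before being rounded to 3.35% for display.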