NC Bench
Run: 2026-02-20-01KHY8MBA3XZ03FG5B09VAZP69
Run Details
Model: Grok 4
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 34.8s
Cost: $0.014465
Provider: xAI
Evaluation summary
Score | Evaluator
86.38% | Matches word count
8.06% | Dialogue to Total Word Ratio
47.2222% | Overall
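The overall figure appears to be the unweighted mean of the two evaluator scores. That aggregation rule is an inference from the displayed numbers, not something the page states. A minimal sketch under that assumption, with `overall_score` as a hypothetical helper name:

```python
# Evaluator scores as displayed in the summary (percent, rounded to two decimals).
scores = {
    "Matches word count": 86.38,
    "Dialogue to Total Word Ratio": 8.06,
}


def overall_score(evaluator_scores):
    """Unweighted mean of the evaluator scores (assumed aggregation rule)."""
    return sum(evaluator_scores.values()) / len(evaluator_scores)


overall = overall_score(scores)
print(f"{overall:.2f}%")  # prints 47.22%
```

The displayed 47.2222% carries more digits than the rounded inputs allow, so the sketch only matches it to two decimal places.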
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly paced the cramped motel room, her wild red curls bouncing like overcaffeinated springs. Bill slouched on the bed, fiddling with a crumpled map of the suburban neighborhood, his greasy comb-over failing to hide his bald spot. "Alright, Bill, listen up," Molly said, slapping a glittery invitation onto the table. "This kids' party at the McMansion on Elm Street? It's our golden ticket. Rich parents, piles of presents, and zero security 'cause who suspects clowns at a birthday bash?" Bill snorted, peering at the invite through his cracked bifocals. "Clowns? Molly, you're a genius or a nutjob. Last time we dressed as clowns, I tripped over my floppy shoes and face-planted into the punch bowl. Remember? The mayor's gala? We got nada but fruit punch stains." "That's 'cause you picked Ronald McDonald knockoffs," Molly shot back, rolling her eyes. "This time, we're going pro. Harlequin chic. I'll be the sassy jester, you the bumbling fool—typecasting, right? We infiltrate during the piñata smash. Kids are distracted, whacking that papier-mâché donkey like it's owed money." Bill chuckled, rubbing his stubbly chin. "Piñata? Hell, I could be the piñata. Stuff me with candy and let 'em swing. But seriously, Mol, how we hauling the loot? Those presents ain't gonna sprout legs and waddle out." "Easy peasy," Molly said, grinning like a fox in a henhouse. "We bring the 'Magic Gift-mobile'—that old ice cream truck we boosted last month. Park it out back, load 'em up while the rugrats are sugar-rushing on cake. 'Free ice cream!' we yell. Parents think it's part of the party. Boom, we're ghosts with goodies." Bill leaned forward, eyes narrowing. "And the adults? Soccer moms with yoga pants and laser eyes? One spots us, and we're toast. Remember Aunt Bertha's bingo heist? She clocked us from across the hall." Molly waved him off. "Distraction, dummy. You juggle flaming pins—nah, too risky with your butterfingers. Fake a balloon animal explosion. Pop! Chaos! 
While they're screaming, I swipe the gift table. It's foolproof." "Foolproof? Like our museum gig where you 'accidentally' set off the sprinklers?" Bill mocked, mimicking her voice. "'Oops, wrong wire!' We swam out with soggy masterpieces." "Shut it, that was art," Molly retorted, punching his arm. "This is kids' stuff—toys, gadgets, maybe some cash-stuffed cards. We're talking resale gold on eBay. 'Vintage Barbie, slightly party-worn.'" Bill sighed, but a smirk crept in. "Fine, but I call dibs on any Legos. My nephew's birthday's coming. What's the timeline? In at noon, out by cake-cutting?" "Precisely," Molly said, tapping the map. "Enter via the bouncy house—crawl in, pop out like jack-in-the-box pros. Exit through the kitchen, snag some cupcakes for the road." "You're twisted, Mol. Robbing kiddos? Low even for us." "Comedy gold, Bill. Imagine the headlines: 'Clown Crooks Crash Kiddie Fest!' We'll be legends." Bill laughed outright. "Alright, partner. Let's clown around."
created_at: "2026-02-20T19:32:36.301Z" (2/20/2026, 7:32:36 PM)
model: "x-ai/grok-4-07-09"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 8645
moderation_latency: (null)
generation_time: 34784
tokens_prompt: 77
tokens_completion: 774
native_tokens_prompt: 744
native_tokens_completion: 927
native_tokens_completion_images: 0
native_tokens_reasoning: 211
native_tokens_cached: 743
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.01446525
router: (null)
provider_responses:
  0:
    endpoint_id: "d40bef9d-6936-4c57-9749-bf7399de4a77"
    id: "e9e13069-972a-d4c7-4880-133ce52132f4"
    is_byok: false
    latency: 695
    model_permaslug: "x-ai/grok-4-07-09"
    provider_name: "xAI"
    status: 200
api_type: "completions"
id: "gen-1771615956-nx9vqCEC4YuiXSST0Xzk"
upstream_id: "e9e13069-972a-d4c7-4880-133ce52132f4"
total_cost: 0.01446525
cache_discount: 0.00167175
upstream_inference_cost: 0
provider_name: "xAI"
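The metadata above ties back to the headline figures in the run details. The sketch below derives the duration, the rounded cost, and a rough completion throughput from the raw fields. It assumes `generation_time` is in milliseconds, which is consistent with the 34.8s duration shown but is not stated on the page. `derived_metrics` is a hypothetical helper, not part of any API.

```python
# Selected fields copied from the run metadata.
meta = {
    "generation_time": 34784,          # assumed to be milliseconds
    "native_tokens_completion": 927,
    "total_cost": 0.01446525,          # USD
}


def derived_metrics(m):
    """Derive display-level figures from the raw metadata fields."""
    seconds = m["generation_time"] / 1000
    return {
        "duration_s": round(seconds, 1),
        "cost_usd": round(m["total_cost"], 6),
        "completion_tokens_per_s": m["native_tokens_completion"] / seconds,
    }


metrics = derived_metrics(meta)
print(metrics["duration_s"])   # 34.8
print(metrics["cost_usd"])     # 0.014465
print(f'{metrics["completion_tokens_per_s"]:.2f}')  # 26.65
```

The throughput uses the native completion token count. Whether that count includes the 211 reasoning tokens is not stated, so the figure is only a rough indicator.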
Evaluation details
Result | Evaluator | Details | Meta Data
86.38% | Matches word count | n/a | neededClean: false, words: 489
8.06% | Dialogue to Total Word Ratio | Ratio: 77.10%, Deviation: 7.10% | neededClean: false, wordsTotal: 489, wordsDialogue: 377
47.2222% | Overall
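The ratio and deviation reported for the dialogue evaluator follow directly from the two word counts and the 70% target in the scenario. The sketch below reproduces that arithmetic. It also includes `count_dialogue_words`, a hypothetical counter that treats text inside straight double quotes as dialogue. The benchmark's actual tokenization and quote handling are not shown on this page, so that function is an illustration only and is not claimed to reproduce the count of 377.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: "70% dialogue"


def dialogue_ratio(words_dialogue, words_total):
    """Share of all words that are dialogue."""
    return words_dialogue / words_total


def count_dialogue_words(text):
    """Hypothetical counter: whitespace-split words inside straight double quotes."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in spans)


# Counts taken from the evaluation metadata.
ratio = dialogue_ratio(377, 489)
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints: Ratio: 77.10%, Deviation: 7.10%

# Small usage example for the hypothetical counter.
sample = 'Molly said, "Listen up, Bill." Bill sighed.'
print(count_dialogue_words(sample), len(sample.split()))  # 3 7
```

The mapping from the 7.10% deviation to the 8.06% score is not shown on this page, so it is not reproduced here.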