NC Bench
Run: 2026-02-20-01KHYAPAJCHVEZG8D62T0NJKTX
Run Details
Model: Grok 4 Fast
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 6.0s
Cost: $0.000455
Provider: xAI
Evaluation summary
Score | Evaluator | Details Link
43.38% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
21.6892% | Overall |
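The Overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state that aggregation rule, so treat it as an assumption inferred from the numbers. A minimal Python sketch, where the function name `overall_score` is hypothetical:

```python
from statistics import fmean


def overall_score(scores):
    """Return the unweighted mean of per-evaluator scores, in percent.

    Assumption: the benchmark averages evaluator scores with equal weight.
    """
    if not scores:
        raise ValueError("at least one evaluator score is required")
    return fmean(scores)


# Scores as displayed in the summary (the page rounds them to two decimals).
summary_scores = [43.38, 0.00]
overall = overall_score(summary_scores)
print(f"{overall:.2f}%")
```

With the displayed inputs this gives 21.69%, which agrees with the reported 21.6892% up to the rounding of the word-count score.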
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly hunched over the rickety kitchen table in their cramped apartment, blueprints of the upscale suburban house spread out like a treasure map. Pinwheels and balloon doodles littered the edges, mocking the absurdity of their target: a kiddie birthday bash for some oil baron's spoiled spawn. Bill paced behind her, nursing a lukewarm coffee, his mustache twitching like it had a mind of its own. "Listen, Molly," Bill started, stopping to jab a finger at the diagram. "This ain't no museum job. We're talking snot-nosed brats, face paint, and a piñata full of candy that'll stick to the loot like glue." Molly grinned, her eyes sparkling with mischief. She adjusted her oversized glasses, the kind that screamed "harmless librarian" but hid a safe-cracker's soul. "That's the beauty of it, Bill! Kids' parties are chaos central. No one's watching the presents when Elmo's singing off-key. We slip in as the caterers—I've got the uniforms from that wedding gig last month." Bill snorted, rubbing his chin. "Caterers? You mean those frilly aprons? I'll look like a deranged cupcake. And what about the clown? I heard he's got a black belt in balloon animals." She waved him off, circling a spot on the map with a red pen. "Forget the clown. We time it for the cake-cutting frenzy. You create a diversion—trip over a toy or something, draw the eyes. I'll hit the gift table: grab the shiny ones first, stuff 'em in the van. In and out in ten minutes, tops." He leaned in, squinting. "Diversion? Last time I 'diversed,' I ended up with a parrot on my head and the cops on speed dial. These presents—wrapped in glitter and bows—how do we haul 'em without rattling like a tambourine orchestra? And the kids? One screamer, and it's game over." Molly chuckled, stacking fake mustache props beside the plans. "We'll use those insulated duffels from the jewelry score. Muffle the noise. As for the rugrats, bribe 'em with extra cupcakes. 'Hey, kiddo, want a pony ride? Uncle Bill's got candy!' 
You're a natural with the little monsters." Bill groaned, but a smirk cracked his face. "Uncle Bill? You're killing me. Fine, but if that birthday boy spots us, I'm blaming your pie-in-the-sky schemes. What's the take? Designer dolls and Xbox-whatevers?" "Jackpot," she said, high-fiving him. "Resell on the dark web—easy five figures. Plus, the thrill of stealing joy from the one percent. We deserve it after that botched bakery heist." He nodded, folding the map. "Alright, partner. Tomorrow at noon. But if I end up in a bouncy castle, you're springing me solo." As the sun dipped low, casting long shadows over their plotting den, Molly and Bill clinked mugs, the heist bubbling like illicit champagne. Little did they know, the real wildcard was a hyperactive piñata donkey with a vendetta.
created_at: "2026-02-20T20:08:38.231Z" (2/20/2026, 8:08:38 PM)
model: "x-ai/grok-4-fast"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1329
moderation_latency: (null)
generation_time: 5560
tokens_prompt: 77
tokens_completion: 719
native_tokens_prompt: 216
native_tokens_completion: 870
native_tokens_completion_images: 0
native_tokens_reasoning: 221
native_tokens_cached: 154
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0004551
router: (null)
provider_responses:
  0:
    endpoint_id: "c9586d9d-97ff-40ca-ae94-22504d6f9b7e"
    id: "bb1ce2ae-dba4-1822-25a8-a9aef6a8c6fc"
    is_byok: false
    latency: 204
    model_permaslug: "x-ai/grok-4-fast"
    provider_name: "xAI"
    status: 200
api_type: "completions"
id: "gen-1771618118-JaiqPMtzPrge7gamUi5x"
upstream_id: "bb1ce2ae-dba4-1822-25a8-a9aef6a8c6fc"
total_cost: 0.0004551
cache_discount: 0.0000231
upstream_inference_cost: 0
provider_name: "xAI"
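The reported `total_cost` and `cache_discount` can be cross-checked against the native token counts. The sketch below is only an illustration: the per-million-token rates are assumptions (this page does not list them), picked because they reproduce the reported figures, and the helper name `estimate_cost` is hypothetical. It also assumes billing uses the native token counts, with reasoning tokens already included in `native_tokens_completion`.

```python
from decimal import Decimal

# Assumed USD rates per one million tokens; NOT stated on this page.
PROMPT_RATE = Decimal("0.20")
CACHED_PROMPT_RATE = Decimal("0.05")
COMPLETION_RATE = Decimal("0.50")
MILLION = Decimal(1_000_000)


def estimate_cost(prompt_tokens, cached_tokens, completion_tokens):
    """Return (cost, cache_discount) in USD under the assumed rates."""
    uncached_tokens = prompt_tokens - cached_tokens
    cost = (
        uncached_tokens * PROMPT_RATE
        + cached_tokens * CACHED_PROMPT_RATE
        + completion_tokens * COMPLETION_RATE
    ) / MILLION
    # Savings from billing cached prompt tokens at the lower rate.
    discount = cached_tokens * (PROMPT_RATE - CACHED_PROMPT_RATE) / MILLION
    return cost, discount


# native_tokens_prompt, native_tokens_cached, native_tokens_completion
cost, discount = estimate_cost(216, 154, 870)
print(f"cost={cost:.7f} discount={discount:.7f}")
```

Under these assumed rates the result matches the metadata: 0.0004551 for the cost and 0.0000231 for the cache discount.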
Evaluation details
Result | Evaluator | Details | Meta Data
43.38% | Matches word count | n/a | neededClean: false; words: 483
0.00% | Dialogue to Total Word Ratio | Ratio: 61.70%, Deviation: 31.70% | neededClean: false; wordsTotal: 483; wordsDialogue: 298
21.6892% | Overall | |
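The ratio and deviation reported for the dialogue evaluator follow directly from the metadata counts: 298 dialogue words out of 483 total, compared with the 30% target in the scenario. The Python sketch below reproduces that arithmetic. How the evaluator actually identifies dialogue words is not stated here, so the quote-based counter `count_dialogue_words` is a hypothetical stand-in, and treating the deviation as an absolute difference in percentage points is an assumption.

```python
import re

TARGET_RATIO = 30.0  # percent, from the scenario "Write 500 words with 30% dialogue"


def count_words(text):
    """Count whitespace-separated words."""
    return len(text.split())


def count_dialogue_words(text):
    """Count words between straight double quotes (assumed definition of dialogue)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in spans)


def dialogue_ratio(words_dialogue, words_total, target=TARGET_RATIO):
    """Return (ratio, deviation) in percent and percentage points."""
    ratio = 100.0 * words_dialogue / words_total
    deviation = abs(ratio - target)
    return ratio, deviation


# Counts taken from the evaluator metadata above.
ratio, deviation = dialogue_ratio(298, 483)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
```

With the reported counts this prints a ratio of 61.70% and a deviation of 31.70%, matching the Details column.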