NC Bench
Run: 2026-02-20-01KHY9JX85WHFY37G5DYKHXAHD
Run Details
Model: Grok 4 Fast
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 7.3s
Cost: $0.000479
Provider: xAI
Evaluation summary
Score | Evaluator
51.93% | Matches word count
0.00% | Dialogue to Total Word Ratio
25.9628% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly hunched over the rickety kitchen table in their cramped apartment, blueprints of the community center sprawled out like a treasure map gone wrong. The air smelled of stale coffee and yesterday's takeout pizza. Bill paced behind her, his lanky frame casting jittery shadows under the flickering bulb. They were pros at this—well, semi-pros, if you counted the time they "borrowed" a neighbor's lawnmower and never gave it back—but hitting a kids' birthday bash for the haul of presents? That was a new low, even for them.
"Look, Bill, it's genius," Molly said, jabbing a finger at the sketch of the party room. "Rich parents, spoiled brats, mountains of wrapped loot under that balloon arch. We slip in as clowns—"
"Clowns? Molly, I look like a giraffe in greasepaint. Last time I tried a disguise, the cops thought I was a escaped zoo reject." Bill stopped pacing and leaned over the table, his mustache twitching like it had a mind of its own.
She smirked, unfazed. "Exactly! Perfect cover. Kids love clowns. We'll juggle the presents right out the door. You do the silly walk, I'll honk the horn and distract the birthday boy. By the time the parents notice, we're gone with enough toys to fence for a small fortune."
Bill scratched his chin, eyeing the diagram's side entrance marked with a red X. "And the rugrats? What if one of 'em clings to my leg? Or worse, starts crying? I hate crying. Reminds me of Aunt Edna's funeral."
Molly rolled her eyes, sketching a quick escape route with a chewed pencil. "We'll bribe 'em with candy. Or threaten to cancel the piñata. Come on, it's low-risk. No alarms, no guards—just cake and chaos. We time it for the cake-cutting; everyone's eyes on the candles."
He snorted, grabbing a cold slice of pizza from the box. "Low-risk? You're forgetting the helium balloons. One pop, and we're balloon animals in a lineup. What if the host is that soccer mom from the neighborhood watch? She once chased me with a garden hose for 'suspicious loitering'—which was just me checking the mail."
"Details, Bill. We'll wear those big red noses. Impenetrable alibis." Molly straightened up, her eyes gleaming with mischief. The plan was ridiculous, sure, but in their line of work, ridiculous paid the rent. She imagined the haul: shiny bikes, video games, enough stuffed animals to start their own black-market zoo.
Bill sighed, but a grin cracked his face. "Fine. But if I end up twisting balloons into handcuffs, you're bailing me out. And no singing 'Happy Birthday'—my voice cracks like a bad egg."
"Deal," she said, slapping his hand. "Tomorrow, we hit the party supply store. Operation Present Panic begins at noon."
As the night wore on, they refined the details amid fits of laughter, oblivious to the fact that their "foolproof" scheme was about as watertight as a colander. Little did they know, the real heist would involve face paint fights and a piñata full of surprises—but that's a story for another page.
created_at: "2026-02-20T19:49:17.713Z" (2/20/2026, 7:49:17 PM)
model: "x-ai/grok-4-fast"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1866
moderation_latency: (null)
generation_time: 7095
tokens_prompt: 77
tokens_completion: 763
native_tokens_prompt: 216
native_tokens_completion: 936
native_tokens_completion_images: 0
native_tokens_reasoning: 255
native_tokens_cached: 215
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.00047895
router: (null)
provider_responses:
  0:
    endpoint_id: "c9586d9d-97ff-40ca-ae94-22504d6f9b7e"
    id: "bd5d7994-8f80-5639-ef9c-6ea475d7b7d9"
    is_byok: false
    latency: 139
    model_permaslug: "x-ai/grok-4-fast"
    provider_name: "xAI"
    status: 200
api_type: "completions"
id: "gen-1771616957-aoHmrE6bZwouok4zgL9b"
upstream_id: "bd5d7994-8f80-5639-ef9c-6ea475d7b7d9"
total_cost: 0.00047895
cache_discount: 0.00003225
upstream_inference_cost: 0
provider_name: "xAI"
Evaluation details
Result | Evaluator | Details | Meta Data
51.93% | Matches word count | n/a | neededClean: false; words: 516
0.00% | Dialogue to Total Word Ratio | Ratio: 49.81%, Deviation: 19.81% | neededClean: false; wordsTotal: 516; wordsDialogue: 257
25.9628% | Overall
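The reported "Ratio: 49.81%, Deviation: 19.81%" can be reproduced from the metadata (257 dialogue words out of 516 total, against the 30% target in the scenario). The sketch below shows that arithmetic, plus one possible way to count dialogue words. The page does not show how the evaluator counts words, so the quote-based counter here is an assumption: the function names are hypothetical, and it only handles straight double quotes. The mapping from a 19.81% deviation to a 0.00% score is not documented on this page and is not modelled.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a piece of prose.

    Assumption: dialogue is whatever sits between straight double quotes,
    and words are whitespace-separated tokens. The benchmark's real
    evaluator may count differently.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio: float, target: float = 0.30) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Reproduce the figures in the evaluation details from the reported counts.
reported_ratio = 257 / 516
print(
    f"Ratio: {reported_ratio:.2%}, "
    f"Deviation: {deviation_from_target(reported_ratio):.2%}"
)
# prints "Ratio: 49.81%, Deviation: 19.81%"
```

Calling `dialogue_ratio` on the model response would give a comparable count only if the evaluator also treats scare-quoted words such as "borrowed" and "foolproof" as dialogue, which this page does not confirm.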