NC Bench
Run: 2026-02-20-01KHY86NSZZNXG4N4S8Z4FMARX
Run Details
Model: Grok 4.1 Fast
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 12.4s
Cost: $0.000520
Provider: xAI
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.00%
Overall: 0.0004%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly hunched over the steering wheel of their rusty van, the engine ticking like a guilty conscience as it cooled in the shadows of Elmwood Estates. Bill sprawled in the passenger seat, crumbs from a pilfered bakery donut dusting his Hawaiian shirt. A glittery birthday invitation fluttered between them, emblazoned with "Timmy Turns 8! Presents Galore!"

"Look at this haul, Bill," Molly whispered, eyes gleaming like a ferret spotting a henhouse. "Rich kid's bash. Piñata, bounce house, and a mountain of gifts taller than your ego. We snag 'em all, fence 'em by noon, retire to Margaritaville."

Bill snorted, wiping his mouth with a sleeve. "Retire? Last heist we planned, you dressed as a nun to rob a bingo hall. Ended up confessing to Father O'Malley for real."

Molly slapped the dash, sending a fake mustache tumbling. "That was improv! This is gold. We crash as the entertainment—balloon animals and magic tricks. Kids love clowns. Distract the rugrats, stuff presents in oversized shoes, out the back before cake time."

"Clowns?" Bill's face paled. "I got coulrophobia, Mol. Those floppy shoes? Nightmares. And what about the parents? Helicopter moms with iPhones filming every squeak. One viral 'Clown Steals Teddy Bear' vid, and we're toast."

She grinned, pulling out a duffel of props: red noses, squirting flowers, a whoopee cushion arsenal. "That's why we go in hot. I twist balloons into getaway cars—poof, misdirection. You do the 'magic gift vanish.' Palm a bike here, a drone there. Kids cry? Blame the piñata fairy."

Bill fiddled with a map, tracing the mansion's layout. "Entry: service gate at 2 PM sharp. Bounce house blocks the side yard—perfect cover. But the dog? Guard poodle named Mr. Fluffernutter. One yip, and we're dog chow."

"Dognap it in a balloon cage," Molly shot back. "Or spike the punch with sleepy-time Benadryl. Non-lethal, see? We're pros, not monsters."

He chuckled despite himself. "Benadryl punch? You're a menace. Okay, timeline: Infiltrate 2:05, load van by 2:45. Exfil via the neighbor's hedge—rented U-Haul disguised as a party rental truck."

"Brilliant!" Molly high-fived him, her palm sticky from glitter glue. "Code words: 'Pinch the piñata' means grab and go. If busted, cry 'Allergy attack!' and flee."

Bill nodded, sketching a crude clown face on the invite. "One hitch: Timmy's got a twin sister. Double the terror. But hell, for a haul like this—Xboxen, Lego empires, cash-stuffed cards—I'm in. Just no singing 'Happy Birthday' off-key."

Molly revved the van to life, headlights slicing the dusk. "Deal. Tomorrow, we make heist history. Or at least fund our bar tab."

As they peeled out, Bill muttered, "Famous last words." The van rattled toward destiny, stuffed with dreams of sticky-fingered glory and the faint honk of a rogue clown horn.
created_at: "2026-02-20T19:25:08.3Z" (2/20/2026, 7:25:08 PM)
model: "x-ai/grok-4.1-fast"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 739
moderation_latency: (null)
generation_time: 12228
tokens_prompt: 77
tokens_completion: 801
native_tokens_prompt: 216
native_tokens_completion: 998
native_tokens_completion_images: 0
native_tokens_reasoning: 299
native_tokens_cached: 151
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.00051955
router: (null)
provider_responses:
  0:
    endpoint_id: "9821aa0e-dce3-4c27-9716-161359123d0b"
    id: "2817f5d6-aeb6-1f17-8ecd-7bf4db53eaf7"
    is_byok: false
    latency: 446
    model_permaslug: "x-ai/grok-4.1-fast"
    provider_name: "xAI"
    status: 200
api_type: "completions"
id: "gen-1771615508-SztO6GwPu2FDgypBt9N8"
upstream_id: "2817f5d6-aeb6-1f17-8ecd-7bf4db53eaf7"
total_cost: 0.00051955
cache_discount: 0.00002265
upstream_inference_cost: 0
provider_name: "xAI"
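The metadata above is enough for a rough throughput estimate. The sketch below assumes generation_time is reported in milliseconds (the page does not state the unit, though 12228 is close to the 12.4s duration shown in the run details) and that native_tokens_completion is the right token count to divide by. The helper name tokens_per_second is hypothetical.

```python
# Values copied from the run metadata above.
GENERATION_TIME_MS = 12228        # assumption: unit is milliseconds
NATIVE_TOKENS_COMPLETION = 998


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Return completion throughput in tokens per second."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return tokens / (elapsed_ms / 1000)


throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
print(f"{throughput:.1f} tokens/s")
```

If the unit assumption is wrong, only the constant needs changing; the calculation itself is unit-agnostic apart from the division by 1000.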
Evaluation details
Evaluator: Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false, words: 467

Evaluator: Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 63.60%, Deviation: 33.60%
Meta Data: neededClean: false, wordsTotal: 467, wordsDialogue: 297

Overall: 0.0004%
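The reported ratio and deviation follow directly from the word counts: 297 dialogue words out of 467 total, measured against the 30% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. It is not NC Bench's actual evaluator, whose implementation is not shown on this page. It assumes words are whitespace-separated tokens and dialogue is any text between straight double quotes; all function names are hypothetical.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed definition of a word)."""
    return len(text.split())


def dialogue_word_count(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue rule)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)


def dialogue_ratio(text: str) -> float:
    """Fraction of all words that fall inside quoted dialogue."""
    total = count_words(text)
    return dialogue_word_count(text) / total if total else 0.0


def deviation(ratio: float, target: float = 0.30) -> float:
    """Absolute distance between the measured ratio and the target."""
    return abs(ratio - target)


# Figures reported in the evaluation details: wordsDialogue / wordsTotal.
reported_ratio = 297 / 467
print(f"Ratio: {reported_ratio:.2%}, Deviation: {deviation(reported_ratio):.2%}")
# prints "Ratio: 63.60%, Deviation: 33.60%"
```

A quote-matching rule like this one treats nested single-quoted phrases as part of the surrounding dialogue and would miss curly quotes, so counts from a real evaluator may differ slightly.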