NC Bench
Run: 2026-02-20-01KHY97BX75GV83RTDSNXSVVN4
Run Details
Model: Grok 4 Fast
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 8.6s
Cost: $0.000493
Provider: xAI
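The Cost figure can be reproduced from the native token counts recorded in this run's metadata (native_tokens_prompt 216, native_tokens_cached 154, native_tokens_completion 946). The sketch below assumes per-million-token rates of $0.20 for uncached input, $0.05 for cached input, and $0.50 for output. These rates are not stated anywhere in this run record, and the function and parameter names are hypothetical.

```python
def estimate_cost(prompt_tokens, cached_tokens, completion_tokens,
                  input_rate=0.20, cached_rate=0.05, output_rate=0.50):
    """Return (total_cost, cache_discount) in USD.

    All rates are USD per million tokens and are ASSUMED, not taken
    from the run record.
    """
    uncached = prompt_tokens - cached_tokens
    total = (uncached * input_rate
             + cached_tokens * cached_rate
             + completion_tokens * output_rate) / 1_000_000
    # Discount = what the cached tokens would have cost at the full input rate,
    # minus what they cost at the cached rate.
    discount = cached_tokens * (input_rate - cached_rate) / 1_000_000
    return total, discount


total, discount = estimate_cost(216, 154, 946)
print(f"total_cost ~ {total:.7f}, cache_discount ~ {discount:.7f}")
```

Under these assumed rates the result agrees with the reported total_cost of 0.0004931 and cache_discount of 0.0000231, which would imply that the 242 reasoning tokens are billed as part of the 946 completion tokens.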
Evaluation summary
Matches word count: 14.30%
Dialogue to Total Word Ratio: 8.37%
Overall: 11.3337%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Bill slumped into the creaky booth at the back of the dingy diner, his trench coat smelling faintly of cheap cologne and regret. Across from him, Molly nursed a lukewarm coffee, her eyes sparkling with that manic gleam she got before a job. The neon sign outside buzzed like a swarm of angry bees, casting flickering shadows over their map of the upscale suburb. "Alright, Molly, spill it," Bill said, unwrapping a greasy burger. "Why the hell are we hitting a kiddie birthday bash? Last time we went for gold bars, not goddamn piñatas." Molly leaned in, her voice a conspiratorial whisper that somehow carried over the sizzle of the grill. "Think about it, Bill. Rich parents, spoiled brats turning five or whatever. The place'll be crawling with presents—unwrapped, shiny, easy to grab. No alarms on toy trains, right? We swoop in, load up a van full of Lego sets and Barbie dreams, fence 'em for quick cash. It's genius!" Bill snorted, ketchup dripping from his chin. "Genius? It's a clown show waiting to happen. Kids screaming, balloons popping like gunfire. How do we even get inside without some soccer mom calling the cops on our 'suspicious adult behavior'?" She grinned, tapping the map with a chipped nail. "Disguises, dummy. We go as party entertainment. I'll be the face painter—slap on some warpaint as a butterfly, and boom, I'm invisible. You... you can be the magician. Pull rabbits out of hats, distract the rugrats while I stuff the goodie bags into my purse." "A magician? Me?" Bill barked a laugh, nearly choking on his fry. "I can't even pull off sleight of hand without dropping the deck. Remember that casino gig? I fumbled the chips into the dealer's lap. We'd be busted before the cake's cut." Molly waved him off, her excitement bubbling. "Details, Bill! We'll practice. You wave a wand, say 'abracadabra,' and presto—presents vanish into thin air. Or our duffel bags. The kids'll think it's part of the show. 'Hey, where'd my remote-control dinosaur go?' Hilarious!" 
He rubbed his stubbled jaw, eyeing her skeptically. "And the parents? Those helicopter types with iPhones glued to their hands? One wrong twitch, and we're viral villains on TikTok." "Relax," she cooed, sketching a quick floor plan on a napkin. "We time it for the cake smash—chaos central. I create a diversion: fake a piñata explosion with confetti bombs. You grab the haul from the gift table. In and out in ten minutes. Vans parked out back, disguised as the catering service. Easy peasy, lemon squeezy." Bill paused, burger halfway to his mouth, a slow grin creeping across his face. "You know, Mol, this might actually work. Beats robbing banks with laser grids. Though if a kid bites me, I'm out." She clinked her mug against his soda. "Deal. To the great present heist of '23—may the force be with our loot!" As they huddled over the napkin, laughter echoing in the empty diner, Bill couldn't shake the feeling this caper would be their funniest flop yet. Or jackpot. With Molly, it was always a toss-up.
created_at
"2026-02-20T19:42:59.509Z"
2/20/2026, 7:42:59 PM
model
"x-ai/grok-4-fast"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
2222
moderation_latency
(null)
generation_time
8302
tokens_prompt
77
tokens_completion
777
native_tokens_prompt
216
native_tokens_completion
946
native_tokens_completion_images
0
native_tokens_reasoning
242
native_tokens_cached
154
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0004931
router
(null)
provider_responses
0
endpoint_id
"c9586d9d-97ff-40ca-ae94-22504d6f9b7e"
id
"2d954a60-5e13-efb1-10d3-5bc16c938db6"
is_byok
false
latency
75
model_permaslug
"x-ai/grok-4-fast"
provider_name
"xAI"
status
200
api_type
"completions"
id
"gen-1771616579-nStJYGfjQzExFfbUWfQI"
upstream_id
"2d954a60-5e13-efb1-10d3-5bc16c938db6"
total_cost
0.0004931
cache_discount
0.0000231
upstream_inference_cost
0
provider_name
"xAI"
Evaluation details
Matches word count: 14.30%
Details: n/a
Meta data: neededClean false, words 521
Dialogue to Total Word Ratio: 8.37%
Details: Ratio: 62.96%, Deviation: 7.04%
Meta data: neededClean false, wordsTotal 521, wordsDialogue 328
Overall: 11.3337%
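The ratio and deviation reported above follow directly from the two word counts and the 70% target in the scenario. The sketch below recomputes them. The `count_words` helper is a hypothetical illustration of how dialogue words might be counted (whitespace-separated words inside straight double quotes); the evaluator's actual counting rules and the mapping from deviation to the percentage score are not documented in this report.

```python
import re

TARGET_RATIO = 0.70   # from the scenario: 70% dialogue
TARGET_WORDS = 500    # from the scenario: 500 words


def dialogue_stats(words_dialogue, words_total, target=TARGET_RATIO):
    """Return (ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(target - ratio)


def count_words(text):
    """Hypothetical counter: (dialogue words, total words).

    Assumes dialogue is delimited by straight double quotes and that
    words are whitespace-separated.
    """
    total = len(text.split())
    quoted = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(q.split()) for q in quoted)
    return dialogue, total


ratio, deviation = dialogue_stats(328, 521)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
print(f"Word count overshoot: {(521 - TARGET_WORDS) / TARGET_WORDS:.1%}")
```

With the reported counts this prints a ratio of 62.96% and a deviation of 7.04%, matching the Details line, and a word count 4.2% over the 500-word target. The Overall figure is close to the unweighted mean of the two displayed scores (11.335 from the rounded values), which suggests, but does not confirm, simple averaging.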