NC Bench
Run: 2026-02-20-01KHY7RJNTDAGEMRG39Z6AW9WF
Run Details
Model: Grok 4 Fast
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 7.1s
Cost: $0.000442
Provider: xAI
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0004% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly hunched over the rickety kitchen table in their cramped apartment, a half-eaten pizza slice dangling from her fingers. The map of the suburban neighborhood was spread out like a battlefield, marked with crayon scribbles—ironic, given the target. Bill paced behind her, his lanky frame casting jittery shadows under the flickering bulb. He adjusted his fake mustache, which kept slipping. "Alright, Bill, spill it," Molly said, tapping the map with a greasy finger. "Why a kids' party? We're pros—art galleries, jewelers. Not some clown-infested birthday bash." Bill stopped pacing, grinning like he'd cracked the code to Fort Knox. "Think about it, Mol. No security. Parents too busy snapping pics and force-feeding cake. Presents piled high—Xbox, bikes, the works. Easy grab, melt into the chaos. We're ghosts in balloon land." She snorted, wiping sauce from her chin. "Ghosts? With what disguise? You gonna wear a tutu?" "Better," he said, pulling a duffel from under the table. Inside: oversized elf costumes, complete with floppy hats and jingly bells. "Santa's helpers gone rogue. We crash as entertainers. I do a magic trick—poof, presents vanish into my sack." Molly's eyes widened, then narrowed. "Magic? Last time you 'magicked' something, we almost got pinched for that slot machine fiasco. And the kids? Little terrors with sticky hands and zero chill. One scream, and it's game over." Bill waved her off, slipping on the elf hat. It sagged over his ears. "Distraction's key. You handle the diversion—spill the punch bowl, start a piñata riot. I'll sweep the gift table. In and out in ten. Load the van with the haul, fence it downtown. We're talking five figures, easy. Beats another dry run on that pawn shop." She leaned back, crossing her arms. "Risks? Cops love kid stuff. 'Thieves ruin birthday!' Front-page sob story." "Low risk, high reward," Bill countered, striking a pose. "Elf ears cover the tats. Fake accents—say, Australian elves? No one questions festive weirdos. 
And if it goes south, we blame the bunny mascot. Always the bunny's fault." Molly chuckled despite herself, tracing the route on the map. "Fine, but we rehearse. No ad-libbing your 'magic.' And if a kid latches on, you're on diaper duty—metaphorically." "Deal," Bill said, high-fiving her. The mustache finally fell into the pizza box. "To the heist of the century: Operation Pinata Payday!" As they plotted escape routes and contingency plans—backup involving a rented bouncy castle for quick getaway—Molly felt the familiar thrill. Stealing from adults was business; from kids? Pure, anarchic comedy gold. The party was Saturday. By Sunday, they'd be laughing all the way to the bank, bells jingling in their wake. But deep down, she wondered if Bill's elf hat would survive the first sugar-rushed toddler hug.
created_at: "2026-02-20T19:17:26.338Z" (2/20/2026, 7:17:26 PM)
model: "x-ai/grok-4-fast"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1557
moderation_latency: (null)
generation_time: 6965
tokens_prompt: 77
tokens_completion: 731
native_tokens_prompt: 216
native_tokens_completion: 862
native_tokens_completion_images: 0
native_tokens_reasoning: 214
native_tokens_cached: 215
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.00044195
router: (null)
provider_responses:
  0:
    endpoint_id: "c9586d9d-97ff-40ca-ae94-22504d6f9b7e"
    id: "5acc745f-512e-d001-6920-ed3b3720fbfc"
    is_byok: false
    latency: 79
    model_permaslug: "x-ai/grok-4-fast"
    provider_name: "xAI"
    status: 200
api_type: "completions"
id: "gen-1771615046-OjYesVRg972tKFCKX7Ko"
upstream_id: "5acc745f-512e-d001-6920-ed3b3720fbfc"
total_cost: 0.00044195
cache_discount: 0.00003225
upstream_inference_cost: 0
provider_name: "xAI"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 467
0.00% | Dialogue to Total Word Ratio | Ratio: 50.11%, Deviation: 20.11% | neededClean: false, wordsTotal: 467, wordsDialogue: 234
0.0004%
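The ratio and deviation figures reported above can be reproduced from the metadata counts with a short sketch. This is an illustration only, not NC Bench's actual evaluator: the function names (dialogue_ratio, deviation_from_target, word_count_deviation) are hypothetical, the rule that "dialogue" means text inside double quotes is an assumption, and how the deviation is turned into the 0.00% score is not shown on this page, so it is left out.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio_percent) for a passage.

    Assumption: dialogue is whatever sits between straight or curly
    double quotes; the real evaluator may tokenize differently.
    """
    total_words = len(text.split())
    quoted = re.findall(r'["“]([^"”]*)["”]', text)
    dialogue_words = sum(len(q.split()) for q in quoted)
    ratio = 100.0 * dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio_percent: float, target_percent: float) -> float:
    """Absolute gap, in percentage points, between measured and target ratio."""
    return abs(ratio_percent - target_percent)


def word_count_deviation(words: int, target_words: int) -> float:
    """Absolute gap between actual and requested length, as a percent of the target."""
    return 100.0 * abs(words - target_words) / target_words


# Figures from this run: wordsDialogue = 234, wordsTotal = 467, target = 30%.
ratio = 100.0 * 234 / 467
print(round(ratio, 2), round(deviation_from_target(ratio, 30.0), 2))  # 50.11 20.11

# The response has 467 words against a requested 500.
print(round(word_count_deviation(467, 500), 2))  # 6.6
```

With these inputs the sketch matches the reported "Ratio: 50.11%, Deviation: 20.11%". The 6.6% length shortfall is derived here and is not a figure shown by the page.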