NC Bench
Run: 2026-02-20-01KHY7KCHHV3378X9MGHP38D2R
Run Details
Model: Grok 4 Fast
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 11.4s
Cost: $0.000474
Provider: xAI
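The Cost figure can be cross-checked against the token counts recorded in this run's metadata (native_tokens_prompt 216, native_tokens_cached 154, native_tokens_completion 908, total_cost 0.0004741, cache_discount 0.0000231). The sketch below is only an illustration: the three per-token rates are assumptions, not values stated anywhere on this page. They were chosen because they happen to reproduce the reported figures, and the function names are hypothetical.

```python
# ASSUMPTION: per-token USD rates. These are not stated on the page; they are
# illustrative values that happen to reproduce the reported cost figures.
PROMPT_RATE = 0.20 / 1_000_000      # uncached prompt tokens
CACHED_RATE = 0.05 / 1_000_000      # cached prompt tokens
COMPLETION_RATE = 0.50 / 1_000_000  # completion tokens (incl. reasoning)


def estimate_cost(prompt_tokens: int, cached_tokens: int, completion_tokens: int) -> float:
    """Estimate the run cost from native token counts under the assumed rates."""
    uncached_tokens = prompt_tokens - cached_tokens
    return (
        uncached_tokens * PROMPT_RATE
        + cached_tokens * CACHED_RATE
        + completion_tokens * COMPLETION_RATE
    )


def cache_discount(cached_tokens: int) -> float:
    """Savings from billing cached tokens at the cached rate instead of the full rate."""
    return cached_tokens * (PROMPT_RATE - CACHED_RATE)


# Token counts taken from this run's metadata.
cost = estimate_cost(prompt_tokens=216, cached_tokens=154, completion_tokens=908)
discount = cache_discount(cached_tokens=154)

print(f"estimated cost: ${cost:.7f}")
print(f"cache discount: ${discount:.7f}")
```

Under these assumed rates the estimate agrees with the reported total_cost and cache_discount to within floating-point error; a different rate card would need different constants.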
Evaluation summary
Score      Evaluator
90.48%     Matches word count
87.52%     Dialogue to Total Word Ratio
89.0030%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly hunched over the greasy diner table, her eyes gleaming under the flickering neon sign. Bill slouched across from her, nursing a lukewarm coffee, his face half-hidden by a newspaper clipping of the party invite. The air smelled of burnt fries and desperation. "Alright, Bill, listen up," Molly whispered, tapping the photo of a sprawling backyard bash. "Rich kid's birthday tomorrow. Streamers, cake, and a mountain of presents from every trust-fund uncle in the county. We're talking designer bikes, gaming consoles, enough loot to fence for a year's rent." Bill leaned in, his mustache twitching. "Stealing from kids? That's low, even for us. What if they cry? I hate crying. Remember that pawn shop job? The owner's kid wailed for an hour." Molly snorted, waving him off. "No kids get hurt. We hit during the piñata chaos. You dress as the clown—they hired one, see? Red nose, floppy shoes. I'll be the face painter, blending right in with the soccer moms." "A clown? Me?" Bill's voice cracked. "Molly, I got a phobia. Last circus, I punched a mime. What if some snot-nosed brat pulls my wig off?" She grinned, sketching a crude map on a napkin. "That's the beauty. Distraction central. While the rugrats are whacking candy out of that donkey thing, you scoop the gift table. I've got the van parked two blocks away, disguised as a birthday delivery truck. In and out in ten minutes." Bill rubbed his chin, eyeing the plan. "Presents'll be wrapped. We can't unwrap 'em all—too noisy. And what about the parents? Those helicopter types with iPhones everywhere?" "Easy. We swap 'em," Molly said, her voice dropping to a conspiratorial hiss. "I rig a decoy bag of junk—old toys from that thrift store dumpster dive. You grab the real haul in your oversized clown pockets. Parents too busy snapping pics to notice till morning." He chuckled, but it turned nervous. "Clown pockets? What if I trip on the balloons? Or the dog—invites say family pet included. Labs love red noses." 
Molly slapped the table, making the salt shaker jump. "Then you juggle, idiot! Entertain the mutt. Look, the party's at 2 p.m., peak naptime for the littlest ones. We case it from the alley first—binoculars, not that you need 'em with your hawk eyes for burrito stands." Bill sighed, folding the napkin into a lopsided hat. "Fine, but if I end up in a bouncy castle chase, you're buying the beers. And no singing 'Happy Birthday'—I sound like a cat in a blender." "Deal," she said, eyes sparkling. "This'll be our masterpiece. From party poopers to payday pros." As the waitress refilled their cups, Bill muttered, "Just hope the cake's not poisoned with kid cooties." Molly laughed. "Only thing poisonous is your sense of humor, partner. Let's rehearse the wiggle—clowns gotta waddle." Bill groaned, but stood, mimicking a ridiculous shuffle. "Toot-toot, incoming loot!"
created_at: "2026-02-20T19:14:36.221Z" (2/20/2026, 7:14:36 PM)
model: "x-ai/grok-4-fast"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2536
moderation_latency: (null)
generation_time: 11220
tokens_prompt: 77
tokens_completion: 744
native_tokens_prompt: 216
native_tokens_completion: 908
native_tokens_completion_images: 0
native_tokens_reasoning: 239
native_tokens_cached: 154
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0004741
router: (null)
provider_responses:
  0:
    endpoint_id: "c9586d9d-97ff-40ca-ae94-22504d6f9b7e"
    id: "e5c21ae9-e0c3-fd16-1d5f-cb3e9dca6cbd"
    is_byok: false
    latency: 530
    model_permaslug: "x-ai/grok-4-fast"
    provider_name: "xAI"
    status: 200
api_type: "completions"
id: "gen-1771614876-lZFmkpGhlwgC9r6rHyOk"
upstream_id: "e5c21ae9-e0c3-fd16-1d5f-cb3e9dca6cbd"
total_cost: 0.0004741
cache_discount: 0.0000231
upstream_inference_cost: 0
provider_name: "xAI"
Evaluation details

Result: 90.48%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 490

Result: 87.52%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 71.63%, Deviation: 1.63%
Meta Data: neededClean: false, wordsTotal: 490, wordsDialogue: 351

Overall: 89.0030%
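The ratio and deviation reported above follow directly from the word counts (351 dialogue words out of 490, against the 70% target in the prompt). The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. It is not the benchmark's actual evaluator: the whitespace tokenization, the straight-double-quote rule, and every function name are assumptions. The mapping from deviation to the 87.52% score is not documented on this page and is not modeled here. Averaging the two displayed evaluator scores gives 89.00%, close to the reported 89.0030% overall, which suggests (but does not confirm) that the overall figure is a mean of unrounded scores.

```python
import re

TARGET_RATIO = 0.70  # from the prompt: "70% words of dialogue"


def count_words(text: str) -> int:
    """Count whitespace-delimited tokens (assumed tokenization)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue rule)."""
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in quoted_spans)


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of all words that are dialogue."""
    return words_dialogue / words_total


# Figures taken from the evaluation details above.
ratio = dialogue_ratio(words_dialogue=351, words_total=490)
deviation = abs(ratio - TARGET_RATIO)

# ASSUMPTION: the overall score is the unweighted mean of the evaluator scores.
overall_estimate = (90.48 + 87.52) / 2

print(f"ratio: {ratio:.2%}, deviation: {deviation:.2%}")
print(f"overall (mean of displayed scores): {overall_estimate:.2f}%")
```

To apply the counters to a real response, pass the full text to `count_words` and `count_dialogue_words`. Results may differ from the reported 490 and 351 if the benchmark tokenizes or detects quotes differently.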