NC Bench
Run: 2026-02-20-01KHY8BH7EZ2C9Q5Y4FM18Q5XD
Run Details
Model: Grok 4
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 27.4s
Cost: $0.014587
Provider: xAI
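The reported cost can be cross-checked against the native token counts recorded in this run's metadata (744 prompt tokens, 682 of them cached, and 926 completion tokens). The sketch below is only a rough reconstruction: the per-token prices are assumptions that do not appear on this page, and `estimate_cost` is a hypothetical helper, not part of any provider API.

```python
import math

# ASSUMED prices in USD per token; they are not stated on this page.
PRICE_INPUT = 3.00 / 1_000_000          # uncached prompt tokens
PRICE_CACHED_INPUT = 0.75 / 1_000_000   # cached prompt tokens
PRICE_OUTPUT = 15.00 / 1_000_000        # completion tokens (incl. reasoning)


def estimate_cost(prompt_tokens, cached_tokens, completion_tokens):
    """Return (total_cost, cache_discount) under the assumed prices."""
    uncached_tokens = prompt_tokens - cached_tokens
    total = (
        uncached_tokens * PRICE_INPUT
        + cached_tokens * PRICE_CACHED_INPUT
        + completion_tokens * PRICE_OUTPUT
    )
    # Savings from billing cached tokens at the lower rate.
    discount = cached_tokens * (PRICE_INPUT - PRICE_CACHED_INPUT)
    return total, discount


# Token counts taken from native_tokens_prompt, native_tokens_cached
# and native_tokens_completion in the run metadata.
cost, discount = estimate_cost(744, 682, 926)
print(f"total_cost={cost:.7f} cache_discount={discount:.7f}")
```

Under these assumed prices the result agrees with the `total_cost` and `cache_discount` values in the metadata, which suggests, but does not prove, that the run was billed this way.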
Evaluation summary
Score | Evaluator
60.28% | Matches word count
0.00% | Dialogue to Total Word Ratio
30.1376% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her binoculars, peering through the grimy window of their rundown van parked across from the bustling suburban house. Balloons bobbed in the breeze, and the distant squeals of excited kids mixed with the chaotic tunes of a hired clown's horn. Bill, slouched in the driver's seat with a half-eaten donut in hand, scratched his scruffy beard and grinned. "This is genius, Molly. A kids' birthday bash? Presents piled higher than my gambling debts. We'll be in and out like ghosts at a seance." Molly rolled her eyes, lowering the binoculars. "Genius? Bill, it's a clown-themed apocalypse in there. One wrong move, and we're dodging dodgeballs from hyperactive eight-year-olds. You really think we can pull this off without ending up covered in cake?" Bill chuckled, crumbs flying. "Cake's a bonus! Look, the plan's solid. We wait till the piñata breaks—total distraction. Parents are busy herding sugar-rushed monsters, clown's juggling fire or whatever. We slip in through the back door, grab the gift table, and vanish. Easy peasy." She smirked, folding her arms. "Easy? You've seen the guest list? It's like a PTA convention. And what about the dog? That fluffy beast looks like it could sniff out a felony from a mile away." "Ah, the mutt. That's why I brought these." Bill rummaged in his duffel bag and pulled out a handful of bacon-scented chew toys. "Distraction numero dos. Toss one, and Fido's our best friend. Or at least too busy to bark." Molly leaned back, considering. "Okay, fine. But disguises? We can't just waltz in looking like escaped convicts." Bill's eyes lit up. "Disguises! I've got us covered. You're going as a balloon artist—big floppy hat, polka-dot dress. I'll be the pizza delivery guy. Who suspects pizza at a party? It's camouflage gold." She burst out laughing. "Pizza? Bill, it's a birthday party, not a frat house. They'll think we're crashing for free cake. And what if the real pizza shows up?" "Then we improvise! 
'Oh, sorry folks, double order—here's your complimentary sausage supreme.' Boom, we're in." He mimed tossing an invisible pizza box. Molly shook her head, fighting a smile. "You're insane. But alright, let's map this out. Back door leads to the kitchen—gifts are in the living room, right by the window. We need a five-minute window. You handle the parents, I'll bag the loot." "Parents? With what, knock-knock jokes?" Bill raised an eyebrow. "No, dummy. Flirt with the moms, distract the dads with sports talk. You're charming when you're not burping." Bill puffed out his chest. "Charming? Baby, I'm a regular Casanova. Watch me work." She punched his arm playfully. "Just don't get carried away. Last time you 'charmed' someone, we almost got arrested for bad karaoke." "Fair point. But this haul? Dolls, bikes, video games—we fence 'em quick, retire to Barbados." Molly nodded, excitement bubbling. "Barbados it is. Alright, partner, let's make these kids' worst nightmare our payday." As the piñata swung in the distance, they shared a conspiratorial wink, engines of mischief revving silently.
created_at
"2026-02-20T19:27:47.45Z"
2/20/2026, 7:27:47 PM
model
"x-ai/grok-4-07-09"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
5886
moderation_latency
(null)
generation_time
27315
tokens_prompt
77
tokens_completion
802
native_tokens_prompt
744
native_tokens_completion
926
native_tokens_completion_images
0
native_tokens_reasoning
220
native_tokens_cached
682
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0145875
router
(null)
provider_responses
0
endpoint_id
"d40bef9d-6936-4c57-9749-bf7399de4a77"
id
"37a8105e-8014-601b-570c-4091214a41ec"
is_byok
false
latency
763
model_permaslug
"x-ai/grok-4-07-09"
provider_name
"xAI"
status
200
api_type
"completions"
id
"gen-1771615667-WxhW4R5Sc6yN9HKjwwGm"
upstream_id
"37a8105e-8014-601b-570c-4091214a41ec"
total_cost
0.0145875
cache_discount
0.0015345
upstream_inference_cost
0
provider_name
"xAI"
Evaluation details
Result | Evaluator | Details | Meta Data
60.28% | Matches word count | n/a | neededClean: false, words: 515
0.00% | Dialogue to Total Word Ratio | Ratio: 70.10%, Deviation: 40.10% | neededClean: false, wordsTotal: 515, wordsDialogue: 361
30.1376% | Overall
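The dialogue figures above follow directly from the reported word counts. The sketch below reproduces the ratio and deviation from `wordsDialogue` and `wordsTotal`. The 30% target comes from the scenario; `dialogue_metrics` is a hypothetical helper, and treating the overall score as a plain average of the two evaluator scores is an assumption that happens to match the numbers shown, not a documented formula. How the evaluator maps a deviation to a score is not stated here, so it is not modeled.

```python
def dialogue_metrics(words_dialogue, words_total, target_ratio=0.30):
    """Return (ratio, deviation) of dialogue words against a target share."""
    ratio = words_dialogue / words_total
    deviation = abs(ratio - target_ratio)
    return ratio, deviation


# Counts reported by the evaluator for this run.
ratio, deviation = dialogue_metrics(words_dialogue=361, words_total=515)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# ASSUMPTION: overall = mean of the evaluator scores. The word count score
# is shown rounded as 60.28%; 60.2752 is the unrounded value implied by
# the overall figure.
scores = [60.2752, 0.0]
overall = sum(scores) / len(scores)
print(f"Overall: {overall:.4f}%")
```

At roughly 70% dialogue against a 30% target, the response overshoots by about 40 percentage points, which is consistent with the 0.00% score for that evaluator.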