NC Bench
Run: 2026-02-20-01KHY86GFPCCQG6Z51SJ2FMNN2
Run Details
Model: Grok 4
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 28.9s
Cost: $0.015200
Provider: xAI
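The cost reported for this run can be cross-checked against the native token counts recorded in the run's metadata (744 prompt tokens, 743 of them cached, and 976 completion tokens). The sketch below is one possible reconstruction, not the billing logic itself: the three per-million-token prices are assumptions that do not appear anywhere on this page, and `estimate_cost` is a hypothetical helper name.

```python
from decimal import Decimal

# ASSUMPTION: prices in USD per million tokens. They are not stated on this
# page and are used here only to illustrate the arithmetic.
PRICE_INPUT = Decimal("3.00")
PRICE_CACHED_INPUT = Decimal("0.75")
PRICE_OUTPUT = Decimal("15.00")
MILLION = Decimal(1_000_000)


def estimate_cost(prompt_tokens: int, cached_tokens: int, completion_tokens: int):
    """Return (cost, cache_saving) in USD under the assumed prices above."""
    uncached_tokens = prompt_tokens - cached_tokens
    cost = (
        uncached_tokens * PRICE_INPUT
        + cached_tokens * PRICE_CACHED_INPUT
        + completion_tokens * PRICE_OUTPUT
    ) / MILLION
    # Saving relative to paying the full input price for the cached tokens.
    cache_saving = cached_tokens * (PRICE_INPUT - PRICE_CACHED_INPUT) / MILLION
    return cost, cache_saving


# Token counts taken from the run metadata (native_tokens_* fields).
cost, cache_saving = estimate_cost(
    prompt_tokens=744, cached_tokens=743, completion_tokens=976
)
print(f"cost={cost} cache_saving={cache_saving}")
```

Under these assumed rates the sketch gives 0.01520025 for the cost and 0.00167175 for the cache saving, which agree with the `usage` and `cache_discount` values in the metadata. `Decimal` is used instead of floats so the comparison with the reported figures is exact.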
Evaluation summary
Score      Evaluator
43.38%     Matches word count
0.00%      Dialogue to Total Word Ratio
21.6892%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly paced the cramped motel room, her wild red curls bouncing like springs. Bill lounged on the bed, fiddling with a crumpled map of the suburban neighborhood. The target: Little Timmy's eighth birthday bash at the McAllister mansion. Balloons, cake, and a pile of presents ripe for the taking. Their motto? Why earn when you can heist? "Alright, Bill, listen up," Molly said, jabbing a finger at the map. "The party's at noon. Kids everywhere, screaming like tiny banshees. We blend in, grab the gifts, and vanish before the piñata bursts." Bill chuckled, scratching his scruffy beard. "Blend in? Molly, you're five-foot-nothing with tattoos that scream 'ex-con.' And me? I look like I wrestle bears for fun. How do we not stick out like sore thumbs at a kiddie party?" "Disguises, genius! I'll go as a clown. Big shoes, red nose—the works. You? A magician. Top hat, cape. We'll dazzle 'em with tricks while we swipe the swag." "A magician? Molly, my only trick is making beer disappear. What if they ask for a show? I'll pull a rabbit out of my hat and it'll bite some kid's finger off!" Molly rolled her eyes. "Improvise! Saw a lady in half—use your ex-wife's photo. Now, entry point: The backyard gate's unlocked during setup. We slip in with the caterers. I overheard the mom's on the phone—vegan cupcakes and gluten-free pizza. Who tortures kids like that?" Bill nodded, eyes lighting up. "Okay, solid. Once inside, the presents are stacked by the bounce house. I'll create a diversion—maybe 'accidentally' pop a balloon animal. Pop! Chaos ensues. Kids cry, parents freak, and you scoop the loot into our giant gift bag disguised as a prop." "A diversion? Bill, last time you 'diverted,' you set off the sprinklers and we ended up soaked like drowned rats. Remember the museum job? You tripped the alarm trying to moonwalk out." "That was art! This is kid stuff. Easy peasy. What's the haul? Video games, bikes, maybe some cash envelopes from rich uncles. We'll fence it all at pawn shops. 
Retire to Mexico, sipping margaritas." Molly smirked. "Mexico? With our luck, we'd rob a piñata factory and get chased by donkeys. Fine, but getaway plan: We park the van as a 'party rental' truck. Load up, drive off casual-like. No squealing tires—act like we're delivering more cake." Bill leaned forward. "What about security? Rich folks got cameras. Dogs?" "No dogs—kid's allergic. Cameras? Hack 'em with that gadget you stole. Loop the footage of happy families. Boom, invisible thieves." "You're a maniac, Molly. But I love it. This could be our big score. Stealing from babies—poetic justice after all those Christmases we got coal." Molly laughed. "Babies? These are spoiled brats. We're the real heroes. redistribution of wealth, heist-style. Now, let's rehearse. You be the magician: 'Abracadabra, presents be gone!'" Bill waved an imaginary wand. "Poof! And if it goes south? Plan B: Cry like kids and blame the clown." They high-fived, grins wide. The heist was on—absurd, risky, and utterly hilarious.
created_at: "2026-02-20T19:25:02.851Z"
model: "x-ai/grok-4-07-09"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 7522
moderation_latency: (null)
generation_time: 28798
tokens_prompt: 77
tokens_completion: 787
native_tokens_prompt: 744
native_tokens_completion: 976
native_tokens_completion_images: 0
native_tokens_reasoning: 245
native_tokens_cached: 743
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.01520025
router: (null)
provider_responses:
  0:
    endpoint_id: "d40bef9d-6936-4c57-9749-bf7399de4a77"
    id: "6155b2ad-52df-e38e-e754-9c095808ec95"
    is_byok: false
    latency: 185
    model_permaslug: "x-ai/grok-4-07-09"
    provider_name: "xAI"
    status: 200
api_type: "completions"
id: "gen-1771615502-oX0emggsjXjg7EMB0Wye"
upstream_id: "6155b2ad-52df-e38e-e754-9c095808ec95"
total_cost: 0.01520025
cache_discount: 0.00167175
upstream_inference_cost: 0
provider_name: "xAI"
Evaluation details

Result: 43.38%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 517

Result: 0.00%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 79.50%, Deviation: 29.50%
Meta Data: neededClean: false, wordsTotal: 517, wordsDialogue: 411

Overall: 21.6892%
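The ratio and deviation figures reported by the dialogue evaluator can be reproduced from the two word counts. The sketch below rests on several assumptions that this page does not confirm: that dialogue means text between straight double quotes, that the deviation is the absolute distance from the 50% target in the scenario, and that the overall score is the unweighted mean of the two evaluator scores. All function names are hypothetical, and the sketch does not attempt to explain how either evaluator turns its measurements into a score.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (an assumed definition of dialogue)."""
    return sum(count_words(span) for span in re.findall(r'"([^"]*)"', text))


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return the dialogue share and its absolute distance from the target share."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Word counts taken from the evaluator metadata: wordsDialogue and wordsTotal.
ratio, deviation = ratio_and_deviation(words_dialogue=411, words_total=517)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# ASSUMPTION: the overall score is the plain mean of the two evaluator scores.
overall = (43.38 + 0.00) / 2
print(f"Overall: {overall:.2f}%")

# Tiny usage example for the counting helpers on a made-up sentence.
sample = '"Hi there," she said. "Go."'
print(count_dialogue_words(sample), count_words(sample))
```

With 411 dialogue words out of 517, the share comes to about 79.5%, roughly 29.5 points above the requested 50%. The mean of the two rounded scores is 21.69%, consistent with the reported overall of 21.6892% up to rounding of the word-count score.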