NC Bench
Run: 2026-02-20-01KHY9ZAQ8YHKBGCJVJ4HPDY69
Run Details
Model: Grok 4
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 38.5s
Cost: $0.014920
Provider: xAI
Evaluation summary
Score | Evaluator
0.00% | Matches word count
57.61% | Dialogue to Total Word Ratio
28.8066% | Overall
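The overall figure looks consistent with an unweighted mean of the two evaluator scores. That aggregation rule is an assumption; this page does not state how the overall score is computed. A minimal check in Python, using the rounded scores shown above:

```python
# Evaluator scores as displayed in the summary (percent, already rounded).
scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 57.61,
}

# Assumption: the overall score is the plain average of the evaluator scores.
overall = sum(scores.values()) / len(scores)

reported_overall = 28.8066  # value shown on the page
print(f"computed: {overall:.4f}%  reported: {reported_overall:.4f}%")
```

The computed value differs from the reported one only in the third decimal place, which is what one would expect if the page averages unrounded scores and displays the dialogue score rounded to two decimals.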
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
In the dimly lit backroom of a greasy diner, Molly and Bill hunched over a crumpled napkin sketch of the McAllister mansion. Molly, with her wild red curls and a perpetual smirk, jabbed at the drawing with a french fry. Bill, lanky and nervous, adjusted his fake mustache that kept slipping. "Alright, Bill, listen up," Molly said, her voice a conspiratorial whisper. "The party's tomorrow at noon. Kids everywhere, balloons, cake—the perfect cover. We slip in as clowns. I've got the costumes: big shoes, red noses, the works." "Clowns? Molly, you know I hate clowns," Bill whined, fidgeting with his coffee mug. "Last time I dressed as one, that kid pulled my wig off and screamed bloody murder. Thought I was a monster." Molly rolled her eyes. "That's because you are a monster without coffee. Focus! The presents are piled in the living room, under that giant Christmas tree—even though it's July. Rich folks and their themed parties. We grab 'em during the piñata smash. Chaos ensues, kids swinging bats like tiny lunatics." Bill leaned in, eyes wide. "Piñata? What if one of those bats clocks me? I'm not built for dodgeball with sugar-high munchkins." "Relax, you big baby," Molly chuckled, sketching a stick figure dodging candy. "You'll be the distraction. Honk your nose, juggle some balls—metaphorical ones—and I'll load the sacks. I've rigged a clown car outside. Tiny on the outside, but it'll hold a mountain of toys." Bill snorted. "A clown car? Where'd you get that? And sacks? We're stealing presents, not Santa's rejects." "eBay, genius," Molly shot back. "And yes, sacks—festive ones, red with white trim. Blend right in. Now, the escape: We bolt when the birthday song starts. Everyone's singing, no one's watching the gift pile." "But what about the parents?" Bill asked, voice rising. "They're not blind. One might spot us and call the cops. 'Officer, two clowns just nicked the Nerf guns!'" Molly grinned wickedly. "That's why we plant the itching powder in the punch. 
Non-lethal, just enough to make 'em scratch like they've got fleas. While they're dancing the itch jig, we're out the door." "Itching powder? Molly, you're diabolical," Bill laughed, slapping the table. "But what if a kid drinks it? We can't have allergic reactions on our conscience." "Bill, you worrywart," she sighed. "It's adult punch only—spiked with boredom. Kids get juice. Now, contingencies: Guard dog? Nope, they have a goldfish. Alarms? Bypassed with my gadget—looks like a whoopee cushion, works like a charm." Bill nodded slowly. "Okay, but promise me no balloon animals. Last heist, I twisted a giraffe that looked like a sausage dog. Kid cried for hours." "Fine, no balloons," Molly agreed, crumpling the napkin. "Just presents. We're in, out, and rich by sunset. Think of it: Selling those toys online. We'll be legends—the Party Poachers!" Bill raised his mug. "To the heist of the century. Or at least the birthday." Molly clinked her fry against it. "Cheers. Now, practice your clown laugh. Make it creepy but not too creepy." As Bill honked out a ridiculous guffaw, the diner patrons stared, but the duo was already lost in their plot, dreaming of pilfered presents.
created_at: "2026-02-20T19:56:04.727Z" (2/20/2026, 7:56:04 PM)
model: "x-ai/grok-4-07-09"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 7550
moderation_latency: (null)
generation_time: 38396
tokens_prompt: 77
tokens_completion: 830
native_tokens_prompt: 744
native_tokens_completion: 948
native_tokens_completion_images: 0
native_tokens_reasoning: 193
native_tokens_cached: 681
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.01491975
router: (null)
provider_responses:
  0:
    endpoint_id: "d40bef9d-6936-4c57-9749-bf7399de4a77"
    id: "563e4118-76a7-0cdd-430e-cc8f951e3daa"
    is_byok: false
    latency: 68
    model_permaslug: "x-ai/grok-4-07-09"
    provider_name: "xAI"
    status: 200
api_type: "completions"
id: "gen-1771617364-DTnUsVDZEckuP0aV8g9Y"
upstream_id: "563e4118-76a7-0cdd-430e-cc8f951e3daa"
total_cost: 0.01491975
cache_discount: 0.00153225
upstream_inference_cost: 0
provider_name: "xAI"
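The cost fields can be reconciled with the native token counts. The sketch below is a hedged illustration: the per-token rates are assumptions (they are not stated anywhere on this page), and the function names are hypothetical. With these assumed rates the arithmetic reproduces the reported `total_cost` and `cache_discount`.

```python
# Assumed USD rates per token; NOT taken from this page.
PROMPT_RATE = 3.00 / 1_000_000       # uncached prompt tokens (assumption)
CACHED_RATE = 0.75 / 1_000_000       # cached prompt tokens (assumption)
COMPLETION_RATE = 15.00 / 1_000_000  # completion tokens, reasoning included (assumption)


def estimate_cost(prompt_tokens: int, cached_tokens: int, completion_tokens: int) -> float:
    """Bill cached and uncached prompt tokens at different rates, then add completion."""
    uncached = prompt_tokens - cached_tokens
    return (
        uncached * PROMPT_RATE
        + cached_tokens * CACHED_RATE
        + completion_tokens * COMPLETION_RATE
    )


def cache_discount(cached_tokens: int) -> float:
    """Saving from billing cached tokens at the cached rate instead of the full prompt rate."""
    return cached_tokens * (PROMPT_RATE - CACHED_RATE)


# Values from the metadata: native_tokens_prompt, native_tokens_cached,
# native_tokens_completion.
cost = estimate_cost(prompt_tokens=744, cached_tokens=681, completion_tokens=948)
discount = cache_discount(cached_tokens=681)
print(f"cost={cost:.8f} discount={discount:.8f}")
```

Note that this treats `native_tokens_completion` (948) as already including the 193 reasoning tokens; that reading is also an assumption.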
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 536
57.61% | Dialogue to Total Word Ratio | Ratio: 73.32%, Deviation: 3.32% | neededClean: false, wordsTotal: 536, wordsDialogue: 393
28.8066% | Overall
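The ratio and deviation in the details follow directly from the word counts in the metadata: 393 dialogue words out of 536 total, measured against the 70% target from the scenario. The sketch below checks that arithmetic. It also includes a hypothetical `count_dialogue_words` helper showing one plausible way to count quoted words; the evaluator's actual counting method, and the rule that turns a 3.32-point deviation into a 57.61% score, are not given on this page and are not reproduced here.

```python
import re


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words) for text using straight double quotes.

    Hypothetical helper: words are whitespace-separated tokens, and dialogue is
    anything between a pair of straight double quotes. Curly quotes are ignored.
    """
    total = len(text.split())
    spoken = sum(len(chunk.split()) for chunk in re.findall(r'"([^"]*)"', text))
    return spoken, total


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.70):
    """Dialogue share of all words, and its absolute distance from the target share."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts reported in the evaluation metadata.
ratio, deviation = ratio_and_deviation(words_dialogue=393, words_total=536)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 73.32%, Deviation: 3.32%"

# Small usage example for the hypothetical counter.
sample = 'Molly grinned. "Grab the sacks, Bill," she said.'
print(count_dialogue_words(sample))  # (4, 8)
```

The same counts also show why the word-count evaluator scored 0.00%: the response has 536 words against a requested 500, although the tolerance that evaluator applies is not stated.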