NC Bench
Run: 2026-02-20-01KHY99NG3B0EWV3JH4FKFCBBX
Run Details
Model: Grok 4
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 37.6s
Cost: $0.015577
Provider: xAI
Evaluation summary
Score     Evaluator
0.08%     Matches word count
0.00%     Dialogue to Total Word Ratio
0.0424%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
AI
Molly paced the cramped motel room, her wild red hair tied back in a ponytail that screamed "disguise fail." Bill slouched on the bed, munching on a stale doughnut, his round belly jiggling with every chew. Maps and crumpled notes littered the floor, detailing their latest scheme: crashing a kid's birthday bash at the swanky McMansion on Elm Street to swipe every last present.

"Alright, Bill, listen up," Molly said, jabbing a finger at the blueprint she'd sketched on a napkin. "The party's tomorrow at noon. Balloons, cake, the works. We go in as clowns—big shoes, red noses, the whole shebang. Nobody suspects clowns at a kid's party."

Bill snorted, spraying doughnut crumbs. "Clowns? Molly, you know I look like a balloon animal gone wrong in that getup. Last time I dressed as one, the kids threw pies at me. Real pies! What if they recognize us?"

"Recognize us? We're master thieves, not celebrities! Besides, your beer gut makes a perfect pillow for the getaway bag. We'll stuff the presents in there—voila, instant Santa belly."

Bill wiped his mouth with his sleeve. "Santa? It's July! These rich folks are throwing a 'summer solstice soiree' for little Timmy or whatever. Presents galore—video games, bikes, that new drone toy. But clowns? Kids hate clowns these days. Too creepy. What if we go as magicians? I could pull a rabbit out of a hat while you nab the loot."

Molly rolled her eyes. "Magicians? You'd probably pull out a parking ticket instead. No, clowns it is. We arrive, do a little routine: 'Hey kids, who wants to see a magic trick?' Then bam—smoke bomb distraction. I grab the gift table, you herd the parents with balloon animals. Easy peasy."

"Smoke bomb? Last time you used one, we ended up in a laundromat coughing up glitter for a week! And herding parents? They're not sheep, Mol. What if one of 'em's a cop? Or worse, a karate mom?"

"Karate mom? Bill, you're paranoid. Look, the house has a back entrance through the garden. We park the van nearby, disguised as an ice cream truck. Who doesn't love ice cream at a party?"

Bill grinned slyly. "Ice cream truck? Now you're talking. I could eat a cone while we wait. But seriously, what about the dog? Blueprints say they've got a guard poodle named Fluffy. One bark and we're toast."

"Fluffy? Please. We'll bribe it with treats. Dogs love thieves— we're basically walking snack machines. Okay, timeline: Noon arrival, 12:15 distraction, 12:30 getaway with sacks of swag. Split the profits 50-50, minus my finder's fee."

"Finder's fee? You always pull that! Fine, but if this flops and we end up juggling for pennies, it's on you."

Molly laughed, slapping his back. "Deal. Now, practice your clown laugh. Ho ho ho—no, that's Santa again. Make it goofy!"

Bill cleared his throat. "Hee hee hee! How's that?"

"Like a hyena on helium. Perfect!"

As they rehearsed, the room echoed with their bickering, the absurdity of their plan hanging like a piñata ready to burst. Little did they know, the real heist was staying out of jail with ideas this half-baked.
Meta
created_at
"2026-02-20T19:44:14.862Z"
2/20/2026, 7:44:14 PM
model
"x-ai/grok-4-07-09"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
7377
moderation_latency
(null)
generation_time
37546
tokens_prompt
77
tokens_completion
788
native_tokens_prompt
744
native_tokens_completion
992
native_tokens_completion_images
0
native_tokens_reasoning
251
native_tokens_cached
682
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0155775
router
(null)
provider_responses
0
endpoint_id
"d40bef9d-6936-4c57-9749-bf7399de4a77"
id
"e92aeabf-dafa-c150-ab24-d8339c5e3eed"
is_byok
false
latency
101
model_permaslug
"x-ai/grok-4-07-09"
provider_name
"xAI"
status
200
api_type
"completions"
id
"gen-1771616654-uWUcQ0Txa9wlSogCx9Xi"
upstream_id
"e92aeabf-dafa-c150-ab24-d8339c5e3eed"
total_cost
0.0155775
cache_discount
0.0015345
upstream_inference_cost
0
provider_name
"xAI"
Evaluation details
Result    Evaluator                      Details                            Meta Data
0.08%     Matches word count             n/a                                neededClean: false; words: 529
0.00%     Dialogue to Total Word Ratio   Ratio: 73.35%, Deviation: 23.35%   neededClean: false; wordsTotal: 529; wordsDialogue: 388
0.0424%   Overall
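The ratio and deviation reported for the dialogue evaluator follow directly from the two word counts in its metadata. A minimal sketch of that arithmetic, assuming the target is the 50% stated in the scenario and that deviation is the absolute difference in percentage points (the function names are illustrative, and how the evaluator converts the deviation into its final score is not shown on this page):

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return the share of dialogue words as a percentage of all words."""
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float = 50.0) -> float:
    """Return the absolute distance from the target, in percentage points."""
    return abs(ratio_pct - target_pct)


# Values from the evaluator metadata: wordsDialogue=388, wordsTotal=529
ratio = dialogue_ratio(388, 529)
deviation = deviation_from_target(ratio)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# Ratio: 73.35%, Deviation: 23.35%
```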