NC Bench
Run: 2026-02-20-01KHY7W5TJ6AJYPBQ1JVABYB78
Run Details
Model: Grok 4
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 34.0s
Cost: $0.014920
Provider: xAI
Evaluation summary
Score      Evaluator
60.28%     Matches word count
0.00%      Dialogue to Total Word Ratio
30.1376%   Overall
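The overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state how the overall score is computed, so the averaging rule below is an assumption, and the variable names are illustrative only. A minimal sketch of the arithmetic:

```python
# Scores as displayed in the evaluation summary (percent).
scores = {
    "Matches word count": 60.28,
    "Dialogue to Total Word Ratio": 0.00,
}
overall_reported = 30.1376

# ASSUMPTION: overall = unweighted mean of the evaluator scores.
mean_of_displayed = sum(scores.values()) / len(scores)

# Under that assumption, the unrounded word-count score implied by the
# reported overall value is overall * 2 - (dialogue ratio score).
implied_word_count_score = (
    overall_reported * len(scores) - scores["Dialogue to Total Word Ratio"]
)

print(f"mean of displayed scores: {mean_of_displayed:.4f}")
print(f"implied word-count score: {implied_word_count_score:.4f}")
```

The small gap between the mean of the displayed scores and the reported overall value would be explained by the word-count score being rounded to two decimals for display.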
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly slumped into the creaky armchair of their rundown apartment, a crumpled blueprint of the suburban McMansion spread across the coffee table. Bill paced back and forth, munching on a stale donut, his eyes gleaming with that manic spark that always preceded their wildest schemes. "Alright, Molly, listen up," Bill said, wiping powdered sugar from his chin. "This kids' party is our golden ticket. Little Timmy's turning eight, and his parents are loaded. Presents piled high—video games, bikes, maybe even that new drone I've been eyeing. We swoop in, grab 'em all, and vanish like ghosts." Molly arched an eyebrow, sipping her lukewarm coffee. "Ghosts? At a children's party? Bill, have you lost your marbles? Those brats will be everywhere, hyped on sugar and screaming. How do we even get inside without looking like total creeps?" Bill grinned, holding up a clown wig he'd pilfered from a thrift store. "Disguises, baby! We go as entertainers. I'll be Bongo the Clown, juggling pins while you, uh, Magic Molly, pull rabbits out of hats. Distract the kids, then bam—straight to the gift table." "Rabbits? Where am I supposed to get a rabbit? And clowns? Kids hate clowns these days. One wrong honk, and we're mobbed by toddlers with water guns." Molly leaned forward, poking at the blueprint. "Look, the party's in the backyard. Tent, bouncy house, the works. Entrance through the side gate. But security? Those rich folks probably have cameras everywhere." Bill waved her off. "Cameras? Pfft. I've got my jammer gadget from that last job. Zaps 'em dead for fifteen minutes. We time it right—during the cake-cutting chaos. Kids are distracted, parents are tipsy on punch. You sneak the presents into black trash bags, label 'em as 'party trash,' and I wheel 'em out to the van parked down the block." Molly chuckled, shaking her head. "Party trash? Genius, until some nosy neighbor spots us. 'Hey, why's the clown hauling garbage?' And what about the dog? 
Blueprint shows a guard dog—big one, probably slobbers more than you after a beer." "Easy peasy," Bill shot back. "I bribe it with steaks. Dogs love me. Remember that heist at the vet's? Fido was my best pal by the end." "Yeah, until he bit your ankle," Molly retorted, smirking. "Fine, steaks it is. But contingency plan? If things go south, what's our exit?" Bill struck a dramatic pose. "The bouncy house! We dive in, bounce our way to the fence, flip over like acrobats. Kids think it's part of the show. 'Yay, clown Olympics!'" Molly burst out laughing. "You're insane, Bill. Clown Olympics? Alright, I'm in. But if we get caught, you're explaining to the cops why we're dressed like rejects from a circus nightmare." "Deal," Bill said, high-fiving her. "This'll be the heist of the century. Who knew robbing kids could be so fun?" As they huddled over the blueprint, plotting every ridiculous detail, the apartment filled with their scheming whispers and muffled giggles. The plan was absurd, but in their line of work, absurdity was just another word for success.
created_at: "2026-02-20T19:19:24.257Z" (2/20/2026, 7:19:24 PM)
model: "x-ai/grok-4-07-09"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 9474
moderation_latency: (null)
generation_time: 33935
tokens_prompt: 77
tokens_completion: 783
native_tokens_prompt: 744
native_tokens_completion: 948
native_tokens_completion_images: 0
native_tokens_reasoning: 239
native_tokens_cached: 681
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.01491975
router: (null)
provider_responses
0
endpoint_id: "d40bef9d-6936-4c57-9749-bf7399de4a77"
id: "74b3ec26-0f8e-007c-747e-fcedef781a8a"
is_byok: false
latency: 76
model_permaslug: "x-ai/grok-4-07-09"
provider_name: "xAI"
status: 200
api_type: "completions"
id: "gen-1771615164-O2xHEePl1ibNoo6P5Qdd"
upstream_id: "74b3ec26-0f8e-007c-747e-fcedef781a8a"
total_cost: 0.01491975
cache_discount: 0.00153225
upstream_inference_cost: 0
provider_name: "xAI"
Evaluation details
Result     Evaluator                      Details                            Meta Data
60.28%     Matches word count             n/a                                neededClean: false, words: 515
0.00%      Dialogue to Total Word Ratio   Ratio: 71.46%, Deviation: 21.46%   neededClean: false, wordsTotal: 515, wordsDialogue: 368
30.1376%   Overall
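The reported ratio and deviation can be reproduced from the word counts in the meta data. The sketch below only checks that arithmetic; the 50% target comes from the scenario, while the mapping from deviation to the 0.00% score is not given on this page and is not modeled here. Variable names are illustrative.

```python
# Counts reported in the evaluator meta data.
words_total = 515
words_dialogue = 368

# Target from the scenario: 50% of words should be dialogue.
target_ratio = 0.50

ratio = words_dialogue / words_total
deviation = abs(ratio - target_ratio)

# Formats both values as percentages with two decimals.
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

Run as written, this prints the same ratio and deviation shown in the Details column for the dialogue evaluator.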