NC Bench
Run: 2026-02-20-01KHY89S4G9WJ9KMBCNHD2334V
Run Details
Model: Grok 4.1 Fast
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 15.7s
Cost: $0.000508
Provider: xAI
Evaluation summary
Score      Evaluator
0.00%      Matches word count
96.20%     Dialogue to Total Word Ratio
48.0995%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly hunched over the kitchen table in their cramped apartment, a crumpled party invitation spread out like a treasure map. Pinatas, balloons, and a scrawled "Happy 7th Birthday, Timmy!" stared back at her. Bill paced, gnawing on a stale pretzel, his eyes darting nervously. "Bill, this is gold," Molly said, slapping the invite. "A kids' party at the McMansion on Elm Street. Rich parents, spoiled brat—presents piled high. We're talkin' iPads, bikes, Lego castles. We swipe 'em all, fence for thousands." Bill stopped pacing, crumbs tumbling from his mustache. "Kids' party? Molly, those rugrats are feral. They'll swarm us like piranhas. And the parents? Helicopter moms with wine glasses sharper than switchblades." She grinned, pulling out a clown wig from a duffel bag. "That's why we go undercover. I'm Bouncy the Clown—juggle, honk nose, distract the wee beasts. You? Superhero sidekick. Cape, mask, the works. We crash at 2 PM, peak chaos: cake smashing, piñata bashing." Bill snorted, trying on a rubber dinosaur mask. It muffled his voice. "D-dinosaur? I look like a rejected Muppet. What if Timmy wants a piggyback? Or worse, his dad grills me on the League of Justice lore?" "Improvise!" Molly barked, sketching on a napkin. "Entry: back gate by the bounce house—cut the chain with bolt cutters disguised as party scissors. Phase one: balloon animals filled with knockout gas? Nah, too risky. Just spike the punch with sleepy-time herbs." Bill ripped off the mask. "Herbs? We're thieves, not witches! Kids puke rainbows, parents call cops. Real plan: I rig the sound system to blast 'Baby Shark' on loop—total meltdown. While they're covering ears, you stuff presents in a giant gift bag labeled 'Party Favors for All!' Genius." Molly cackled, high-fiving him. "Yes! Exit via the bouncy castle—inflate it massive, create a diversion bounce riot. We tumble out the side flap into the van, presents bouncing like happy loot." "But the dog," Bill groaned, pointing at the invite's doodle. 
"Fluffy the Guard Terrier. It'll yap apocalypse." "Doggy treats laced with—" "No more drugs!" Bill yelped. "Peanut butter on the muzzle. Simple. Then what? Pawn shop says no kids' stuff—too traceable." Molly winked, unveiling a stack of fake labels. "Relabel as 'Estate Sale Finds.' Bikes become 'Vintage Cycles.' Done. We're in, out in 20 minutes. Richest score since that bakery caper." Bill grinned finally, donning the cape. "Fine. But if a kid barfs on me, you're driving with puke headlights." "Deal," she laughed. "To Timmy's doom!" They clinked beer cans, blueprints fluttering like confetti. Outside, kids' laughter echoed from the street—practice for the heist of the century. Or hilarious disaster.
created_at: "2026-02-20T19:26:50.011Z"
model: "x-ai/grok-4.1-fast"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1956
moderation_latency: (null)
generation_time: 15611
tokens_prompt: 77
tokens_completion: 777
native_tokens_prompt: 216
native_tokens_completion: 994
native_tokens_completion_images: 0
native_tokens_reasoning: 335
native_tokens_cached: 215
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.00050795
router: (null)
provider_responses:
  0:
    endpoint_id: "9821aa0e-dce3-4c27-9716-161359123d0b"
    id: "cf553163-59fa-dca8-3e3b-8b47ea5be13e"
    is_byok: false
    latency: 1769
    model_permaslug: "x-ai/grok-4.1-fast"
    provider_name: "xAI"
    status: 200
api_type: "completions"
id: "gen-1771615610-WQTKes6pBdkzzYvFOZow"
upstream_id: "cf553163-59fa-dca8-3e3b-8b47ea5be13e"
total_cost: 0.00050795
cache_discount: 0.00003225
upstream_inference_cost: 0
provider_name: "xAI"
Evaluation details
Result: 0.00%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 443

Result: 96.20%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 70.88%, Deviation: 0.88%
Meta Data: neededClean: false, wordsTotal: 443, wordsDialogue: 314

Overall: 48.0995%
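The reported ratio and deviation can be reproduced from the word counts in the evaluation metadata. The sketch below is illustrative only and is not NC Bench's evaluator code. The function names are hypothetical. Treating text between straight double quotes as dialogue is an assumption. So is taking the overall score as the plain mean of the two evaluator scores, although that reading is consistent with the figures shown. The formula that turns a 0.88% deviation into a 96.20% score is not given on this page, so it is not reproduced here.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: "Write 500 words with 70% dialogue"


def dialogue_stats(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words).

    Assumption: dialogue is whatever sits between straight double quotes,
    and words are whitespace-separated tokens.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    return dialogue_words, total_words


def ratio_and_deviation(dialogue_words: int, total_words: int,
                        target: float = TARGET_RATIO) -> tuple[float, float]:
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = dialogue_words / total_words
    return ratio, abs(ratio - target)


# Word counts reported in the evaluation metadata for this run.
ratio, deviation = ratio_and_deviation(314, 443)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Assumed aggregation: unweighted mean of the two displayed evaluator scores.
overall = (0.00 + 96.20) / 2
print(f"Overall (from displayed scores): {overall:.2f}%")
```

Averaging the displayed scores gives 48.10%, which agrees with the reported 48.0995% to two decimal places. The small gap is presumably because the 96.20% shown is itself rounded.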