NC Bench
Run: 2026-02-20-01KHYB9XS0NV5KFHGPW6Y4MTTA
Run Details
Model: Grok 4.1 Fast
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 9.9s
Cost: $0.000548
Provider: xAI
Evaluation summary
| Score | Evaluator |
| --- | --- |
| 0.00% | Matches word count |
| 61.12% | Dialogue to Total Word Ratio |
| 30.5623% | Overall |
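
The Overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so the sketch below treats it as an assumption; `overall_score` is a hypothetical helper, not part of NC Bench.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores, in percent (ASSUMED aggregation rule)."""
    return sum(scores) / len(scores)


# Scores as displayed in the evaluation summary, in percent.
displayed = [0.00, 61.12]
overall = overall_score(displayed)
print(f"{overall:.2f}%")

# Working backwards under the same assumption: the displayed overall of
# 30.5623% implies an unrounded ratio score of about 61.1246%.
implied_ratio_score = 2 * 30.5623 - 0.00
print(f"{implied_ratio_score:.4f}%")
```

Under this reading, the 0.00% word-count result halves whatever the dialogue-ratio evaluator awards.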
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill squeezed into the back of their rusty van, parked across from the community center. Blueprints of the party hall sprawled across a greasy pizza box, flanked by fake mustaches, clown noses, and a sack of balloon animals. Molly, wiry and wild-eyed, jabbed a finger at the sketch. "Listen up, Bill. Timmy Turner's turning seven tomorrow. Whole gym packed with snot-nosed brats and their guilt-gift-giving parents. Presents piled high under a balloon arch—bikes, tablets, that new drone every kid's yapping about." Bill scratched his stubbled chin, popping a clown nose onto his face. It honked. "Kids' party? Molly, we're pros, not candy-grabbers. What's the take worth?" "Five grand easy, resale value. Designer dolls, gaming consoles—parents drop cash like it's confetti. We hit at peak chaos: 2 PM, piñata time." Bill grinned, inflating a balloon sword. "Entry plan? Can't just waltz in." "Disguises, genius. I'm Magical Molly, you’re Bozo the Buffoon. We crash as hired entertainers. Fake flyer says we're from 'Party Pandemonium Pros.'" "Brilliant. I juggle pies in their faces, you vanish presents with sleight-of-hand. Poof! Gone into the balloon van." Molly snorted. "No pies—you're allergic to cream. Use the squirting flower. Distract the rugrats with animal balloons. Twist 'em into gift bags. Kids'll think it's a game." Bill twisted a balloon giraffe, which promptly popped. "What about the parents? Hovering like vultures." "Seduce 'em with dad jokes. You do the one about the pirate's favorite letter." "Arrr, it's the C they love! Ha! Then I spike the punch with sleepy-time herbs?" Molly slapped his arm. "No drugs, idiot—that's felony territory. Just hype the magic show. I'll saw you in half behind the cake table, really stuffing swag into hidden compartments." "Compartments in my clown pants? Risky. One kid yanks the wrong string—boom, Legos everywhere." "Exactly—cover! Yell 'Magic malfunction!' Kids love mess. 
We grab the big hauls: that mountain bike, the PlayStation stack. Roll 'em out on dollies disguised as parade floats." Bill nodded, sketching an escape route. "Exit through the kitchen? Service door to the alley. Van idling, engine muffled with socks." "Socks? Your lucky ones? Fine. Timing: Piñata bursts at 2:15. Candy frenzy—ten minutes to load. In and out like ghosts." "Ghost clowns. What if Timmy cries? Or the birthday dad clocks us?" Molly leaned in, eyes gleaming. "Improv. Blame the dog. Everyone loves a scapegoat pup. We split fifty-fifty, retire to that beach hut." Bill honked his nose. "To heisting the kiddie jackpot!" They clinked root beer cans, blueprints crumpling under excited fists. Outside, party prep lights flickered on. Tomorrow's chaos beckoned.
created_at: "2026-02-20T20:19:20.493Z" (2/20/2026, 8:19:20 PM)
model: "x-ai/grok-4.1-fast"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 284
moderation_latency: (null)
generation_time: 9793
tokens_prompt: 77
tokens_completion: 793
native_tokens_prompt: 216
native_tokens_completion: 1054
native_tokens_completion_images: 0
native_tokens_reasoning: 394
native_tokens_cached: 151
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.00054755
router: (null)
provider_responses:
  0:
    endpoint_id: "9821aa0e-dce3-4c27-9716-161359123d0b"
    id: "5bb70c98-9e92-945d-0a17-286bf270abf9"
    is_byok: false
    latency: 64
    model_permaslug: "x-ai/grok-4.1-fast"
    provider_name: "xAI"
    status: 200
api_type: "completions"
id: "gen-1771618760-RYKkApNHabUms4z799BU"
upstream_id: "5bb70c98-9e92-945d-0a17-286bf270abf9"
total_cost: 0.00054755
cache_discount: 0.00002265
upstream_inference_cost: 0
provider_name: "xAI"
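
The cost fields can be reproduced from the native token counts under one set of per-token rates. The rates below are assumptions (they are not shown on this page), chosen because they reproduce `total_cost` and `cache_discount` exactly; `estimate_cost` is a hypothetical helper. The sketch also assumes that reasoning tokens are billed as part of `native_tokens_completion`.

```python
from decimal import Decimal

# ASSUMED rates in USD per million tokens; not stated on the run page.
RATE_INPUT = Decimal("0.20")
RATE_CACHED_INPUT = Decimal("0.05")
RATE_OUTPUT = Decimal("0.50")
MILLION = Decimal(1_000_000)


def estimate_cost(prompt_tokens, cached_tokens, completion_tokens):
    """Return (total cost, cache discount) in USD under the assumed rates."""
    uncached = prompt_tokens - cached_tokens
    total = (
        uncached * RATE_INPUT
        + cached_tokens * RATE_CACHED_INPUT
        + completion_tokens * RATE_OUTPUT
    ) / MILLION
    # Discount: what the cached tokens would have cost extra at the full input rate.
    discount = cached_tokens * (RATE_INPUT - RATE_CACHED_INPUT) / MILLION
    return total, discount


# native_tokens_prompt, native_tokens_cached, native_tokens_completion from the metadata.
total, discount = estimate_cost(216, 151, 1054)
print(total, discount)
```

Rounded to six decimal places, the total matches the $0.000548 shown under Run Details.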
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 0.00% | Matches word count | n/a | neededClean: false; words: 443 |
| 61.12% | Dialogue to Total Word Ratio | Ratio: 73.14%, Deviation: 3.14% | neededClean: false; wordsTotal: 443; wordsDialogue: 324 |
Overall: 30.5623%
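
The reported ratio and deviation follow directly from the word counts in the metadata and the 70% target named in the scenario. A minimal sketch of that arithmetic; the function names are illustrative, and the mapping from a 3.14% deviation to the 61.12% score is not stated on the page, so it is not reproduced here.

```python
def dialogue_ratio(words_dialogue, words_total):
    """Fraction of all words that fall inside dialogue."""
    return words_dialogue / words_total


def deviation_from_target(ratio, target=0.70):
    """Absolute distance between the measured ratio and the requested one."""
    return abs(ratio - target)


# wordsDialogue and wordsTotal from the evaluation metadata.
ratio = dialogue_ratio(324, 443)
deviation = deviation_from_target(ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

The same metadata shows 443 total words against the 500 requested, which is the shortfall the word-count evaluator scored at 0.00%.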