NC Bench
Run: 2026-02-20-01KHYAPA4N8K4GC3TF03YFFKY9
Run Details
Model: Grok 4.1 Fast
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 8.9s
Cost: $0.000356
Provider: xAI
Evaluation summary
Score | Evaluator
97.63% | Matches word count
0.00% | Dialogue to Total Word Ratio
48.8138% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill huddled in their rusty van outside the suburban cul-de-sac, blueprints of "Timmy's Superhero Bash" spread across the dashboard like a treasure map. Molly, wiry and wild-eyed, chewed a pretzel rod like a cigar. "Bill, it's foolproof," she whispered fiercely. "We crash as piñata repairmen. Kids swarm us, we load the van." Bill, burly with a handlebar mustache glued crooked, snorted. "And the candy? We eat it all?" "Priorities, dummy," Molly snapped. "Presents first—Xbox, bikes, that pony!" They rehearsed: Bill would "accidentally" spill punch, creating chaos. Molly'd swoop in, sacks disguised as gift bags, scooping Lego sets and drones amid shrieking tots. Parents, three margaritas deep, wouldn't notice. "But the clown?" Bill fretted, adjusting his polka-dot bowtie. Molly grinned, painting whiskers on her cheeks. "He quits mid-show. Enter us, heroes!" As dusk fell, horns honked—party time. Bill revved the engine. "To heist or not to heist?" Molly punched his arm. "Heist, you idiot. Candy's ours too." They peeled out, laughter echoing, van sagging under empty sacks dreaming of plush unicorns and stolen joy. The kids wouldn't know what hit 'em—or rather, what vanished.
created_at: "2026-02-20T20:08:37.793Z" (2/20/2026, 8:08:37 PM)
model: "x-ai/grok-4.1-fast"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 680
moderation_latency: (null)
generation_time: 8742
tokens_prompt: 77
tokens_completion: 392
native_tokens_prompt: 216
native_tokens_completion: 670
native_tokens_completion_images: 0
native_tokens_reasoning: 376
native_tokens_cached: 151
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.00035555
router: (null)
provider_responses:
  0:
    endpoint_id: "9821aa0e-dce3-4c27-9716-161359123d0b"
    id: "6782854d-7325-b3fc-0d2b-1fad2ca28a32"
    is_byok: false
    latency: 367
    model_permaslug: "x-ai/grok-4.1-fast"
    provider_name: "xAI"
    status: 200
api_type: "completions"
id: "gen-1771618117-d93PkhvGF7urqTTJsewg"
upstream_id: "6782854d-7325-b3fc-0d2b-1fad2ca28a32"
total_cost: 0.00035555
cache_discount: 0.00002265
upstream_inference_cost: 0
provider_name: "xAI"
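The cost fields above can be cross-checked against the token counts. The following is a minimal sketch, not the billing logic of the provider or router: the per-million-token rates (`INPUT_RATE`, `CACHED_INPUT_RATE`, `OUTPUT_RATE`) are assumptions that do not appear on this page, and the sketch further assumes that `native_tokens_completion` already includes the reasoning tokens and is billed at the output rate.

```python
# Token counts copied from the metadata above.
native_tokens_prompt = 216
native_tokens_cached = 151
native_tokens_completion = 670  # assumed to include the 376 reasoning tokens

# ASSUMED rates in USD per million tokens (not stated on this page).
INPUT_RATE = 0.20
CACHED_INPUT_RATE = 0.05
OUTPUT_RATE = 0.50

# Prompt tokens that were not served from cache pay the full input rate.
uncached_prompt = native_tokens_prompt - native_tokens_cached

prompt_cost = (
    uncached_prompt * INPUT_RATE + native_tokens_cached * CACHED_INPUT_RATE
) / 1_000_000
completion_cost = native_tokens_completion * OUTPUT_RATE / 1_000_000
total_cost = prompt_cost + completion_cost

# Savings from cached tokens being billed at the lower rate.
cache_discount = (
    native_tokens_cached * (INPUT_RATE - CACHED_INPUT_RATE) / 1_000_000
)

print(f"total_cost     = {total_cost:.8f}")
print(f"cache_discount = {cache_discount:.8f}")
```

Under these assumed rates the sketch yields the same `total_cost` (0.00035555) and `cache_discount` (0.00002265) as the metadata, which suggests, but does not prove, that the run was priced this way.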
Evaluation details
Result | Evaluator | Details | Meta Data
97.63% | Matches word count | n/a | neededClean: false, words: 193
0.00% | Dialogue to Total Word Ratio | Ratio: 31.61%, Deviation: 21.61% | neededClean: false, wordsTotal: 193, wordsDialogue: 61
48.8138% | Overall
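The ratio and deviation reported for the dialogue evaluator follow directly from the two word counts and the 10% target in the scenario. The sketch below reproduces that arithmetic; the variable names are illustrative, and the reading of the overall score as an unweighted mean of the two evaluator scores is an assumption inferred from the numbers, not something the page states. How the evaluators turn a deviation or a word count into a percentage score is not shown on this page and is not modeled here.

```python
# Counts reported in the evaluation metadata above.
words_total = 193
words_dialogue = 61
target_ratio = 0.10  # the scenario asks for 10% dialogue

# Share of words that the evaluator classified as dialogue.
ratio = words_dialogue / words_total
# Absolute distance from the requested share.
deviation = abs(ratio - target_ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# ASSUMPTION: the overall score is the unweighted mean of the two
# evaluator scores. Working backwards from the overall score gives the
# unrounded word-count score implied by that assumption.
overall = 48.8138
dialogue_score = 0.0
implied_word_count_score = 2 * overall - dialogue_score
print(f"Implied word-count score: {implied_word_count_score:.4f}%")
```

The first line printed matches the reported "Ratio: 31.61%, Deviation: 21.61%". The implied word-count score of 97.6276% rounds to the displayed 97.63%, which is consistent with the overall score being averaged from unrounded values.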