Run: 2026-02-20-01KHYAHW0C8HY3THPG103GGK1B

Run Details

Model: Grok 4
Duration: 18.4s
Cost: $0.008234
Provider: xAI

Evaluation summary

Score     Evaluator
86.38%    Matches word count
0.00%     Dialogue to Total Word Ratio
43.1902%  Overall

Model input

User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.

Model response

AI
In the flickering glow of a single bulb in their cramped motel room, Molly paced like a caged panther, her wild red hair a tangled mess. Bill slouched on the sagging bed, munching on a stale donut, his greasy fingers smudging the blueprints of the suburban McMansion. "This is gold, Bill," Molly hissed, jabbing at the map. "Kids' birthday bash at noon. Presents piled like a pirate's treasure. We slip in as clowns—balloons, face paint, the works." Bill snorted, crumbs flying. "Clowns? Last time I dressed as one, that Chihuahua nearly took my leg off. And what about the parents? Soccer moms with tasers?" Molly grinned, her eyes sparkling with mischief. "That's the beauty! We'll distract 'em with a fake magic show. You juggle, I swipe the gifts into the getaway van parked as the ice cream truck." Bill scratched his beard. "Juggling? I can barely tie my shoes. This better not end with us in juvie jail." "Trust me, it's foolproof," Molly said, rolling up the plans. "By evening, we'll be swimming in stolen Barbies and video games." Outside, a siren wailed distantly, but they laughed it off, plotting their absurd caper amid pizza boxes and dreams of easy riches. The party was tomorrow—time to inflate those balloons.
created_at: "2026-02-20T20:06:12.246Z"
model: "x-ai/grok-4-07-09"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 6925
moderation_latency: (null)
generation_time: 18317
tokens_prompt: 77
tokens_completion: 319
native_tokens_prompt: 744
native_tokens_completion: 505
native_tokens_completion_images: 0
native_tokens_reasoning: 219
native_tokens_cached: 699
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.00823425
router: (null)
provider_responses:
  0:
    endpoint_id: "d40bef9d-6936-4c57-9749-bf7399de4a77"
    id: "0d6346a6-852c-accb-0122-f14d89434a9e"
    is_byok: false
    latency: 79
    model_permaslug: "x-ai/grok-4-07-09"
    provider_name: "xAI"
    status: 200
api_type: "completions"
id: "gen-1771617972-fO9kpiJzmGLanWOOxhrY"
upstream_id: "0d6346a6-852c-accb-0122-f14d89434a9e"
total_cost: 0.00823425
cache_discount: 0.00157275
upstream_inference_cost: 0
provider_name: "xAI"
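
The logged total_cost and cache_discount can be reproduced from the native token counts. The sketch below is only an illustration: the per-million-token rates (3.00 USD uncached input, 0.75 USD cached input, 15.00 USD output) are assumptions that do not appear on this page, and the function name estimate_cost is hypothetical. With those assumed rates, the arithmetic matches the figures recorded above.

```python
# Assumed USD rates per million tokens; these are NOT stated in the run record.
PRICE_INPUT_PER_M = 3.00         # uncached prompt tokens (assumption)
PRICE_CACHED_INPUT_PER_M = 0.75  # cached prompt tokens (assumption)
PRICE_OUTPUT_PER_M = 15.00       # completion tokens, reasoning included (assumption)


def estimate_cost(prompt_tokens: int, cached_tokens: int, completion_tokens: int):
    """Return (total_cost, cache_discount) in USD under the assumed rates."""
    uncached_tokens = prompt_tokens - cached_tokens
    total = (
        uncached_tokens * PRICE_INPUT_PER_M
        + cached_tokens * PRICE_CACHED_INPUT_PER_M
        + completion_tokens * PRICE_OUTPUT_PER_M
    ) / 1_000_000
    # Discount = what the cached tokens would have cost at the full input rate,
    # minus what they cost at the cached rate.
    discount = cached_tokens * (PRICE_INPUT_PER_M - PRICE_CACHED_INPUT_PER_M) / 1_000_000
    return total, discount


# Values taken from the record: native_tokens_prompt, native_tokens_cached,
# native_tokens_completion.
total_cost, cache_discount = estimate_cost(744, 699, 505)
print(f"total_cost={total_cost:.8f} cache_discount={cache_discount:.8f}")
```

Note that the billed counts are the native_tokens_* fields, not tokens_prompt and tokens_completion, and that the 505 completion tokens already include the 219 reasoning tokens.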

Evaluation details

Result    Evaluator                     Details                           Meta Data
86.38%    Matches word count            n/a                               neededClean: false, words: 211
0.00%     Dialogue to Total Word Ratio  Ratio: 50.24%, Deviation: 40.24%  neededClean: false, wordsTotal: 211, wordsDialogue: 106
43.1902%  Overall
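
The ratio and deviation reported for the dialogue evaluator follow directly from the two word counts in its metadata and the 10% target in the prompt. The sketch below recomputes them. The helper name dialogue_ratio is hypothetical, and treating the overall score as the plain mean of the two evaluator scores is an assumption that happens to agree with the displayed 43.1902% to two decimals. How the evaluators turn a deviation into a score is not shown on this page, so it is not modelled here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Percentage of the scene's words that are dialogue."""
    return 100.0 * words_dialogue / words_total


TARGET_RATIO = 10.0  # percent, from the prompt ("10% words of dialogue")

# wordsDialogue and wordsTotal from the evaluator metadata.
ratio = dialogue_ratio(106, 211)
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# Assumption: the overall score is the unweighted mean of the evaluator scores.
overall = (86.38 + 0.00) / 2
print(f"Overall (assumed mean): {overall:.2f}%")
```

The word-count result works the same way in spirit: the response has 211 words against a requested 200, an overshoot of 5.5%.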