Run: 2026-02-20-01KHY7XV5XMWDAPTC5JHRNA25Z

Run Details

Duration: 6.8s
Cost: $0.000300
Provider: xAI

Evaluation summary

Score     Evaluator
6.09%     Matches word count
57.22%    Dialogue to Total Word Ratio
31.6542%  Overall

Model input

User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.

Model response

AI
Molly leaned over the grimy kitchen table, jabbing a finger at a crayon-scribbled map. "Listen up, Bill. The Johnsons' backyard bash tomorrow—twenty snot-nosed rugrats, pile of presents taller than me. We snag 'em all." Bill scratched his stubbled chin, eyes lighting up. "Presents? Like bikes and Barbies? Sweet score! How we crashin' without the kiddie cops wailin'?" "Disguises, dummy. I'm the balloon lady—big red wig, squeaky nose. You? Party magician. Black cape, top hat stuffed with gift bags." Bill grinned maniacally. "Abracadabra, presents vanish! But the cake? We grab that too?" Molly snorted. "Cake's for diversion. Smear it on the birthday boy, chaos erupts. We scoop loot into my van disguised as the pony ride trailer." "What if pony kicks us? Or some mini-Sherlock spots the swag?" "I'll trip the brat with my clown shoe. You sleight-of-hand the big teddy. In and out in ten—rich off Elmo dolls by midnight!" Bill fist-pumped. "To heisting the kiddie jackpot!" Molly smirked. "Shh, or we'll be stealin' coal from Santa next."
created_at: "2026-02-20T19:20:18.885Z" (2/20/2026, 7:20:18 PM)
model: "x-ai/grok-4.1-fast"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 617
moderation_latency: (null)
generation_time: 6649
tokens_prompt: 77
tokens_completion: 344
native_tokens_prompt: 216
native_tokens_completion: 559
native_tokens_completion_images: 0
native_tokens_reasoning: 289
native_tokens_cached: 154
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0002996
router: (null)
provider_responses:
  0:
    endpoint_id: "9821aa0e-dce3-4c27-9716-161359123d0b"
    id: "85eb7161-f8b9-9def-3f07-223bfe38273b"
    is_byok: false
    latency: 401
    model_permaslug: "x-ai/grok-4.1-fast"
    provider_name: "xAI"
    status: 200
api_type: "completions"
id: "gen-1771615218-cQGRnyiSGr34uPvXIVwx"
upstream_id: "85eb7161-f8b9-9def-3f07-223bfe38273b"
total_cost: 0.0002996
cache_discount: 0.0000231
upstream_inference_cost: 0
provider_name: "xAI"

Evaluation details

Result    Evaluator                     Details                           Meta Data
6.09%     Matches word count            n/a                               neededClean: false, words: 177
57.22%    Dialogue to Total Word Ratio  Ratio: 81.36%, Deviation: 8.64%   neededClean: false, wordsTotal: 177, wordsDialogue: 144
31.6542%  Overall
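
The reported ratio and deviation can be reproduced from the word counts in the meta data. The sketch below is not the evaluator's code: the function names are hypothetical, the 90% target is taken from the prompt, and it assumes the deviation is the absolute difference between the target and the measured ratio. How the evaluator turns a deviation into a score (57.22%, or 6.09% for word count) is not shown in this report, so that step is left out. The last check assumes the overall figure is the plain mean of the two evaluator scores, which matches the displayed values to within rounding.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the response, as a percentage."""
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float) -> float:
    """Absolute gap between the measured ratio and the requested one (assumed definition)."""
    return abs(target_pct - ratio_pct)


# Values copied from the report: wordsDialogue = 144, wordsTotal = 177.
ratio = dialogue_ratio(144, 177)
# 90% is the dialogue share requested in the prompt.
deviation = deviation_from_target(ratio, 90.0)
summary = f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%"
print(summary)

# Assumption: the overall score is the unweighted mean of the two evaluator scores.
# The displayed scores are rounded, so only approximate agreement is expected.
mean_of_displayed_scores = (6.09 + 57.22) / 2
print(f"Mean of displayed scores: {mean_of_displayed_scores:.3f}% (reported overall: 31.6542%)")
```

The small gap between the mean of the displayed scores and the reported 31.6542% is consistent with the individual scores having been rounded to two decimals for display.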