Run: 2026-02-17-01KHNEVDGGECQSDDQ5WSXAMNGN

Run Details

Duration: 8.7s
Cost: $0.000484
Provider: Fireworks

Evaluation summary

Score      Evaluator
90.48%     Matches word count
36.79%     Dialogue to Total Word Ratio
63.6358%   Overall
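
The Overall figure appears to be the plain average of the two evaluator scores. The report does not state the aggregation rule, so the sketch below treats the unweighted mean as an assumption and only checks that it agrees with the displayed value to within rounding. The function name overall_score is illustrative, not part of the benchmark.

```python
# Per-evaluator scores as displayed in the evaluation summary (percent).
scores = {
    "Matches word count": 90.48,
    "Dialogue to Total Word Ratio": 36.79,
}


def overall_score(evaluator_scores):
    """Unweighted mean of the evaluator scores (assumed aggregation rule)."""
    return sum(evaluator_scores.values()) / len(evaluator_scores)


overall = overall_score(scores)
reported_overall = 63.6358  # value shown in the summary

# The inputs are rounded to two decimals, so only approximate agreement
# with the reported figure can be expected.
print(f"computed {overall:.3f}% vs reported {reported_overall}%")
```

Because the displayed Overall carries more digits than its inputs, it was presumably computed from unrounded scores.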

Model input

User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.

Model response

AI
"We need a plan, Bill. This isn't exactly a bank vault." "Exactly! Kids have zero security. They can't even tie their shoes." "I'm serious. There's going to be clown, a bouncy castle, and like thirty parents hovering everywhere." "Perfect cover. Everyone's distracted watching their little ones bounce around. We blend right in." "Blend in? Bill, we're not exactly dressed for a superhero birthday party." "Which is why we're going as entertainment! I got us jobs as balloon artists." "You can make balloon animals?" "I can make one thing. It's... a dog. Sometimes a giraffe if I squint." "That's the worst backup plan I've ever heard." "Fine, you got a better idea?" "We wait until cake time. All the kids go inside. We grab the gifts left under the tree and slip out the back." "You really think no one watches the presents?" "Parents are obsessed with documenting everything. They won't leave that pile unattended for more than five minutes." "Five minutes is nothing!" "Exactly why we move fast. No hesitation, no extra balloon giraffes." "You're really ruining my vision here, Molly." "Your vision involves a fake mustache and a trench coat. I'm saving us from prison." "Fine. Cake time it is." "Promise me no balloon animals." "I make no promises."
created_at: "2026-02-17T09:28:19.118444+00:00" (2/17/2026, 9:28:19 AM)
model: "minimax/minimax-m2.5-20260211"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 3889
moderation_latency: (null)
generation_time: 8642
tokens_prompt: 77
tokens_completion: 467
native_tokens_prompt: 97
native_tokens_completion: 379
native_tokens_completion_images: (null)
native_tokens_reasoning: 120
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0004839
router: (null)
provider_responses:
  0:
    id: "f1387b82-3cf9-42ee-b924-9edc70be2fd3"
    status: 200
    is_byok: false
    latency: 3780
    endpoint_id: "c61d37ec-745c-4efd-8129-a68371e056b2"
    provider_name: "Fireworks"
    model_permaslug: "minimax/minimax-m2.5-20260211"
    api_type: "completions"
id: "gen-1771320489-qtHItOPHqxto5Wf3Fycf"
upstream_id: "f1387b82-3cf9-42ee-b924-9edc70be2fd3"
total_cost: 0.0004839
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Fireworks"
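
A rough throughput figure can be derived from the metadata above. This is only a sketch: the report does not give units, so generation_time is assumed to be in milliseconds (it is close to the 8.7s Duration shown at the top), and the helper name tokens_per_second is illustrative.

```python
# Values copied from the run metadata.
generation_time_ms = 8642        # ASSUMPTION: milliseconds
native_tokens_completion = 379   # completion tokens as counted by the provider


def tokens_per_second(tokens, elapsed_ms):
    """Average completion throughput over the whole generation."""
    return tokens / (elapsed_ms / 1000.0)


throughput = tokens_per_second(native_tokens_completion, generation_time_ms)
print(f"{throughput:.1f} completion tokens per second")
```

The figure averages over the whole request, including any time spent before the first token arrived, so it understates the raw decoding speed.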

Evaluation details

Result     Evaluator                      Details                             Meta Data
90.48%     Matches word count             n/a                                 neededClean: false, words: 210
36.79%     Dialogue to Total Word Ratio   Ratio: 100.00%, Deviation: 10.00%   neededClean: false, wordsTotal: 210, wordsDialogue: 210
63.6358%   Overall
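
The details above can be reproduced with a little arithmetic. The sketch below shows one plausible way to measure the dialogue ratio (counting words inside straight double quotes), then recomputes the reported deviation from the 90% target in the prompt. It also records an observation: the two scores coincide numerically with exp(-1) and exp(-0.1). The evaluator's real word-counting rule and scoring formula are not given in the report, so the function dialogue_ratio and both HYPOTHETICAL_* scale constants are assumptions chosen to match the displayed numbers.

```python
import math
import re

TARGET_RATIO = 0.90   # from the prompt: "90% words of dialogue"
TARGET_WORDS = 200    # from the prompt: "200 word scene"


def dialogue_ratio(text):
    """Share of whitespace-separated words inside straight double quotes.

    ASSUMPTION: the real evaluator may tokenize or detect dialogue differently.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return dialogue / total if total else 0.0


# Values reported in the evaluation details.
words_total = 210
words_dialogue = 210

ratio = words_dialogue / words_total                           # 100%
ratio_deviation = abs(ratio - TARGET_RATIO)                    # 10 points
word_deviation = abs(words_total - TARGET_WORDS) / TARGET_WORDS  # 5%

# OBSERVATION ONLY: exponential decay with these hypothetical scales
# happens to reproduce the displayed scores.
HYPOTHETICAL_RATIO_SCALE = 0.10
HYPOTHETICAL_WORD_SCALE = 0.50
ratio_score_guess = math.exp(-ratio_deviation / HYPOTHETICAL_RATIO_SCALE)
word_score_guess = math.exp(-word_deviation / HYPOTHETICAL_WORD_SCALE)
overall_guess = (ratio_score_guess + word_score_guess) / 2

print(f"ratio score guess:  {ratio_score_guess:.2%}")
print(f"word score guess:   {word_score_guess:.2%}")
print(f"overall guess:      {overall_guess:.4%}")
```

Under these assumptions the response loses most of its score for having no narration at all: the prompt asked for 90% dialogue, and a scene that is 100% dialogue overshoots the target by 10 points.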