NC Bench
Run: 2026-02-17-01KHNF7TH39683ZNCGT0RZQZ54
Run Details
Model:
Arcee AI: Trinity Mini
Test:
Dialogue tags
Scenario:
Write 200 words with 90% dialogue
Duration:
4.5s
Cost:
$0.000150
Provider:
Clarifai
Evaluation summary
Score
Evaluator
0.00%
Matches word count
24.48%
Dialogue to Total Word Ratio
12.2383%
Overall
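The overall figure appears consistent with an unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so the averaging below is an assumption. It is a minimal sketch using the rounded scores shown above, which is why it can only match the reported 12.2383% to within rounding.

```python
# Evaluator scores as displayed in the summary (already rounded to 2 decimals).
scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 24.48,
}

# ASSUMPTION: the overall score is the plain (unweighted) mean of the
# evaluator scores. The report itself does not document this rule.
overall = sum(scores.values()) / len(scores)

print(f"Overall (from rounded inputs): {overall:.2f}%")
```

The small gap between this result and the reported 12.2383% is what one would expect if the benchmark averages unrounded scores.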
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Bill leans against the wall, tapping afinger against his temple. "Okay, Molly. Operation: Candy Cane Caper. Step one: infiltrate the premises. Step two: locate the vault. Step three: extract the payload. Step four: vanish before the sugar crash hits." Molly adjusts her fake clown nose, grinning. "Step one: we waltz in like we own the place. Step two: we spot the shiny boxes under the tree. Step three: we grab 'em. Step four: we high-tail it before the birthday girl cries." Bill sighs, adjusting his fake mustache. "Molly, it's not a 'high-tail.' It's a tactical withdrawal. And step two involves more than just spotting. We need to bypass the... security." Molly snorts. "Security? It's a kids' party! The biggest threat is a rogue balloon animal. What, you think the clown has a laser grid?" Bill straightens up, serious. "The clown has a whistle. And a very loud voice. And possibly a very large, very angry mother. We need a distraction." Molly snaps her fingers. "Got it! We use the piñata! It's like a distraction bomb made of candy and paper mache!" Bill pinches the bridge of his nose. "Molly, the piñata is the target, not the distraction. We need something to draw attention away from the tree." Molly's eyes light up. "The cake! We make a run for the cake table! Everyone will be looking at the cake! Perfect!" Bill rubs his temples. "Molly, the cake table is exactly where the presents are. We'd be standing right next to them while everyone stares. That's not a distraction, that's a spotlight." Molly pouts. "Fine. Plan B. We use the clown's wig. We put it on the dog. Everyone will be distracted trying to figure out why the dog has a red nose." Bill stares at her. "Molly. The dog is a chihuahua. It will look ridiculous. And no one will care. They'll just think it's a bad costume choice. We need something... more... explosive." Molly grins wickedly. "How about we... distract the clown? With a... surprise?" Bill's eyes narrow. "Molly, what are you suggesting?" Molly winks. 
"Let's just say... we give the clown a reason to forget about the whistle. For a few crucial minutes." Bill groans, but a reluctant smile tugs at his lips. "Alright. Distraction: clown incapacitation. Payload extraction: proceed. But if we get caught, it's your fault for suggesting we use the clown's dignity as a diversion." Molly salutes. "Consider it done. Operation: Candy Cane Caper, commence!" She starts humming a jaunty tune, completely oblivious to the sheer absurdity of their plan.
created_at
"2026-02-17T09:35:01.267743+00:00"
2/17/2026, 9:35:01 AM
model
"arcee-ai/trinity-mini-20251201"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
71
moderation_latency
(null)
generation_time
4462
tokens_prompt
77
tokens_completion
1164
native_tokens_prompt
70
native_tokens_completion
978
native_tokens_completion_images
(null)
native_tokens_reasoning
511
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00014985
router
(null)
provider_responses
0
id
"a50cc4d3a72a44ec82ad9b578572dd76"
status
200
is_byok
false
latency
71
endpoint_id
"73d82c49-90ae-4223-a860-83ed0810821f"
provider_name
"Clarifai"
model_permaslug
"arcee-ai/trinity-mini-20251201"
api_type
"completions"
id
"gen-1771320896-rquNEM37H7gcn7b0sLNh"
upstream_id
"a50cc4d3a72a44ec82ad9b578572dd76"
total_cost
0.00014985
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Clarifai"
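The metadata above can be cross-checked against the run details with a little arithmetic. The sketch below assumes `generation_time` is in milliseconds (which matches the 4.5s duration shown in the run details) and that `total_cost` is in US dollars. The throughput figure is derived here for illustration and is not reported by the page.

```python
# Values copied from the metadata fields above.
generation_time_ms = 4462          # ASSUMPTION: unit is milliseconds
native_tokens_completion = 978
total_cost_usd = 0.00014985        # ASSUMPTION: unit is US dollars

seconds = generation_time_ms / 1000.0

# Derived value, not a field in the report: completion tokens per second.
tokens_per_second = native_tokens_completion / seconds

# The run details show the cost rounded to six decimal places.
cost_display = f"${total_cost_usd:.6f}"

print(f"Duration: {seconds:.1f}s")
print(f"Throughput: {tokens_per_second:.2f} native completion tokens/s")
print(f"Cost: {cost_display}")
```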
Evaluation details
Result
Evaluator
Details
Meta Data
0.00%
Matches word count
n/a
neededClean
false
words
426
24.48%
Dialogue to Total Word Ratio
Ratio: 79.11%, Deviation: 10.89%
neededClean
false
wordsTotal
426
wordsDialogue
337
12.2383%
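The ratio and deviation reported for the dialogue evaluator can be reproduced from the two word counts and the 90% target in the scenario. This is a minimal sketch with hypothetical function names. The step that maps a 10.89% deviation to the 24.48% score is not documented on this page, so it is deliberately left out.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return the share of words inside dialogue, as a percentage."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float = 90.0) -> float:
    """Return the absolute gap between the measured ratio and the target."""
    return abs(target_pct - ratio_pct)


# Counts taken from the evaluation metadata (wordsDialogue, wordsTotal).
ratio = dialogue_ratio(337, 426)
deviation = deviation_from_target(ratio)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
```

With the counts above, the printed line matches the evaluator's own detail string.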