NC Bench
Run: 2026-02-26-01KJCS1F8EM4947FSKXS0QEHNP
Run Details
Model: LFM2 24B
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 15.3s
Cost: $0.000090
Provider: Together
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 22.14%
Overall: 11.0721%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The living room buzzed with the muffled giggles of kids at the birthday bash next door. Molly leaned against the doorway, arms crossed, eyes darting between the cake table and the window. Bill crouched beside her, clutching a flashlight like it was a weapon. “Alright,” Molly said, “we’re in and out before anyone notices. The kids’ll be too busy chasing balloons to look up.” Bill adjusted his glasses, squinting at the darkened hallway. “Easy for you to say. You’ve never had to sneak past a room full of screaming toddlers.” “Oh, I’ve got this,” she replied, grinning. “Remember the plan? I’ll distract them with my ‘magic show’—the one where I make rubber chickens disappear. Meanwhile, you slip into the kitchen, grab every present, and vanish before Mom checks the gifts.” Bill snorted. “Rubber chickens? That’s your big move? Last time you ‘distracted’ Mrs. Jenkins with a chicken, she nearly called the cops.” Molly rolled her eyes. “That was a *controlled* distraction. Besides, kids love chickens. They’ll chase me anywhere.” She paused, then added, “But seriously, what if someone sees us?” “Then we blame the dog,” Bill said, nodding toward the neighbor’s chihuahua barking outside. “Or the cat. Either works.” “Too risky,” Molly countered. “The cat’s allergic to chaos.” They both laughed, but the sound died when Molly’s phone buzzed. A text lit up: *Mom just asked if the cake is ready.* “Ugh, she’s on her way,” Bill muttered. “We’ve got maybe ten minutes before she checks the gifts.” “Ten minutes is enough,” Molly said, already moving toward the window. “I’ll climb up, peek through the blinds. If the room’s empty, we go in.” Bill hesitated. “What if she comes back early? What if there’s a security camera?” “Then we improvise,” Molly said, climbing onto the windowsill. “Like last time. Remember the museum heist? We used the janitor’s cart.” “That involved a mop bucket,” Bill groaned. “This is a *birthday party*. No mop buckets here.” “Exactly,” Molly said, sliding down. 
“We’ll need something… less obvious. Maybe a balloon?” Bill blinked. “A balloon?” “Yeah! Kids love balloons. I’ll pop one near the door, create a diversion. You slip in while they’re distracted.” “So we’re throwing confetti at our own heist?” “Think of it as performance art,” Molly said, adjusting her gloves. “Art that involves stolen presents.” Bill sighed. “Fine. But if we get caught, I’m blaming the rubber chickens.” “Deal,” she said, already heading back to the window. “Now, grab the flashlight. And don’t trip over your own feet this time.” Bill groaned again but followed, muttering about chickens and balloons. The kids’ laughter echoed louder now, and somewhere in the chaos, Molly smiled. The real fun was just beginning.
created_at: "2026-02-26T10:48:45.415Z" (2/26/2026, 10:48:45 AM)
model: "liquid/lfm-2-24b-a2b-20260224"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 281
moderation_latency: (null)
generation_time: 14940
tokens_prompt: 77
tokens_completion: 799
native_tokens_prompt: 75
native_tokens_completion: 730
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00008985
router: (null)
provider_responses:
  0:
    endpoint_id: "b5f259c3-1e77-40d6-8fa4-161cdf0e4a7c"
    id: "oYufhdV-2j9zxn-9d3edf33f7fae516"
    is_byok: false
    latency: 281
    model_permaslug: "liquid/lfm-2-24b-a2b-20260224"
    provider_name: "Together"
    status: 200
api_type: "completions"
id: "gen-1772102925-d9WZJsUOOlVBN7KmM94f"
upstream_id: "oYufhdV-2j9zxn-9d3edf33f7fae516"
total_cost: 0.00008985
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Together"
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false; words: 448

Dialogue to Total Word Ratio
Result: 22.14%
Details: Ratio: 64.51%, Deviation: 5.49%
Meta Data: neededClean: false; wordsTotal: 448; wordsDialogue: 289

Overall: 11.0721%
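
The reported ratio and deviation follow directly from the word counts in the meta data: 289 dialogue words out of 448 total is 64.51%, which is 5.49 percentage points below the 70% target named in the scenario. The Python sketch below reproduces that arithmetic. All function names are hypothetical, and the quote-based counter is only an assumption about how dialogue words might be identified. The page does not describe the evaluator's actual counting method, nor how a 5.49-point deviation maps to the 22.14% score, so neither is modelled here.

```python
import re

# Assumption: dialogue is any text enclosed in curly or straight double quotes.
# The benchmark's real evaluator may use a different rule.
DIALOGUE_RE = re.compile(r'“([^”]*)”|"([^"]*)"')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words that fall inside quoted spans (hypothetical rule)."""
    total = 0
    for match in DIALOGUE_RE.finditer(text):
        quoted = match.group(1) or match.group(2) or ""
        total += count_words(quoted)
    return total


def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_pct(ratio_pct: float, target_pct: float) -> float:
    """Absolute distance from the target, in percentage points."""
    return abs(target_pct - ratio_pct)


# Values taken from the meta data of this run.
ratio = dialogue_ratio_pct(words_dialogue=289, words_total=448)
deviation = deviation_pct(ratio, target_pct=70.0)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 64.51%, Deviation: 5.49%"
```

The overall figure of 11.0721% is consistent with an unweighted mean of the two evaluator scores (0.00% and 22.14%), although the page does not state the aggregation rule.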