NC Bench
Run: 2026-02-16-01KHM5M95XTTDDXVRXFV4JMKMM
Run Details
Model: DeepSeek V3.1
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 49.9s
Cost: $0.000798
Provider: SiliconFlow
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The air in the tiny, grimy apartment was thick with the smell of stale coffee and desperation. Spread across the chipped Formica table was the most unorthodox blueprint Molly had ever seen: a brightly colored hand-drawn map of the local community center, with "Lucas's Epic 7th Birthday!" scrawled across the top in crayon. Bill, a man built like a regretful refrigerator, pointed a thick finger at a squiggle marked 'PINATA ZONE'. "This is the key, Moll. The key! Distraction, chaos, sugared-up little monsters... it's the perfect storm." Molly, whose slender frame was coiled with the tension of a startled cat, massaged her temples. "Bill, look at me. We have successfully liberated vintage Rolexes from a pressure-sensitive vault. We have outrun Alsatians in the moonlit gardens of Russian oligarchs. And now you're proposing we hit a seven-year-old's birthday party?" "It's a target-rich environment!" Bill insisted, his voice a low rumble. "And the security is nonexistent. One guy in a cheap clown suit who I'm pretty sure is just Frank from the laundromat." "Our intel suggests Frank is a surprisingly observant clown," Molly countered, tapping a surveillance photo of the clown making a balloon animal with intense focus. "Look at that swan. That's the work of a man with nothing to lose." "Okay, fine. We go in during the 'Musical Chairs' meltdown. Emotions are high. Parents are filming, not paying attention. We pose as... disgruntled relatives." "And our cover? What's our story? We're the weird cousins from out of town who nobody likes but were invited out of pity?" "Exactly!" Bill beamed. "See, you're a natural. We'll bring a disappointing present—maybe a moderately challenging jigsaw puzzle—to sell it. We blend into the background, which in this case is a sea of juice-stained t-shirts and existential dread." Molly sighed, picking up a list of the expected loot. 
"A 'Build-Your-Own Volcano' kit, three separate gift cards for a place called 'The Bouncy Castle Emporium', and..." she squinted, "...a live hamster in a Habitrail. How, exactly, do we fence a live hamster, Bill?" "We keep the hamster," Bill said, his expression turning uncharacteristically soft. "He can be our mascot. Call him 'Lucky'." "He'll be evidence, you sentimental fool!" "Fine, we'll leave the rodent. But the rest is easy. We load the gifts into a giant, festive-looking bag. We're not thieves; we're just the helpful cousins offering to take the trash out." "The trash bag full of brand-new loot," Molly deadpanned. "Precisely! It's audacious. It's so brazen it's brilliant. By the time little Lucas finishes weeping over the loss of his volcano, we'll be halfway to Mexico." Molly stared at the crayon map, at the crude drawing of a cake surrounded by stick figures. A strange smile played on her lips. It was the most preposterous, humiliating, and downright idiotic scheme they had ever concocted. It was also, she had to admit, kind of brilliant. "Alright," she said finally. "But I'm not wearing a party hat. That's a line I will not cross." Bill clapped his massive hands together, a sound like a gunshot in the quiet room. "Deal! Now, let's rehearse our backstory. You hated Uncle Carl's potato salad at the last reunion, remember? Really sell the bitterness."
created_at
"2026-02-16T21:28:34.491075+00:00"
2/16/2026, 9:28:34 PM
model
"deepseek/deepseek-chat-v3.1"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1727
moderation_latency
(null)
generation_time
49879
tokens_prompt
95
tokens_completion
846
native_tokens_prompt
70
native_tokens_completion
779
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0007979
router
(null)
provider_responses
0
id
"019c685a28e1a68a108c927e6006770d"
status
200
is_byok
false
latency
1727
endpoint_id
"5d5c9c1d-af0d-4e4a-8aee-c5cd2b3883f3"
provider_name
"SiliconFlow"
model_permaslug
"deepseek/deepseek-chat-v3.1"
api_type
"completions"
id
"gen-1771277264-eoELh9FzLuuhpJf3882w"
upstream_id
"019c685a28e1a68a108c927e6006770d"
total_cost
0.0007979
cache_discount
(null)
upstream_inference_cost
0
provider_name
"SiliconFlow"
Evaluation details
Evaluator: Matches word count
Result: 0.0000%
Details: n/a
Meta Data: neededClean: false, words: 546
Evaluator: Dialogue to Total Word Ratio
Result: 0.0028%
Details: Ratio: 64.47%, Deviation: 14.47%
Meta Data: neededClean: false, wordsTotal: 546, wordsDialogue: 352
0.0014%
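The reported ratio is consistent with wordsDialogue / wordsTotal = 352 / 546 ≈ 64.47%, and the reported deviation with that ratio's distance from the 50% target in the scenario. The sketch below shows one way such a metric could be computed. It is not the benchmark's actual evaluator: the function names, the whitespace tokenization, and the rule that dialogue is any text between straight double quotes are all assumptions made for illustration.

```python
import re


def ratio_from_counts(words_dialogue: int, words_total: int, target: float = 0.5) -> dict:
    """Return the dialogue ratio and its absolute deviation from the target.

    Hypothetical helper: the 50% default target comes from the scenario text,
    but the evaluator's real formula is not shown on the page.
    """
    ratio = words_dialogue / words_total if words_total else 0.0
    return {"ratio": ratio, "deviation": abs(ratio - target)}


def dialogue_ratio(text: str, target: float = 0.5) -> dict:
    """Estimate the share of words that sit inside double-quoted spans.

    Assumptions: a word is a whitespace-separated token, and dialogue is
    everything between a pair of straight double quotes. Quoted titles in
    narration would therefore be counted as dialogue by this heuristic.
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    result = ratio_from_counts(words_dialogue, words_total, target)
    result["wordsTotal"] = words_total
    result["wordsDialogue"] = words_dialogue
    return result


if __name__ == "__main__":
    # Counts taken from the evaluation metadata on this page.
    reported = ratio_from_counts(352, 546)
    print(f"Ratio: {reported['ratio']:.2%}, Deviation: {reported['deviation']:.2%}")
```

Run against the counts in the metadata, the helper reproduces the figures in the Details field. Applying `dialogue_ratio` to the raw response may not yield exactly 352 dialogue words, since the evaluator's tokenization and quote handling are unknown.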