NC Bench
Run: 2026-04-25-01KQ2HFNGW62BWQ6AVCHF0ARD3
Run Details
Model: DeepSeek V4 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 16.8s
Cost: $0.000320
Provider: DeepSeek
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly adjusted the strap on her binoculars, peering through the kitchen window of the suburban house. A piñata shaped like a unicorn swayed in the backyard, its rainbow streamers fluttering in the breeze. Bill crouched beside her, clutching a map of the property he’d sketched on a napkin. “You sure this is worth it? My mom always said stealing from kids is bad karma.” “Your mom also said your dad’s ‘business trips’ were to sell vacuum cleaners,” Molly said without lowering the binoculars. “This isn’t stealing from kids. This is strategic redistribution of assets. These parents spent four hundred dollars on a bouncy castle alone. There’s a pile of gifts in the living room that could fund our retirement.” “Our retirement?” Bill squinted. “We’ve been partners for six months. You still owe me twenty bucks from that antique shop job.” “And you’ll get it back once we fence these presents. Look—the party’s at peak chaos. The kids are all outside, the parents are drunk on fruit punch, and the grandma is napping in the La-Z-Boy. We slide in through the side door, grab the haul, and we’re out before anyone notices the pile is gone.” Bill chewed on a strand of licorice he’d been saving. “What if the presents are labeled? Like, ‘To Timmy, from Aunt Susan’? That’s personal. We’d be stealing a child’s identity.” “We’re not stealing their identity. We’re stealing their Lego sets and gift cards. And we’ll donate half to that animal shelter. That’s basically charity.” “So we’re Robin Hoods, only we’re robbing toddlers.” “Exactly. Now, on my signal, we move. You take the left flank by the fence. I’ll go through the doggy door. The pug is already bribed.” “You bribed a pug?” “With a piece of bacon from my pocket. He’s our inside man. Or inside dog. Now shut up and get into position.” Bill hesitated, glancing at the map. “The side door has a creaky hinge. I read that on the internet—third step from the top, nine decibel creak. We need lubricant.” “I have WD-40 in my fanny pack. 
You think I don’t prepare?” “And what about the mom? She’s got that sharp smile. She looks like she’d use a rolling pin on us.” Molly lowered the binoculars and turned to face him, a single eyebrow raised. “She’s currently taking a selfie with a balloon animal. She’s not going to roll anything. Now focus. In three, two—” “Wait,” Bill said, holding up a hand. “What’s the code phrase if we get separated? In case one of us gets caught by a five-year-old and has to negotiate a truce?” Molly sighed. “The code phrase is ‘I found the bathroom.’ If I say that, you run.” “And if I say it?” “Then I run. It’s symmetrical, Bill. Now, for the love of all that is shiny, move.” Bill took a deep breath, then nodded. “One more thing. What’s your exit strategy if the grandma wakes up?” “I’ve got a whoopee cushion. Distraction.” “That’s your plan? A whoopee cushion?” “It’s classic. Now go.” He crept toward the fence, his footsteps muffled by the grass. Molly slid the WD-40 from her pack and crouched low. The pug at the doggy door waggled his tail, bacon grease smeared on his jowls. Inside, the sound of children screaming over a game of musical chairs filled the air. Molly smirked. Easy money. Then Bill’s voice came over their earpiece: “Uh, Molly? The bouncy castle just collapsed. A kid is crying. The party is migrating inside in approximately fifteen seconds.” She froze. “You mean right now?” “I mean now, Molly. Now now.” The back door swung open. A flood of children poured into the kitchen, followed by a mother with a plastic crown. Molly dove behind a bush as the pug let out a thrilled bark and trotted into the house, his tail high. “Abort,” she hissed into the mic. “Repeat, abort. Meet at the dumpster. Bring licorice.” “I’m already here,” Bill’s voice came back, faint and defeated. “I’ve got half a Twizzler and a bad feeling about karma.” Molly crawled backward through the flowerbed, a petunia stem poking her ear. “We’ll try again next weekend. There’s a clown party in the park. 
I already scouted the balloon-twisting station.”
created_at: "2026-04-25T14:42:46.944Z" (4/25/2026, 2:42:46 PM)
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 590
moderation_latency: (null)
generation_time: 16687
tokens_prompt: 77
tokens_completion: 1224
native_tokens_prompt: 67
native_tokens_completion: 1110
native_tokens_completion_images: (null)
native_tokens_reasoning: 69
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00032018
router: (null)
provider_responses:
  0:
    endpoint_id: "722e0746-2034-4d6e-a08f-e6271849ee9e"
    id: "6247f321-b43c-4425-8819-ba3e3f83a9f9"
    is_byok: false
    latency: 350
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777128166-90YM6CqUcpcjGDDrMuuB"
session_id: (null)
api_type: "completions"
id: "gen-1777128166-NQVbWoEFGdxAMHxQlKfb"
upstream_id: "6247f321-b43c-4425-8819-ba3e3f83a9f9"
total_cost: 0.00032018
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
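The timing and token fields above can be combined into a rough throughput figure. The sketch below is illustrative only: it assumes generation_time is in milliseconds (which is consistent with the 16.8s duration shown in the run details) and that native_tokens_completion is the right token count to divide by. The helper name tokens_per_second is hypothetical and not part of NC Bench or the provider API.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Approximate decode throughput; assumes the time is given in milliseconds."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from the metadata listing for this run.
run_meta = {
    "generation_time": 16687,          # assumed to be milliseconds
    "native_tokens_completion": 1110,  # provider-side completion token count
}

rate = tokens_per_second(
    run_meta["native_tokens_completion"],
    run_meta["generation_time"],
)
print(f"{rate:.1f} tokens/s")
```

Under those assumptions this run works out to roughly 66.5 completion tokens per second. Whether the 69 reasoning tokens are already included in native_tokens_completion is not stated in the metadata, so the figure should be read as an estimate.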
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 709
0.00% | Dialogue to Total Word Ratio | Ratio: 67.74%, Deviation: 17.74% | neededClean: false; wordsTotal: 716; wordsDialogue: 485
0.0000% | Overall
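The reported ratio and deviation follow directly from the two word counts: 485 dialogue words out of 716 total is about 67.74%, which is 17.74 percentage points away from the requested 50%. The sketch below reproduces that arithmetic and shows one possible way to count dialogue words. The counting rule (whitespace-split words inside curly double quotes) is an assumption made for illustration; the page does not show how NC Bench tokenizes or detects dialogue, and the two evaluators report different totals (709 and 716), so the real counters evidently differ from each other. The function names are hypothetical.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumption: dialogue is whatever sits between curly double quotes,
    and a word is any whitespace-separated token.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r"“([^”]*)”", text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio: float, target: float = 0.5) -> float:
    """Absolute distance between the observed ratio and the target ratio."""
    return abs(ratio - target)


# Counts copied from the evaluation details for this run.
reported_ratio = 485 / 716
reported_deviation = deviation_from_target(reported_ratio)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {reported_deviation:.2%}")
# prints "Ratio: 67.74%, Deviation: 17.74%"
```

The mapping from deviation to the 0.00% score is not shown on the page, so it is left out of the sketch. The same applies to the word-count evaluator, which reports 709 words against the 500-word target without stating its tolerance.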