NC Bench
Run: 2026-04-25-01KQ2H8X4YCHH43C0CZC2Q5PKW
Run Details
Model: DeepSeek V4 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 11.9s
Cost: $0.000243
Provider: DeepSeek
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
0.0004% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly tapped the blueprint spread across the greasy diner table. “Right, so the Magical Unicorn Palace is a rented hall attached to a bouncy castle emporium. The party’s for a kid named Bartholomew. The presents are stacked under a banner that says ‘Happy 7th, Bart the Fart.’ Security is one mom named Brenda, who is on her third glass of ‘juice box’—it’s wine.” Bill squinted at the crudely drawn star on the paper. “You want us to break into a child’s birthday party. For presents. I do jewelry heists, Molly. I’ve lifted a Fabergé egg from a Russian oligarch. This feels… morally downward.” “Morals don’t pay for your avocado toast, Bill. These are premium toys. I scouted the haul. There’s a limited-edition Holographic Pikachu that goes for six hundred on eBay. Plus a cookie-making robot that squirts frosting. That’s untraceable, high-end gear.” Bill shoved a fry in his mouth. “And how do we get in? We can’t exactly melt through a ceiling vent like we’re in a bad action film. It’s a party for seven-year-olds. They’ll notice two adults in black turtlenecks.” “We go as clowns,” Molly said. “Excuse me?” “It’s brilliant. Brenda’s cousin was supposed to bring a clown named Bubbles. He bailed. There’s a vacancy. We show up half an hour late, say Bubbles’s van had a flat tire, and boom—we’re inside. While I keep the kids busy with balloon animals, you scoop the gift table into a duffel bag I’ll have hidden in the oversized clown pants.” Bill stared at her. “You own oversized clown pants?” “I own everything,” Molly said, unfolding a napkin to reveal a rainbow wig and a red nose. “I prepared.” “This is insane. I can’t do balloon animals. The last thing I made was a swan that looked like a deformed chicken. The kid cried.” “You don’t have to be good. They’re seven. Make a long, twisted dog. It’s fine. The key is the distraction. I’ll also do the face painting. I brought a kit. What’s your face? You look like you’re having a stroke.” Bill rubbed his temples. 
“Why can’t we just wait until the party ends and grab the gifts from the car? That’s less clown-based humiliation.” “Because Bart the Fart’s mom is a paranoid organizer. She brings all presents inside immediately and locks them in a coat closet until the cake is cut. We have to strike at the height of the sugar rush. That’s when the chaos peaks.” Bill sighed, long and theatrical. “Fine. But I’m not doing the voice. No squeaky clown voice.” “Deal. But you have to wear the wig.” “Why me?” “Because my wig is actually a balaclava in disguise. You get the rainbow one. It smells like bleach and regret.” Bill picked it up with two fingers. “I hate this.” “You’ll love it when we’re splitting the Pikachu money. Now, the plan: I enter first, making balloon disasters. You come in through the back kitchen door, grab the gift sack, and exit through the bathroom window. We meet behind the dumpster at exactly 3:47.” “And if a kid sees me?” “Then you make a balloon dog. And you smile. And you never speak of it again.”
created_at: "2026-04-25T14:39:05.378Z" (4/25/2026, 2:39:05 PM)
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 564
moderation_latency: (null)
generation_time: 11824
tokens_prompt: 77
tokens_completion: 903
native_tokens_prompt: 67
native_tokens_completion: 834
native_tokens_completion_images: (null)
native_tokens_reasoning: 52
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0002429
router: (null)
provider_responses:
  0:
    endpoint_id: "722e0746-2034-4d6e-a08f-e6271849ee9e"
    id: "dbb66501-6d0b-4fa0-8ddb-a283212de777"
    is_byok: false
    latency: 368
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777127945-ZYIHOD9yOa6bTI8wPHuH"
session_id: (null)
api_type: "completions"
id: "gen-1777127945-11Ssx03LTF0Vc91RuO7S"
upstream_id: "dbb66501-6d0b-4fa0-8ddb-a283212de777"
total_cost: 0.0002429
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 533
0.00% | Dialogue to Total Word Ratio | Ratio: 88.31%, Deviation: 18.31% | neededClean: false, wordsTotal: 539, wordsDialogue: 476
0.0004% | Overall
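The Ratio and Deviation figures reported for the Dialogue to Total Word Ratio evaluator can be reproduced from the wordsDialogue and wordsTotal counts and the 70% target in the scenario. The sketch below is only an illustration: the function names are hypothetical, the deviation is assumed to be the absolute difference from the target, and the quote-based counter is an assumed approach (text between curly double quotes, whitespace-separated words), not the benchmark's actual tokenizer. The two evaluators report different totals (533 and 539), so the real counting rules evidently differ from any single simple rule.

```python
import re

# Assumption: dialogue is whatever sits between curly double quotes.
DIALOGUE_SPAN = re.compile(r"“([^”]*)”")


def count_words(text: str) -> int:
    """Count whitespace-separated words (an assumed, simplistic rule)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside curly-quoted spans."""
    return sum(count_words(span) for span in DIALOGUE_SPAN.findall(text))


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of all words that are dialogue."""
    return words_dialogue / words_total


def deviation_from_target(ratio: float, target: float) -> float:
    """Absolute distance between the observed ratio and the target."""
    return abs(ratio - target)


# Counts taken from the evaluation details of this run.
ratio = dialogue_ratio(476, 539)
deviation = deviation_from_target(ratio, 0.70)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 88.31%, Deviation: 18.31%"

# Small usage example for the quote-based counter.
sample = "Molly said, “We go as clowns.” Bill stared. “Excuse me?”"
print(count_dialogue_words(sample), count_words(sample))  # prints "6 10"
```

How the evaluator maps an 18.31% deviation to a 0.00% score is not shown on this page, so the sketch stops at the deviation.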