NC Bench
Run: 2026-04-25-01KQ2HM5ZKGYW6DA7SW19BJ8EN
Run Details
Model: DeepSeek V4 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 22.9s
Cost: $0.000255
Provider: DeepSeek
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.00%
Overall: 0.0000%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly squinted at the inflatable bouncy castle, her nylon jacket crinkling as she shifted behind the hedge. “Right. So we’ve got two—maybe three—minutes before the entertainer brings out the cake. That’s our window.” Bill tugged at the collar of his borrowed clown suit. “I still don’t get why I’m dressed like a giant fruit loop.” “Because you’re the decoy, Bill. Kids trust a clown. They don’t trust a man wearing a balaclava and smelling of gin.” “I don’t smell of gin.” “You absolutely smell of gin. You smell like a distillery had a fistfight with a bakery and the bakery lost.” Bill huffed, adjusting the oversized red nose. “Fine. So I do the balloon animals, they all gather round, and you… what exactly?” Molly pulled a schematic from her pocket—it was a crayon drawing of the party layout, which she had bribed from a five-year-old for a lollipop. “While you’re twisting poodles, I slip through the side gate, cut the ribbon on the present table, and fill the sack.” “The present table’s right next to the mum with the camcorder.” “The mum with the camcorder is currently crying because her son just threw up on the vegan cupcakes. I timed it. She’ll be in the kitchen for at least four minutes.” Bill squinted at her. “You timed a child’s vomiting?” “I’m a professional, Bill. I also noted that the birthday boy’s name is Barnaby, which is, in my professional opinion, a crime in itself. Who names a child Barnaby? It sounds like a Victorian butler who eats glue.” “Focus, Molly. What if the entertainer sees you?” “The entertainer is a nineteen-year-old drama student named Chad who’s already three Jell-O shots deep. He won’t see a thing. And even if he does, I’ll just say I’m the aunt from New Zealand.” “You don’t sound like you’re from New Zealand.” “I’ll do a bad Australian accent. They never check.” Bill sighed, the squeaky honk of his red nose punctuating the air. “And the getaway?” “The inflatable slide. We slide down, out through the back fence. 
There’s a van waiting labeled ‘Discount Dental Supplies.’ No one questions a dental van.” “We don’t have a van.” “We will have a van. I’ve got a man named Kevin on standby. Kevin owes me a favor because I didn’t turn him in for the Great Rubber Ducky Heist of last March.” “That was a pet store, Molly. You stole a thousand rubber ducks. They still talk about it on local news.” “Exactly. I’m legendary. Now, you remember the signal?” “When I hear the word ‘unicorn,’ I start juggling.” “No, you numpty. When I yell ‘Unicorn!’ it means the mother’s coming back. You fake a collapse. Heart attack. Big drama. Draw all the attention.” “I’m not faking a heart attack in a clown suit. That’s undignified.” “You’re a thief in a clown suit, Bill. Dignity left the building when you agreed to wear those size-18 floppy shoes. Now, let’s go. Barnaby’s presents aren’t going to steal themselves.” Bill stared at the bouncy castle, the screech of children a war cry in the distance. “This is the daftest plan we’ve ever had.” “Which is exactly why it’ll work. Nobody expects two idiots to rob a birthday party.” She patted his cheek. “Now go, be the clown you were always meant to be.”
created_at: "2026-04-25T14:45:14.871Z" (4/25/2026, 2:45:14 PM)
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 641
moderation_latency: (null)
generation_time: 12707
tokens_prompt: 77
tokens_completion: 968
native_tokens_prompt: 67
native_tokens_completion: 876
native_tokens_completion_images: (null)
native_tokens_reasoning: 57
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00025466
router: (null)
provider_responses:
  0:
    endpoint_id: "934a69f9-bd54-474b-beca-24560f721e12"
    is_byok: false
    latency: 48
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepInfra"
    status: (null)
  1:
    endpoint_id: "722e0746-2034-4d6e-a08f-e6271849ee9e"
    id: "96109ca1-8ddd-426d-849c-00102c9d95a1"
    is_byok: false
    latency: 356
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777128314-e49tGhaYY8nOo8eIUfUA"
session_id: (null)
api_type: "completions"
id: "gen-1777128314-OqQ2wKrDoCjQ79amFfD1"
upstream_id: "96109ca1-8ddd-426d-849c-00102c9d95a1"
total_cost: 0.00025466
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
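The metadata fields can be combined into a rough throughput figure. The sketch below is illustrative only: it assumes that generation_time is reported in milliseconds and that native_tokens_completion is the right token count to divide by. Neither assumption is stated on this page, and the helper name tokens_per_second is hypothetical.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Rough decode throughput; assumes generation_time is in milliseconds."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from the metadata above:
# native_tokens_completion = 876, generation_time = 12707
tps = tokens_per_second(876, 12707)
print(f"{tps:.1f} tokens/s")  # prints "68.9 tokens/s" under the millisecond assumption
```

If generation_time turns out to use a different unit, only the divisor in the helper needs to change.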
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false; words: 557

Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 82.59%, Deviation: 52.59%
Meta Data: neededClean: false; wordsTotal: 563; wordsDialogue: 465

Overall: 0.0000%
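The reported ratio and deviation follow from the two recorded counts: 465 dialogue words out of 563 total, measured against the 30% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words, by treating text inside double quotes as dialogue. The quote-matching rule and the function names are assumptions, not NC Bench's actual evaluator. The two evaluators on this page report different totals (557 and 563), so they evidently count words differently, and a plain whitespace split need not match either.

```python
import re

# Assumption: dialogue is any text enclosed in curly or straight double quotes.
DIALOGUE_RE = re.compile(r'“([^”]*)”|"([^"]*)"')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (a simplifying assumption)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Sum the word counts of every quoted span in the text."""
    total = 0
    for match in DIALOGUE_RE.finditer(text):
        quoted = match.group(1) if match.group(1) is not None else match.group(2)
        total += count_words(quoted)
    return total


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.30):
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts copied from the evaluation details above.
ratio, deviation = ratio_and_deviation(465, 563)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 82.59%, Deviation: 52.59%"

sample = "Bill sighed. “And the getaway?” Molly grinned."
print(count_dialogue_words(sample), count_words(sample))  # prints "3 7"
```

How the deviation maps to the 0.00% score is not shown on this page, so the sketch stops at the deviation.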