NC Bench
Run: 2026-04-25-01KQ2JSDCH6AV2A10N2X5JSEWY
Run Details
Model: DeepSeek V4 Flash
Test: Dialogue tags
Scenario: Write unattributed dialogue
Duration: 5.4s
Cost: $0.000086
Provider: DeepSeek
Evaluation summary
Score: 100.00%
Evaluator: Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
Messages
AI
“I swear, if one more person asks me if the snickerdoodles have arsenic in them, I’m going to add some.”
“You’d have a line out the door. Free publicity.”
“Not the point, Martha. I’m trying to sell baked goods, not become a crime scene.”
A sugar-dusted hand nudged a porcelain plate across the counter. “Try the lemon bar. It’ll soothe your soul.”
“My soul is beyond lemon. Did you hear what Harold said at the town meeting? That I’m the ‘last person to see Elvira alive’ because I sold her a blueberry muffin at six in the morning. A muffin.”
“A very assertive muffin.”
“It was a normal muffin!”
A pause. The clink of a fork against the plate. “Then why’d you put rosemary in it?”
“Because she asked for it! She said the usual was boring. Rosemary, lemon zest, a hint of hazelnut… it was inspired.”
“And now she’s dead.”
“People die after eating a lot of things. Doesn’t make the baker a suspect.”
“No. But it makes the baker a very interesting plot point.” The speaker nodded toward the window, where Chief Butler’s patrol car had just pulled up. “Speaking of which.”
Meta
created_at: "2026-04-25T15:05:34.87Z" (4/25/2026, 3:05:34 PM)
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 793
moderation_latency: (null)
generation_time: 5301
tokens_prompt: 81
tokens_completion: 315
native_tokens_prompt: 65
native_tokens_completion: 274
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00008582
router: (null)
provider_responses:
  0:
    endpoint_id: "722e0746-2034-4d6e-a08f-e6271849ee9e"
    id: "5e4d7dc4-73ec-49e4-aa8c-b344b537d5c0"
    is_byok: false
    latency: 365
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777129534-r93NqaKlA8wNz0tr3LF4"
session_id: (null)
api_type: "completions"
id: "gen-1777129534-c9mAxFIZzIhitsYRHozG"
upstream_id: "5e4d7dc4-73ec-49e4-aa8c-b344b537d5c0"
total_cost: 0.00008582
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
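The timing and token fields in the metadata above can be combined into a rough throughput figure. The sketch below assumes that generation_time is reported in milliseconds (the page does not state the unit, though 5301 is consistent with the 5.4s duration) and that native_tokens_completion is the relevant token count. The function name tokens_per_second is hypothetical and not part of NC Bench.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Rough output throughput, assuming generation_time_ms is in milliseconds."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from this run: native_tokens_completion and generation_time.
rate = tokens_per_second(274, 5301)
print(f"{rate:.1f} tokens/s")  # prints "51.7 tokens/s"
```

This figure covers generation only. The top-level latency field (793) is reported separately and is not included here.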
Evaluation details
Result: 100.00%
Evaluator: Count dialogue tags
Details: n/a
Meta Data:
  neededClean: false
  dialogueTags: (empty)
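The page does not show how the "Count dialogue tags" evaluator works. As a rough illustration of the idea only, a tag counter could look for a closing quotation mark followed by a short attribution such as "she said" or "asked Martha". Everything in this sketch is an assumption: the verb list, the regular expression, and the function name find_dialogue_tags are hypothetical, not NC Bench's actual rules.

```python
import re

# Hypothetical, non-exhaustive list of speech verbs (an assumption for this sketch).
SPEECH_VERBS = ("said", "asked", "replied", "answered", "whispered", "shouted", "muttered")

# A closing quote, whitespace, up to two words (e.g. a name or pronoun),
# then a speech verb. Handles curly and straight closing quotes.
TAG_PATTERN = re.compile(
    r"[”\"]\s+((?:\w+\s+){0,2}(?:" + "|".join(SPEECH_VERBS) + r")\b)",
    re.IGNORECASE,
)


def find_dialogue_tags(text: str) -> list[str]:
    """Return the attribution phrases found after closing quotes."""
    return [match.group(1) for match in TAG_PATTERN.finditer(text)]


tagged = "“It was a normal muffin!” Martha said."
untagged = "“Speaking of which.” The speaker nodded toward the window."
print(find_dialogue_tags(tagged))    # ['Martha said']
print(find_dialogue_tags(untagged))  # []
```

A heuristic like this treats action beats such as "The speaker nodded" as untagged, because "nodded" is not in the verb list. Whether the real evaluator draws the line in the same place is not stated on this page.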