NC Bench
Run: 2026-04-25-01KQ2EGV0SHEQGZK9H1EM885QH
Run Details
Model: DeepSeek V4 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 2m 40s
Cost: $0.000120
Provider: DeepInfra
Evaluation summary
Score | Evaluator | Details Link
51.93% | Matches word count | (details)
99.06% | Dialogue to Total Word Ratio | (details)
75.4916% | Overall |
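The page does not say how the Overall figure is derived. As a rough check, and assuming (this is not stated on the page) that Overall is an unweighted mean of the two evaluator scores, the sketch below averages the displayed percentages. The result is close to, but not exactly, the reported 75.4916%, which would be consistent with the site averaging unrounded scores.

```python
# Assumption: "Overall" is an unweighted mean of the evaluator scores.
# The page does not document the aggregation, so treat this as a sanity check only.
scores = {
    "Matches word count": 51.93,
    "Dialogue to Total Word Ratio": 99.06,
}

overall_from_rounded = sum(scores.values()) / len(scores)
reported_overall = 75.4916  # value shown on the page

# Gap between the mean of the displayed (rounded) scores and the reported value.
gap = abs(overall_from_rounded - reported_overall)

print(f"mean of displayed scores: {overall_from_rounded:.4f}%")
print(f"reported overall:         {reported_overall:.4f}%")
print(f"gap:                      {gap:.4f} percentage points")
```

The small gap means the displayed per-evaluator scores alone are not enough to reproduce the Overall value to four decimal places.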
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
“Okay, so we waltz in, you distract the birthday kid, I grab the haul.” Molly adjusted her fake glasses. “Waltz? Molly, we’re crashing a six-year-old’s party, not a gala.” Bill tugged at his clown wig. “And you’re dressed like a librarian. Subtle.” “Librarians are invisible, Bill. No one questions a woman holding a clipboard. What’s your plan? Honk your nose and ask for a balloon animal?” “That’s genius. I’ll ask for a poodle. While the clown’s twisting, you vacuum the gift table.” “Vacuum? I’m not hauling a vacuum cleaner.” “Figure of speech! Grab the wrapping paper. Jeez. You think the parents will notice if we switch the Nintendo Switch for a box of crayons?” “They’ll notice when the kid cries. We need a decoy gift.” “Like what? A rock?” “Better. A signed photo of me.” “You don’t have a signed photo of you.” “I’ll print one. ‘To Timmy, keep dreaming – Molly the Gift Fairy.’ They’ll be so confused they’ll forget the presents.” Bill honked his nose. “I hate this plan.” “You said that about the bank job. And the museum heist. And that time we stole a prizewinning pig.” “That pig was delicious. Fine. We go in. You distract. I steal. But if I end up doing the chicken dance, you’re buying the next round.” “Deal.”
created_at: "2026-04-25T13:50:59.615Z"
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2640
moderation_latency: (null)
generation_time: 160227
tokens_prompt: 77
tokens_completion: 406
native_tokens_prompt: 67
native_tokens_completion: 394
native_tokens_completion_images: (null)
native_tokens_reasoning: 62
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0001197
router: (null)
provider_responses:
  0:
    endpoint_id: "934a69f9-bd54-474b-beca-24560f721e12"
    id: "361826c14bcd4b5fb3a790edeb1ae34f"
    is_byok: false
    latency: 62
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepInfra"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777125059-xTBkXwrIWGCXpVEJOGiK"
session_id: (null)
api_type: "completions"
id: "gen-1777125059-tGRKfakNV680Fc9JeBjb"
upstream_id: "361826c14bcd4b5fb3a790edeb1ae34f"
total_cost: 0.0001197
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
response_cache_source_id: (null)
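The Duration and Cost shown under Run Details appear to come from the generation_time and usage fields above. The sketch below reproduces them under two assumptions that the page does not state: generation_time is in milliseconds and usage is in US dollars. The helper name format_duration is hypothetical, and the throughput figure is a derived estimate that the page does not report.

```python
def format_duration(ms: int) -> str:
    """Render a millisecond duration as 'Xm Ys', dropping the fractional second."""
    total_seconds = ms // 1000
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


# Values copied from the metadata above.
generation_time_ms = 160227      # assumed to be milliseconds
usage_usd = 0.0001197            # assumed to be US dollars
native_tokens_completion = 394

duration = format_duration(generation_time_ms)
cost = f"${usage_usd:.6f}"

# Derived estimate, not shown on the page: completion tokens per second.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(duration)                    # 2m 40s
print(cost)                        # $0.000120
print(f"{tokens_per_second:.2f}")  # 2.46
```

Under these assumptions the formatted values match the Run Details panel, which supports reading generation_time as milliseconds.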
Evaluation details
Result | Evaluator | Details | Meta Data
51.93% | Matches word count | n/a | neededClean: false, words: 216
99.06% | Dialogue to Total Word Ratio | Ratio: 93.12%, Deviation: 3.12% | neededClean: false, wordsTotal: 218, wordsDialogue: 203
75.4916% | Overall | |
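The reported ratio and deviation follow directly from the word counts: 203 dialogue words out of 218 total, measured against the 90% target in the scenario. The sketch below recomputes both. It also includes a simple quote-based counter as an illustration of how such counts might be obtained. The evaluator's actual tokenization is not documented on the page (the two evaluators even report different totals, 216 and 218), so dialogue_word_counts is a hypothetical helper and may not reproduce the site's counts on the full response.

```python
import re


def dialogue_word_counts(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words).

    Assumption: dialogue is any text between double quotes (curly or straight),
    and words are whitespace-separated tokens.
    """
    spans = re.findall(r'[“"]([^”"]*)[”"]', text)
    dialogue_words = sum(len(span.split()) for span in spans)
    total_words = len(text.split())
    return dialogue_words, total_words


def ratio_and_deviation(dialogue_words: int, total_words: int,
                        target: float = 0.90) -> tuple[float, float]:
    """Return the dialogue ratio and its absolute distance from the target ratio."""
    ratio = dialogue_words / total_words
    return ratio, abs(ratio - target)


# Recompute the reported figures from the counts in the table.
ratio, deviation = ratio_and_deviation(203, 218)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 93.12%, Deviation: 3.12%

# Small illustration of the counter on one line from the response.
sample_counts = dialogue_word_counts("“Deal.” Bill honked his nose.")
print(sample_counts)  # (1, 5)
```

How the 3.12% deviation maps to the 99.06% score is not shown on the page, so the sketch stops at the deviation.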