NC Bench
Run: 2026-02-16-01KHM5ZB992D5V43QCF0FA8DEJ
Run Details
Model:
DeepSeek V3 (2025-03-24)
Test:
Dialogue tags
Scenario:
Write 200 words with 90% dialogue
Duration:
8.8s
Cost:
$0.000316
Provider:
SiliconFlow
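A rough throughput figure for this run can be derived from the response metadata recorded further down the page. The sketch below assumes that generation_time is in milliseconds (this matches the 8.8s duration shown above, but the page does not state the unit) and uses the native_tokens_completion count. The function name is illustrative and not part of NC Bench.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Completion tokens produced per second of generation time."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000)

# Values copied from the metadata: native_tokens_completion and generation_time.
rate = tokens_per_second(299, 8691)
print(f"{rate:.1f} tokens/s")
```

Under that unit assumption, the run works out to roughly 34 completion tokens per second.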
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
“Alright, Bill, let’s go over the plan one more time,” Molly said, crouched behind a bush outside the brightly lit party venue. “Easy. We wait for the clown to leave his van unattended, then we knock him out and take his costume,” Bill said, adjusting his black beanie. “No, no, no. The clown’s the groom’s brother. He’s not leaving. We grab the magician instead.” “But the magician’s got that tiny assistant—she’ll notice!” “Then we threaten her with a rubber chicken.” Bill blinked. “A rubber chicken.” “It’s intimidating if you wave it right.” Molly pulled one from her bag and shook it menacingly. “Fine. So we’re a magician and his assistant. Then what?” “The presents are in the backyard under the piñata. We distract the kids—” “How?” “You pull a rabbit out of your hat.” “I don’t know how to do that!” “Just grab one from the petting zoo and shove it in there. Kids won’t notice.” Bill groaned. “What if the parents catch us?” Molly grinned. “We blame the clown. Classic misdirection.” Bill paused. “You’re terrifying.” “Thank you. Now let’s go steal some Legos.”
created_at
"2026-02-16T21:33:56.07043+00:00"
2/16/2026, 9:33:56 PM
model
"deepseek/deepseek-chat-v3-0324"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
2075
moderation_latency
(null)
generation_time
8691
tokens_prompt
77
tokens_completion
341
native_tokens_prompt
66
native_tokens_completion
299
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0003155
router
(null)
provider_responses
0
id
"019c685fae35fe4768a9d5c559bc9ee4"
status
200
is_byok
false
latency
2075
endpoint_id
"d562908f-c5eb-4a32-9a91-24d53b176fd8"
provider_name
"SiliconFlow"
model_permaslug
"deepseek/deepseek-chat-v3-0324"
api_type
"completions"
id
"gen-1771277626-QnYdmq51Jq0qFFb4fJZq"
upstream_id
"019c685fae35fe4768a9d5c559bc9ee4"
total_cost
0.0003155
cache_discount
(null)
upstream_inference_cost
0
provider_name
"SiliconFlow"
Evaluation details
Result | Evaluator | Details | Meta Data
60.2752% | Matches word count | n/a | neededClean: false, words: 185
44.9018% | Dialogue to Total Word Ratio | Ratio: 80.54%, Deviation: 9.46% | neededClean: false, wordsTotal: 185, wordsDialogue: 149
52.5885%
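The figures in the Dialogue to Total Word Ratio row can be checked with simple arithmetic: 149 dialogue words out of 185 total gives 80.54%, which is 9.46 points short of the 90% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words, by treating text inside curly quotes as dialogue. The counting function is an assumption for illustration, since the evaluator's actual tokenization rules are not shown on this page, and the sample sentence is made up. The final 52.5885% figure happens to equal the mean of the two evaluator results. The page does not label it, so treating it as an overall score is an inference.

```python
import re

def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for curly-quoted dialogue."""
    total = len(text.split())
    # Everything between an opening “ and the next closing ” counts as dialogue.
    spoken = re.findall(r"“([^”]*)”", text)
    dialogue = sum(len(part.split()) for part in spoken)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio

# Hypothetical sample, not the evaluated response.
sample = "“Grab the rabbit and run for the van,” Molly said."
sample_dialogue, sample_total, sample_ratio = dialogue_ratio(sample)

# Reported counts from the Meta Data column.
reported_ratio = 149 / 185        # wordsDialogue / wordsTotal
deviation = 0.90 - reported_ratio # shortfall against the 90% target
mean_result = (60.2752 + 44.9018) / 2

print(f"Ratio: {reported_ratio:.2%}, Deviation: {deviation:.2%}")
print(f"Mean of evaluator results: {mean_result:.4f}%")
```

A quote-based counter like this depends on the response using paired curly quotes consistently. Straight quotes or an unclosed quotation would need extra handling.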