NC Bench
Run: 2026-02-16-01KHM5QVSD6FBSMZZBFY7PHXJ0
Run Details
Model: DeepSeek V3 (2025-03-24)
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 7.2s
Cost: $0.000398
Provider: SiliconFlow
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
**Molly:** You sure this is the right house? **Bill:** Positive. Little Timmy's sixth birthday. Rumor says his parents went all out this year. **Molly:** So our grand heist is robbing a six-year-old? **Bill:** Think bigger. Kids these days get hoverboards, gold-plated Legos, designer diapers— **Molly:** Designer *what*? **Bill:** Rich people, Mol. They’ll diaper their kids in diamonds if they could. Point is, we grab the haul, fence it fast, and retire early. **Molly:** Assuming we don’t get taken down by a hyperactive kindergartener with a juice box. **Bill:** Relax. Kid parties are chaos—balloons, clowns, sugar-fueled meltdowns. No one’ll notice two more grown-ups in silly hats. **Molly:** You’re wearing a silly hat? **Bill:** It’s *disguise*. And *you’re* the clown. **Molly:** *Excuse me?* **Bill:** You got the face paint, the wig— **Molly:** This is my *actual hair*! **Bill:** Even better. Natural camouflage. **Molly:** If this goes wrong, I’m blaming you. And throwing you to the kids. **Bill:** Deal. But if we pull it off, we split the loot fifty-fifty. **Molly:** Sixty-forty. **Bill:** Fifty-five–forty-five. **Molly:** Done. Now let’s go steal some stuffed animals. **Bill:** And remember—act like you belong. **Molly:** Easy. I *always* belong at a piñata massacre.
created_at
"2026-02-16T21:29:49.457147+00:00"
2/16/2026, 9:29:49 PM
model
"deepseek/deepseek-chat-v3-0324"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1352
moderation_latency
(null)
generation_time
7128
tokens_prompt
77
tokens_completion
378
native_tokens_prompt
66
native_tokens_completion
381
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0003975
router
(null)
provider_responses
0
id
"019c685bf0061be1fb244b28844c79a9"
status
200
is_byok
false
latency
1345
endpoint_id
"d562908f-c5eb-4a32-9a91-24d53b176fd8"
provider_name
"SiliconFlow"
model_permaslug
"deepseek/deepseek-chat-v3-0324"
api_type
"completions"
id
"gen-1771277381-7VWsbxzAxeWgSB58o1JS"
upstream_id
"019c685bf0061be1fb244b28844c79a9"
total_cost
0.0003975
cache_discount
(null)
upstream_inference_cost
0
provider_name
"SiliconFlow"
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 97.6276% | Matches word count | n/a | neededClean: false; words: 207 |
| 0.0000% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 90.00% | neededClean: false; wordsTotal: 207; wordsDialogue: 0 |
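The evaluator reports wordsDialogue as 0 even though the response is almost entirely speech. One possible explanation, which this page does not confirm, is that the evaluator counts only words inside quotation marks, while the response uses script-style speaker labels (`**Molly:**`) with no quotes. The sketch below is a hypothetical re-implementation of such a counter, not NC Bench's actual evaluator; the function name `dialogue_ratio` and the quote-matching rule are assumptions made for illustration.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumption: only words enclosed in double quotes count as dialogue.
    """
    # Normalise curly double quotes so both quote styles are matched.
    normalised = text.replace("\u201c", '"').replace("\u201d", '"')
    total = len(normalised.split())
    # Each match is the text between one pair of double quotes.
    quoted_spans = re.findall(r'"([^"]*)"', normalised)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


# Script-style labels, as in the model response: no quotation marks at all.
script_style = "**Molly:** You sure this is the right house? **Bill:** Positive."
# The same exchange written as quoted prose dialogue.
prose_style = '"You sure this is the right house?" Molly asked. "Positive," said Bill.'

print(dialogue_ratio(script_style))  # no quoted words are found
print(dialogue_ratio(prose_style))   # quoted words are counted as dialogue
```

Under this assumed rule, a script-formatted scene scores a ratio of zero regardless of how much of it is speech, which would match the 0.00% ratio and 90.00% deviation shown in the table.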
48.8138%
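The final figure is numerically consistent with an unweighted mean of the two evaluator results, although the page does not state how the overall score is aggregated. The following is a minimal check of that arithmetic, with the averaging rule treated as an assumption; the 90.00% deviation is likewise reproduced as the absolute gap between the 90% target in the scenario and the measured 0.00% ratio.

```python
# Evaluator results copied from the table on this page.
scores = {
    "Matches word count": 97.6276,
    "Dialogue to Total Word Ratio": 0.0,
}

# Assumption: the overall score is the unweighted mean of the evaluator scores.
overall = sum(scores.values()) / len(scores)

# Deviation between the target dialogue share and the measured ratio, in percentage points.
target_ratio = 90.00
measured_ratio = 0.00
deviation = abs(target_ratio - measured_ratio)

print(f"Overall: {overall:.4f}%")
print(f"Deviation: {deviation:.2f}%")
```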