NC Bench
Run: 2026-02-16-01KHM5HVVK67JCW7XEZFXCM6HW
Run Details
Model: DeepSeek V3 (2024-12-26)
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 11.9s
Cost: $0.000504
Provider: Novita
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
**INT. ABANDONED WAREHOUSE – NIGHT** MOLLY: Okay, listen up, Bill. We’re hitting the Thompson kid’s birthday party tomorrow. BILL: A *kid’s party*? Molly, we steal diamonds, not action figures. MOLLY: Think bigger, genius. Presents. Mountains of ’em. Toys, gift cards, maybe even a fancy watch from some rich uncle. BILL: And how do we blend in? You gonna wear a clown nose? MOLLY: Better. We’re the “entertainment.” You’re a magician. I’m your assistant. BILL: I don’t know magic. MOLLY: Neither do most kid-party magicians. Just wave your hands and say “abracadabra” a lot. BILL: And when the kids want actual tricks? MOLLY: Distract ’em with candy. Then, while I’m “helping” with the presents, I’ll load the loot into our getaway piñata. BILL: …Getaway piñata? MOLLY: It’s a *durable* piñata. Trust me. BILL: This is the dumbest plan we’ve ever had. MOLLY: That’s why it’ll work. No one expects thieves to dress like magicians and escape in a giant papier-mâché donkey. BILL: If we get caught, I’m blaming you. MOLLY: Deal. Now practice your wand twirl. You look like you’re swatting bees. BILL: *Abracadabra*, my foot.
created_at
"2026-02-16T21:26:37.741784+00:00"
2/16/2026, 9:26:37 PM
model
"deepseek/deepseek-chat-v3"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1044
moderation_latency
(null)
generation_time
11897
tokens_prompt
77
tokens_completion
339
native_tokens_prompt
66
native_tokens_completion
367
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0005035
router
(null)
provider_responses
0
id
"54ad08368f62306d650ed5c45644885c"
status
200
is_byok
false
latency
1043
endpoint_id
"cefb5a1e-9fea-4496-9a64-4e1b11a7cc8c"
provider_name
"Novita"
model_permaslug
"deepseek/deepseek-chat-v3"
api_type
"completions"
id
"gen-1771277184-1txspxsjRjPIExSMQunw"
upstream_id
"54ad08368f62306d650ed5c45644885c"
total_cost
0.0005035
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Novita"
Evaluation details
Evaluator: Matches word count
Result: 68.1022%
Details: n/a
Meta Data: neededClean = false, words = 186

Evaluator: Dialogue to Total Word Ratio
Result: 0.0000%
Details: Ratio: 1.61%, Deviation: 88.39%
Meta Data: neededClean = false, wordsTotal = 186, wordsDialogue = 3
34.0511%
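The dialogue-ratio figures above can be reproduced with simple arithmetic: 3 dialogue words out of 186 total is about 1.61%, which is 88.39 points short of the 90% target, and 34.0511% is the mean of the two evaluator results. The sketch below assumes, without confirmation from this page, that the evaluator counts only words inside double quotation marks as dialogue. That would be consistent with the script-style response, where only three words appear in quotes. The function names and the regular expression are hypothetical, and the word-splitting rule is a guess that may not reproduce the page's total of 186 exactly.

```python
import re

# Assumption: dialogue is text between double quotes (curly or straight).
# The benchmark's real evaluator is not shown on this page.
QUOTED = re.compile(r'[“"]([^”"]*)[”"]')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed tokenization)."""
    return len(text.split())


def ratio_from_counts(dialogue: int, total: int, target_pct: float = 90.0):
    """Return (ratio %, deviation from target in points), rounded to 2 places."""
    ratio = 100.0 * dialogue / total if total else 0.0
    return round(ratio, 2), round(abs(target_pct - ratio), 2)


def dialogue_stats(text: str, target_pct: float = 90.0) -> dict:
    """Count quoted words against all words in a response."""
    total = count_words(text)
    dialogue = sum(count_words(span) for span in QUOTED.findall(text))
    ratio, deviation = ratio_from_counts(dialogue, total, target_pct)
    return {
        "wordsTotal": total,
        "wordsDialogue": dialogue,
        "ratio": ratio,
        "deviation": deviation,
    }


def mean_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator results, rounded to 4 places."""
    return round(sum(scores) / len(scores), 4)


if __name__ == "__main__":
    # Counts reported on this page: wordsDialogue = 3, wordsTotal = 186.
    print(ratio_from_counts(3, 186))
    # Evaluator results reported on this page.
    print(mean_score([68.1022, 0.0]))
```

Under this reading, speaker-labelled lines such as `MOLLY: ...` contribute nothing to the dialogue count, which would explain why a scene made up almost entirely of speech scores 0.0000% on this evaluator.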