NC Bench
Run: 2026-02-16-01KHM5K9D2FQ1C76S1SG9QQG6Q
Run Details
Model:
DeepSeek V3.1
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
27.4s
Cost:
$0.000835
Provider:
SiliconFlow
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The air in the cramped van was thick with the smell of old coffee and desperation. Molly sighed for what felt like the tenth time in as many minutes. "I'm just going to say it again," she started, not looking up from the brightly colored schematic laid out on a crate between them. "This feels... low." Bill adjusted the jeweler's loupe screwed into his eye, scrutinizing a grainy photograph. "It's not low, Molls. It's precision work. A soft target. Low security, high emotional payout." "High emotional payout? It's a six-year-old's birthday party. We're talking plastic dinosaurs and finger paint sets." "Nonsense!" Bill tapped the photo. "Little Tarquin Wellington-St. John. His father is 'Additive Manufacturing Solutions' CEO. That means cutting-edge, pricey toys. We're talking mini-drones with AI, interactive smart-globes, programmable robot dogs." "A robot dog?" Molly perked up slightly. "The Chihuahua 3000. Retails for eight hundred quid. Now, look here." He pointed a grimy finger at the schematic of the sprawling backyard. "Primary ingress is the side gate. It's a simple latch. Distraction will be the key." "What's the distraction? A fake ice cream truck? A rogue clown?" Bill smiled, a proud glint in his eye. "Better. *You* are the distraction." Molly stared at him. "Excuse me?" "You'll go as 'Auntie Marjorie,' a distant relative. You'll bring a large, cumbersome gift—a wobbly Jenga set, perhaps—and upon entering, you will tragically and noisily suffer a mild allergic reaction to the family cat." "I'm not allergic to cats." "Today you are! A few theatrical sneezes, a dramatic stumble into the buffet table. While all eyes are on you and the collapsing sausage rolls, I slip in through the gate, dressed as a caterer." "Caterer? Bill, you look like you wrestle badgers for a living." "That's what the uniform is for! I'll have a tray. Very official. I blend, I mingle, I identify the gift pile, and I start loading the tray. 
'Just taking these to the gift-wrapping station, madam.' It's foolproof." "The gift-wrapping station? Who has a gift-wrapping station *at* the party?" "Rich people, Molly! They have stations for everything! A standing-still station! A leaf-appreciation station! Don't question the methodology." Molly massaged her temples. "Okay, say it works. Extraction?" "The same way. You, recovered from your 'fit,' offer to help the overwhelmed caterer—that's me—carry some of the heavier gifts inside. We walk right out the front door and into the van." "What about the actual caterers?" "We'll have a sign on the van that says 'Premium Catering.' They'll just think we're the B-team." A long silence filled the van, broken only by the distant sound of traffic. Molly finally looked at Bill, his face a mixture of sheer genius and utter madness. "So, to recap," she said slowly. "Our grand heist hinges on me fake-sneezing, you pretending to know how to carry a tray, and us hoping that a bunch of adults at a child's party are too polite to question a man carrying a pile of presents towards a van that says 'Premium Catering'." Bill beamed. "See? You've got it. It's elegant in its simplicity." Molly picked up the photograph of little Tarquin, who was staring blankly at an expensive-looking pony. She sighed one last, long sigh. "Fine. But I'm keeping the robot dog." "Deal," said Bill, slamming the schematic with his hand. "Operation: Ponies and Plunder is a go!"
created_at
"2026-02-16T21:27:39.478686+00:00"
2/16/2026, 9:27:39 PM
model
"deepseek/deepseek-chat-v3.1"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
2151
moderation_latency
(null)
generation_time
27339
tokens_prompt
95
tokens_completion
901
native_tokens_prompt
70
native_tokens_completion
816
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0008349
router
(null)
provider_responses
0
id
"019c6859aa818a4b80938fb1ad9e805f"
status
200
is_byok
false
latency
2122
endpoint_id
"5d5c9c1d-af0d-4e4a-8aee-c5cd2b3883f3"
provider_name
"SiliconFlow"
model_permaslug
"deepseek/deepseek-chat-v3.1"
api_type
"completions"
id
"gen-1771277231-kWnFRx5JCCRtdBhAA49W"
upstream_id
"019c6859aa818a4b80938fb1ad9e805f"
total_cost
0.0008349
cache_discount
(null)
upstream_inference_cost
0
provider_name
"SiliconFlow"
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false, words: 575
78.9548% | Dialogue to Total Word Ratio | Ratio: 72.17%, Deviation: 2.17% | neededClean: false, wordsTotal: 575, wordsDialogue: 415
39.4774%
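The reported ratio and deviation can be reproduced from the word counts in the evaluation details. The sketch below is illustrative only: the function names `dialogue_ratio` and `deviation_from_target` are hypothetical, and the counting rule (whitespace-separated words, dialogue taken as text inside straight double quotes) is an assumption, not the evaluator's documented method. The page does not state how the 78.9548% score is derived from the deviation, so that step is not reproduced here. The final 39.4774% figure is numerically consistent with the mean of the two evaluator results, which the last lines check.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumption: dialogue is any text between straight double quotes,
    and words are whitespace-separated tokens.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio: float, target: float = 0.70) -> float:
    """Absolute distance between the measured ratio and the 70% target."""
    return abs(ratio - target)


# Figures reported for this run: wordsDialogue = 415, wordsTotal = 575.
reported_ratio = 415 / 575
reported_deviation = deviation_from_target(reported_ratio)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {reported_deviation:.2%}")
# prints "Ratio: 72.17%, Deviation: 2.17%"

# The final figure matches the mean of the two evaluator results.
mean_result = (0.0000 + 78.9548) / 2
print(f"{mean_result:.4f}%")  # prints "39.4774%"
```

Running `dialogue_ratio` on the full response may not return exactly 415 and 575, since the evaluator's tokenization and quote handling are unknown.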