NC Bench
Run: 2026-02-16-01KHM61SVWX7SP12FH9YBQ8BGP
Run Details
Model:
DeepSeek V3.1
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
1m 18s
Cost:
$0.000704
Provider:
DeepInfra
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The warehouse smelled of stale coffee and desperation. Molly tapped a bright pink flyer on the dusty workbench. “It’s a soft target, Bill. The softest. Like a baby’s head.” Bill squinted at the garish print. “Princess Petunia’s Fifth Birthday Extravaganza. Molly, it’s a kid’s party. At a soft-play centre. We’re not exactly the ‘ball pit’ type.” “Precisely!” Molly said, her eyes gleaming. “No one expects professionals. The security is a guy in a cheap dinosaur costume named Kevin. The vault is a glorified linen closet they call the ‘Present Pantry.’ It’s a cakewalk. Literally. There will be cake.” “Our last score was a diamond exchange in Antwerp,” Bill sighed, rubbing his temples. “Now you want us to boost a bunch of Lego sets and a probably-cursed porcelain doll.” “The guest list is a who’s who of the one percent, Bill! Little Tarquin’s parents own a tech empire. Isabella’s dad is that shipping magnate. These aren’t just toys; they’re luxury goods for the under-ten demographic. We’re looking at limited-edition collectibles, solid gold rattles, probably a few unmarked bonds tucked inside the stuffed animals by paranoid fathers.” Bill picked up a crayon from the bench, twirling it in his fingers. “What’s the play?” “Simple,” Molly said, leaning forward. “We go in as entertainers. You’ll be Professor Wiggles, the slightly sad children’s magician.” “Professor Wiggles?” “It’s non-threatening. You create a diversion during the puppet show. A big one. A ‘pull-the-fire-alarm-for-a-different-reason’ level diversion.” “What kind of diversion?” Bill asked, a note of dread in his voice. “I was thinking a confetti cannon malfunction. A really, really sticky confetti cannon malfunction. 
The kind that requires every adult to be elbow-deep in glitter glue for at least twenty minutes.” “And what are you doing while I’m being assaulted by craft supplies?” “I,” Molly said, producing a ridiculously frilly dress from a bag, “will be ‘Auntie Blossom,’ the incredibly efficient and slightly deaf party coordinator. I’ll be bustling about, completely ignored, with a massive gift bag. I’ll ‘reorganize’ the Present Pantry, swapping all the loot into my bag and replacing them with these.” She held up a box of lumpy, poorly wrapped packages. “What’s in those?” “Coal. Literal coal. And a note from ‘The Grumpy Gremlin’ saying they’ve been very naughty. It’s a critique of their rampant consumerism, Bill. We’re adding social commentary.” Bill stared at the frilly dress. “I hate this. I hate all of it. The risk-reward ratio is all wrong. The sheer humiliation factor…” “The take is clean, untraceable, and we can fence it all by Tuesday. No Interpol, no heat, just cash. And,” she added with a sly smile, “we get to keep the cake. A whole three-tiered vanilla-and-rainbow-sprinkle masterpiece.” Bill was silent for a long moment. The warehouse clock ticked. He looked from the pink flyer to the frilly dress, then back to Molly’s utterly serious face. A deep, weary sigh escaped him. “Do I have to wear the full magician’s outfit? The hat with the moons on it?” “The hat is non-negotiable. It sells the character.” “Fine,” he grumbled. “But if I see a single child with sticky fingers coming near me, the deal is off.” Molly clapped her hands. “Excellent! Now, let’s run through your patter. You have to learn how to make a balloon poodle. And you have to say ‘Abracadabra’ with feeling, Bill. Not like you’re ordering a hit. With joy. Try it.”
created_at
"2026-02-16T21:36:26.201028+00:00"
2/16/2026, 9:36:26 PM
model
"deepseek/deepseek-chat-v3.1"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
838
moderation_latency
(null)
generation_time
77994
tokens_prompt
95
tokens_completion
974
native_tokens_prompt
67
native_tokens_completion
874
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
5
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.000704130000001
router
(null)
provider_responses
0
id
"cmpl-266f20e786f947489e701e7648d6f23e"
status
200
is_byok
false
latency
55
endpoint_id
"012f7d1f-23d4-4d03-9aa3-432d8ad82757"
provider_name
"DeepInfra"
model_permaslug
"deepseek/deepseek-chat-v3.1"
api_type
"completions"
id
"gen-1771277707-DcxzU4wAIdkQNq0wl3Vp"
upstream_id
"cmpl-266f20e786f947489e701e7648d6f23e"
total_cost
0.000704130000001
cache_discount
3.99999999e-7
upstream_inference_cost
0
provider_name
"DeepInfra"
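As a rough cross-check of the fields above, the following sketch derives the displayed duration and an approximate output throughput from the metadata. It assumes `generation_time` is reported in milliseconds, which the page does not state but which is consistent with the "1m 18s" duration shown in Run Details. The helper names `tokens_per_second` and `format_duration` are hypothetical and not part of NC Bench or any provider API.

```python
# Values copied from the response metadata above.
GENERATION_TIME_MS = 77994        # assumption: unit is milliseconds
NATIVE_TOKENS_COMPLETION = 874    # provider-native completion token count


def tokens_per_second(tokens: int, elapsed_ms: float) -> float:
    """Average output throughput over the whole generation window."""
    return tokens / (elapsed_ms / 1000.0)


def format_duration(elapsed_ms: float) -> str:
    """Render milliseconds as 'Xm Ys', rounded to the nearest second."""
    total_seconds = round(elapsed_ms / 1000.0)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


if __name__ == "__main__":
    print(format_duration(GENERATION_TIME_MS))
    print(round(tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS), 1))
```

Under that unit assumption, the run averaged roughly 11 native completion tokens per second.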
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false, words: 577
7.9112% | Dialogue to Total Word Ratio | Ratio: 77.12%, Deviation: 7.12% | neededClean: false, wordsTotal: 577, wordsDialogue: 445
3.9556%
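The ratio and deviation reported in the evaluation figures above can be reproduced from the two word counts. The sketch below is a minimal illustration, not the evaluator's actual code: the 70% target comes from the scenario, the function names are hypothetical, and `count_dialogue_words` assumes dialogue is whatever sits between curly quotes, which may differ from how NC Bench counts. How the per-evaluator percentages are derived from the deviation is not stated on the page and is not modeled here.

```python
import re

TARGET_RATIO_PCT = 70.0  # from the scenario: "70% dialogue"


def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Share of words that are dialogue, as a percentage."""
    return 100.0 * words_dialogue / words_total


def deviation_pct(ratio_pct: float, target_pct: float = TARGET_RATIO_PCT) -> float:
    """Absolute distance from the target ratio, in percentage points."""
    return abs(ratio_pct - target_pct)


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words).

    Assumption: dialogue is any text enclosed in curly double quotes.
    """
    total = len(text.split())
    quoted = re.findall(r"“([^”]*)”", text)
    dialogue = sum(len(span.split()) for span in quoted)
    return dialogue, total


if __name__ == "__main__":
    ratio = dialogue_ratio_pct(445, 577)  # counts from the Meta Data column
    print(round(ratio, 2), round(deviation_pct(ratio), 2))
```

With the reported counts (445 dialogue words out of 577), this gives a ratio of 77.12% and a deviation of 7.12 percentage points, matching the Details column. The final 3.9556% is numerically equal to the mean of the two evaluator results (0.0000% and 7.9112%), although the page does not say that this is how it is computed.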