NC Bench
Run: 2026-02-16-01KHM5T89TH1TP6AJ6AC6BRYWV
Run Details
Model: DeepSeek V3.1
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 15.0s
Cost: $0.001396
Provider: Fireworks
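The Duration and Cost figures in the run details appear to be rounded forms of the generation_time and usage values recorded in this run's metadata. The sketch below shows that arithmetic and derives a rough throughput figure. It assumes generation_time is in milliseconds and that usage is in US dollars; neither unit is stated on the page, although both are consistent with the displayed 15.0s and $0.001396. The variable names are illustrative only.

```python
# Values copied from this run's metadata.
generation_time_ms = 14980        # assumed to be milliseconds
native_tokens_completion = 808    # completion tokens as counted by the provider
usage_usd = 0.00139552            # assumed to be US dollars

# Convert to seconds and derive a rough completion throughput.
duration_s = generation_time_ms / 1000
tokens_per_second = native_tokens_completion / duration_s

print(f"Duration: {duration_s:.1f}s")                    # Duration: 15.0s
print(f"Cost: ${usage_usd:.6f}")                         # Cost: $0.001396
print(f"Throughput: {tokens_per_second:.1f} tokens/s")   # Throughput: 53.9 tokens/s
```

The throughput figure excludes the 386 ms latency reported separately, so it describes generation speed only, not end-to-end response time.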
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The air in the rented van was thick with the smell of stale coffee and poor life choices. Molly tapped a well-manicured nail against a grainy photograph of a suburban McMansion, her expression one of profound disappointment. "A child's birthday party, Bill? Our last job was the Palme d'Or Diamond. This is a step down. A big, saccharine, piñata-shaped step down." Bill, a man whose wardrobe consisted entirely of tracksuits in various shades of grey, didn't look up from his task. He was meticulously wrapping a large, ornate box in glittery pink paper, his tongue pinched between his teeth in concentration. "It's not a step down, Mol. It's a lateral move. Less lasers, more juice boxes. The principle is the same." "The principle," Molly deadpanned, "is that we're about to steal a seven-year-old's Frozen-themed karaoke machine. My therapist is going to have a field day with this." "Ah, but that's where you're wrong!" Bill said, finally securing a truly horrifying bow to the package. He held it up with pride. "Little Tiffany Abernathy isn't just any seven-year-old. Her father is 'Diamond' Dan Abernathy. Retired. Runs a legitimate security firm now. All his old crew will be there, Mol. The muscle, the safecrackers, the drivers. They're all dads now. They're soft. And their presents? Untraceable, high-end cash equivalents. We’re not stealing toys. We're robbing a retirement fund with clowns." Molly snatched the photo, peering closer. She saw a burly man in a polo shirt manning a barbecue, a spatula in one hand. "Is that 'Icepick' Ike? He used to flood entire city blocks to bypass pressure sensors." "And now he's arguing about the proper char on a hot dog," Bill said, nodding sagely. "See? They're distracted. Their guard is down. The only thing they'll be focused on is making sure little Brayden doesn't have an allergic reaction to the cake frosting." "So, what's the play?" Molly asked, a flicker of her old professional curiosity cutting through her disdain. 
Bill produced a second, smaller box. "We're the entertainment. Specifically, we're 'Sparklepony' and her magical assistant, Glimmer." Molly stared at him. The silence in the van was absolute, broken only by the distant sound of a ice cream truck playing a sad, out-of-tune version of "Pop Goes the Weasel." "You want me to be a sparklepony." "Not just any sparklepony," Bill clarified, as if this made it better. "The *premiere* sparklepony in the tri-county area. I called in a favour, cancelled the real one. They're expecting us. We arrive, we perform a truly bewildering magic act involving scarves and a surprisingly compliant rabbit, and we load the gift table into our 'prop' van while they're all singing 'Happy Birthday'. It's elegant in its simplicity." Molly picked up the rainbow wig from the passenger seat. It smelled faintly of synthetic fiber and regret. "The things I do for a score," she muttered, pulling it on. It was itchy. "Think of the haul, Mol," Bill said, adjusting his own sequined vest. "And besides, when was the last time you got to see the pure, unadulterated joy on a child's face?" "When I gave one back his wallet in Paris," Molly said flatly. "He was so happy he cried. Now, hand me the glitter. If I'm going to be a mythical beast, I'm going to be the shiniest, most morally bankrupt mythical beast this suburb has ever seen."
created_at: "2026-02-16T21:31:15.368875+00:00" (2/16/2026, 9:31:15 PM)
model: "deepseek/deepseek-chat-v3.1"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 386
moderation_latency: (null)
generation_time: 14980
tokens_prompt: 95
tokens_completion: 861
native_tokens_prompt: 68
native_tokens_completion: 808
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00139552
router: (null)
provider_responses
0
id: "97ec135a-b88c-4212-9851-11a5ddc81f80"
status: 200
is_byok: false
latency: 386
endpoint_id: "0c00f0d2-8aad-4700-bd36-11cb9a717fde"
provider_name: "Fireworks"
model_permaslug: "deepseek/deepseek-chat-v3.1"
api_type: "completions"
id: "gen-1771277459-TkuZ9WSdUlDg44sAb0T6"
upstream_id: "97ec135a-b88c-4212-9851-11a5ddc81f80"
total_cost: 0.00139552
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Fireworks"
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 570
0.0531% | Dialogue to Total Word Ratio | Ratio: 62.28%, Deviation: 12.28% | neededClean: false; wordsTotal: 570; wordsDialogue: 355
0.0265%
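The reported ratio and deviation follow directly from the recorded counts: 355 dialogue words out of 570 total, measured against the 50% target from the scenario. The sketch below reproduces that arithmetic and adds a naive dialogue counter for illustration. The counter is an assumption, not the evaluator's actual method: it treats any text between straight double quotes as dialogue and splits words on whitespace, so it may not reproduce the 355 figure on the response above. The function names are hypothetical, and the mapping from deviation to the Result percentage is not stated on the page, so it is not modeled here.

```python
import re


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_ratio: float = 0.5) -> tuple[float, float]:
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target_ratio)


def naive_dialogue_counts(text: str) -> tuple[int, int]:
    """Count (dialogue words, total words) with a simple heuristic.

    Assumption: dialogue is whatever sits between straight double quotes,
    and a word is any whitespace-separated token.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    words_total = len(text.split())
    return words_dialogue, words_total


# Recorded counts from the evaluation meta data.
ratio, deviation = ratio_and_deviation(355, 570)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints: Ratio: 62.28%, Deviation: 12.28%

# The heuristic counter on a tiny sample: 4 of the 6 words are quoted.
sample = '"Hello there," she said. "Go now."'
print(naive_dialogue_counts(sample))  # (4, 6)
```

A heuristic like this also counts quoted titles in narration as dialogue, which is one reason a real evaluator's count may differ.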