NC Bench
Run: 2026-04-25-01KQ2FHYYWE363W132S7X92ZKP
Run Details
Model: DeepSeek V4 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 1m 4s
Cost: $0.000252
Provider: DeepInfra
Evaluation summary
Score     Evaluator
0.00%     Matches word count
0.00%     Dialogue to Total Word Ratio
0.0000%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The basement smelled like old cheese and cheaper decisions. Bill was spread across a sagging couch, eating cheese puffs one at a time, inspecting each as if it might hold the secret to a better life. Molly paced beside a whiteboard covered in crayon drawings. “Okay, run it by me again,” she said, tapping the board. “But this time, pretend you’re not a man who once tried to pickpocket a mime.” Bill swallowed. “It’s beautiful. Deceptively simple.” He gestured with a cheese-dusted finger. “We are both women.” “We are. That’s not a plan, that’s a biological fact.” “No, no. We are mothers. Specifically, I am Brenda, mother of Timmy. You are Karen, mother of Bradley. We walk in during the chaos of the pin-the-tail game. The birthday boy is distracted, the parents are on their phones. We mingle. We drink punch. We take the gift table.” Molly stopped pacing. “You want us to walk into a party full of seven-year-olds and walk out with a dozen My Little Ponies and a suspicious number of board games.” “And the cake.” “We don’t have to take the cake.” “It’s right there, Molly. Untended. Buttercream.” He held up three puffs in a row, like a tiny surrender. “I’m not saying we eat it on the job. But we secure the asset.” Molly rubbed her temples. “The asset. A Dora the Explorer sheet cake.” “Financed by the sweat of middle-class brows. That’s cash flow, baby.” “What about the piñata? You want that too? The party favors? The helium from the balloons?” Bill sat up, eyes narrowing. “Now you’re thinking like a professional. But no—balloon helium is too bulky. We stick to the table. In and out before they sing. That’s the sweet spot. The sugar high hasn’t hit yet, so the kids are still manageable. The adults are already a little drunk. It’s a target-rich environment.” Molly picked up a pink crayon and drew a circle around the gift table on the board. “So our entry is as Brenda and Karen. Exit? Same.” “We each carry a large gift bag. 
Disguised as gifts ourselves, we leave full of other gifts. It’s a gift-ception.” “You are unbearable.” “I am brilliant.” She turned and faced him. “And what happens when Brenda is asked about Timmy’s favorite food? Or Timmy’s name?” Bill smiled. “Timmy is allergic to gluten, dairy, and joy. I’ve memorized his allergen list from a Facebook post I found on the mother’s timeline. She’s very loud about it. And his name is Timmy. Because it says ‘Timmy’ on the party banner. We’ll be fine.” Molly stared at him for a long moment. “You know what? Fine. But if you try to eat that cake before we’re in the van, I will leave you at the party. And you can explain to Brenda why you’re wearing her face.” “Deal.” Bill licked cheese dust off his thumb. “One more thing. We need to bring a present of our own. Must look convincing.” “What do you suggest?” “A gift card. The perfect crime. No value, but it looks like effort.” Molly sighed and uncapped the black marker. “Write down the timeline. And for God’s sake, put down the cheese puffs. We have a party to ruin.”
created_at: "2026-04-25T14:09:04.992Z" (4/25/2026, 2:09:04 PM)
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 9779
moderation_latency: (null)
generation_time: 63880
tokens_prompt: 77
tokens_completion: 943
native_tokens_prompt: 67
native_tokens_completion: 865
native_tokens_completion_images: (null)
native_tokens_reasoning: 91
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00025158
router: (null)
provider_responses:
  0:
    endpoint_id: "934a69f9-bd54-474b-beca-24560f721e12"
    id: "a156cfe27b474328a9bff347611d4431"
    is_byok: false
    latency: 35
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepInfra"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777126144-GDg27ReOwvfkNxHQFp5g"
session_id: (null)
api_type: "completions"
id: "gen-1777126144-wlb4XbMnl5Mkggn7HJGF"
upstream_id: "a156cfe27b474328a9bff347611d4431"
total_cost: 0.00025158
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
response_cache_source_id: (null)
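The headline figures under Run Details appear to follow from the metadata above. The sketch below reproduces them under two assumptions that the page does not state: that generation_time is in milliseconds, and that Duration and Cost are rounded forms of generation_time and total_cost. The function names are illustrative and not part of NC Bench.

```python
# Values copied from the run metadata above.
GENERATION_TIME_MS = 63880        # assumed unit: milliseconds
NATIVE_TOKENS_COMPLETION = 865
TOTAL_COST_USD = 0.00025158


def format_duration(ms: int) -> str:
    """Render milliseconds as 'Xm Ys', rounding to the nearest second."""
    minutes, seconds = divmod(round(ms / 1000), 60)
    return f"{minutes}m {seconds}s"


def tokens_per_second(tokens: int, ms: int) -> float:
    """Average completion throughput over the whole generation window."""
    return tokens / (ms / 1000)


print(format_duration(GENERATION_TIME_MS))  # 1m 4s
print(f"${TOTAL_COST_USD:.6f}")             # $0.000252
rate = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
print(f"{rate:.1f} tokens/s")
```

The throughput line is a derived figure rather than one reported by the page, and it would change if reasoning tokens are counted separately from native_tokens_completion.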
Evaluation details
Result    Evaluator                      Details                            Meta Data
0.00%     Matches word count             n/a                                neededClean: false; words: 540
0.00%     Dialogue to Total Word Ratio   Ratio: 76.46%, Deviation: 46.46%   neededClean: false; wordsTotal: 548; wordsDialogue: 419
0.0000%   Overall
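The reported ratio and deviation can be checked from the evaluator's own counts. The sketch below is a minimal reconstruction, not the evaluator's implementation: it assumes the deviation is the absolute difference from the 30% target in the scenario, and the quote-based word counter is a hypothetical stand-in for however the benchmark actually identifies dialogue. How the deviation maps to the 0.00% score is not shown on the page and is not modelled here.

```python
import re

# Hypothetical dialogue detector: text between curly or straight double quotes.
QUOTED = re.compile(r'[“"]([^”"]*)[”"]')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words that fall inside quoted spans."""
    return sum(count_words(span) for span in QUOTED.findall(text))


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Dialogue share of all words, as a percentage."""
    return 100.0 * words_dialogue / words_total


def deviation(ratio_pct: float, target_pct: float) -> float:
    """Assumed to be the absolute distance from the target percentage."""
    return abs(ratio_pct - target_pct)


# Counts reported in the evaluation metadata; the target comes from the scenario.
ratio = dialogue_ratio(419, 548)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation(ratio, 30.0):.2f}%")
# Ratio: 76.46%, Deviation: 46.46%

sample = "“Okay, run it by me again,” she said, tapping the board."
print(count_dialogue_words(sample), count_words(sample))
```

With the reported counts, 419 of 548 words is 76.46%, which is 46.46 points above the 30% target and matches the Details column.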