NC Bench
Run: 2026-04-25-01KQ2JE5QQYYQNC10MK9CYZVDE
Run Details
Model: DeepSeek V4 Flash
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 10.5s
Cost: $0.000206
Provider: DeepSeek
Evaluation summary
Score | Evaluator
1.04% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.5180% | Overall
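The Overall figure looks like an unweighted mean of the two evaluator scores. That aggregation rule is an assumption and is not stated on this page; the sketch below only checks that the displayed numbers are consistent with it. The function name `overall_score` is hypothetical.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores, in percent.

    Assumption: the benchmark averages its evaluators equally.
    The page does not document the actual aggregation rule.
    """
    return sum(scores) / len(scores)


# Scores as displayed in the summary (already rounded to two decimals).
overall = overall_score([1.04, 0.00])
print(f"{overall:.2f}%")  # prints 0.52%
```

The reported 0.5180% differs slightly from 0.52%, which would be expected if the word count score is stored unrounded (about 1.036%) and only rounded for display.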
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The list lay between them on the greasy diner table, a single sheet of paper covered in Molly’s precise, tiny handwriting. “Alright, run it again,” she said, tapping the page with a forkful of pancake. “Entry vector.” Bill sighed, rubbing his temples. “Through the bouncy castle. I know. It’s stupid.” “It’s inspired,” Molly corrected. “The rental guy pumps it up at eight, then goes for a smoke. We crawl through the deflated fold and pop out behind the cupcake table. They never see us until we’re inside the tent.” “Inside the tent is where the four-year-olds are, Molly.” “Irrelevant. They’re short. We’re tall. We’ve got the height advantage for grabbing gift bags off the trellis.” Bill licked his thumb and flipped a page of the party planner he’d lifted from the mother’s minivan. “The ‘Star Wars’ theme means a lot of cardboard droids. Trip hazards.” “You’re worried about tripping on a cardboard R2-D2?” She took a bite of pancake, chewing thoughtfully. “Okay. Fine. We broom it.” “Broom it?” “I bring a push broom. We sweep a path. It’s called problem-solving, Bill. Now, the moment of truth.” She leaned in, voice dropping to a whisper. “After the piñata, the parents take photos for twenty minutes. That’s our window. You snag the Amazon boxes from under the photo-booth table. I hit the pile by the front door.” “And if a kid sees me with the boxes?” “You kneel. Look them in the eye. Say, ‘I’m Santa’s intern. This is a surprise.’ They’ll believe you. They still believe the dog is a superhero.” Bill stared at her. “You’ve thought about this a lot.” “I’ve been to three children’s parties this month. You pick up patterns. The screaming is a constant, but the screaming shifts in tone. Right before the cake, it’s a low, hungry whine. That’s when they’re weakest.” She grinned. “We strike at cake time.” “What about the magician?” “He’s allergic to cats. There’s a PTA mom who brings a cat stroller. 
He’ll be sneezing into his top hat by the time the candles are lit.” She wiped her fingers on a napkin and crumpled the list. “So. You in or out?” Bill looked at the list in her hand, then out the window at the bright blue bounce castle already inflating in the park across the street. “One last question. What do we do with the presents?” Molly slid the napkin across the table. On it, she’d drawn a second bouncy castle. “We launch them back from where we came. The rental guy leaves the air pump on. We tie the bags to the tethers, let them fly, and catch them behind the hedge.” “That’s insane.” “No. That’s a three-point plan.” She stood, pocketed the napkin, and grabbed his arm. “Now come on. The magician’s car just pulled up. And his nose is already twitching.”
created_at: "2026-04-25T14:59:26.587Z" (4/25/2026, 2:59:26 PM)
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 548
moderation_latency: (null)
generation_time: 9931
tokens_prompt: 77
tokens_completion: 767
native_tokens_prompt: 67
native_tokens_completion: 702
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00020594
router: (null)
provider_responses: 0
endpoint_id: "722e0746-2034-4d6e-a08f-e6271849ee9e"
id: "49be477e-688f-46c2-b20b-93ab5c60151e"
is_byok: false
latency: 286
model_permaslug: "deepseek/deepseek-v4-flash-20260423"
provider_name: "DeepSeek"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777129166-RrXsnW5uyzjHp1tAe4Mr"
session_id: (null)
api_type: "completions"
id: "gen-1777129166-J9WUM1UqjHCj4hVFFTJu"
upstream_id: "49be477e-688f-46c2-b20b-93ab5c60151e"
total_cost: 0.00020594
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
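The timing fields above can be cross-checked against the reported Duration of 10.5s. The sketch assumes that `latency` (the first value, 548) and `generation_time` are both in milliseconds and that Duration is their sum; neither assumption is confirmed by the page. The throughput figure is derived here for illustration and is not reported by the benchmark.

```python
# Values copied from the metadata above.
latency_ms = 548            # assumed: time to first token, in milliseconds
generation_time_ms = 9931   # assumed: generation time, in milliseconds
native_tokens_completion = 702

# Assumed relationship: Duration = latency + generation_time.
duration_s = (latency_ms + generation_time_ms) / 1000
print(f"duration: {duration_s:.1f}s")  # prints duration: 10.5s

# Derived throughput in native completion tokens per second of generation.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```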
Evaluation details
Result | Evaluator | Details | Meta Data
1.04% | Matches word count | n/a | neededClean: false; words: 474
0.00% | Dialogue to Total Word Ratio | Ratio: 70.98%, Deviation: 40.98% | neededClean: false; wordsTotal: 479; wordsDialogue: 340
0.5180% | Overall
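The Ratio and Deviation figures follow directly from the reported word counts and the 30% target in the scenario. The sketch below reproduces that arithmetic and adds a rough dialogue counter. The counter assumes dialogue is whatever sits between curly double quotes and that words are whitespace-separated; the evaluator's actual tokenization is not shown on this page, and the names `ratio_and_deviation` and `count_dialogue_words` are hypothetical.

```python
import re


def ratio_and_deviation(words_dialogue, words_total, target_pct=30.0):
    """Return (dialogue ratio, absolute deviation from target), both in percent."""
    ratio_pct = 100.0 * words_dialogue / words_total
    return ratio_pct, abs(ratio_pct - target_pct)


def count_dialogue_words(text):
    """Return (dialogue words, total words) for a passage.

    Assumption: dialogue is text enclosed in curly double quotes, and
    words are whitespace-separated tokens. The benchmark may count differently.
    """
    quoted = re.findall(r"“([^”]*)”", text)
    words_dialogue = sum(len(span.split()) for span in quoted)
    return words_dialogue, len(text.split())


# Counts reported in the evaluation details: wordsDialogue and wordsTotal.
ratio, deviation = ratio_and_deviation(340, 479)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints Ratio: 70.98%, Deviation: 40.98%
```

For example, `count_dialogue_words("“Run it again,” she said. “Now.”")` gives 4 dialogue words out of 6 under these assumptions.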