NC Bench
Run: 2026-02-16-01KHM5J5C7RRYEE4QEMHD3ND95
Run Details
Model: DeepSeek V3.1
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 35.4s
Cost: $0.000267
Provider: DeepInfra
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly traced a finger over the crayon-smeared invitation. “The target is a five-year-old named Timmy. His weakness is cake.” Bill frowned at the schematic she’d drawn on a napkin. “The perimeter is a nightmare. Balloon arches, a bouncy castle… it’s a non-lethal fortress. Our entry point is the side gate during the piñata chaos.” He shuddered. “I hate clowns.” “The clown is our inside man,” Molly said, tapping a photo of a despondent-looking man in a rainbow wig. “He’s bitter about his career. He’ll create a diversion involving six doves and a malfunctioning handkerchief.” Bill nodded, studying the exfiltration route. “We load the loot into the pedal-powered ice cream truck. It’s inconspicuous.” He paused. “What about the noise? Children are unpredictable.” “We’re not silencing a vault, Bill. We’re grabbing gift bags. The real challenge is the wrapping paper. It’s crinkly.” She held up a tiny, glittery sweater. “Our disguises. We blend in as eccentric relatives.” Bill eyed the sequined monstrosity. “This is a downgrade from the Frankfurt job.” “The Frankfurt diamonds didn’t have a gift receipt. This is clean work. In and out before the magician pulls a rabbit from his hat.” “I still hate clowns,” Bill muttered, pulling on the sweater. It was very, very tight.
created_at
"2026-02-16T21:27:10.687021+00:00"
2/16/2026, 9:27:10 PM
model
"deepseek/deepseek-chat-v3.1"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1359
moderation_latency
(null)
generation_time
35308
tokens_prompt
95
tokens_completion
358
native_tokens_prompt
67
native_tokens_completion
321
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
5
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.000267260000001
router
(null)
provider_responses
0
id
"cmpl-70f1333a99bb4027a1dd0e2aff43b94a"
status
200
is_byok
false
latency
58
endpoint_id
"012f7d1f-23d4-4d03-9aa3-432d8ad82757"
provider_name
"DeepInfra"
model_permaslug
"deepseek/deepseek-chat-v3.1"
api_type
"completions"
id
"gen-1771277194-pCbyADWYEb5mUG8jSo0U"
upstream_id
"cmpl-70f1333a99bb4027a1dd0e2aff43b94a"
total_cost
0.000267260000001
cache_discount
3.99999999e-7
upstream_inference_cost
0
provider_name
"DeepInfra"
Evaluation details
Result: 75.1556%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false; words: 213

Result: 0.0000%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 69.01%, Deviation: 59.01%
Meta Data: neededClean: false; wordsTotal: 213; wordsDialogue: 147
37.5778%
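The Ratio and Deviation figures reported by the Dialogue to Total Word Ratio evaluator follow from the two counts in its metadata: 147 dialogue words out of 213 total, against the 10% target from the scenario. The sketch below shows one way such a check could be computed. It is not NC Bench's actual evaluator: the function names, the whitespace-based word splitting, and the rule that dialogue is whatever sits between curly quotes are all assumptions made for illustration, so counts on a real response may differ from the benchmark's.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a prose scene.

    Assumption: dialogue is any text enclosed in curly double quotes,
    and a "word" is any whitespace-separated token.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r"“([^”]*)”", text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio: float, target: float = 0.10) -> float:
    """Absolute distance between the observed ratio and the target ratio."""
    return abs(ratio - target)


# Recompute the reported figures from the counts in the evaluator metadata.
reported_ratio = 147 / 213
print(f"Ratio: {reported_ratio * 100:.2f}%")
print(f"Deviation: {deviation_from_target(reported_ratio) * 100:.2f}%")

# Hypothetical short sample to show the counting rule on raw text.
sample = "Molly nodded. “I hate clowns,” Bill muttered."
print(dialogue_ratio(sample))
```

Taking the deviation as an absolute difference in percentage points is consistent with the numbers in this run (69.01% minus the 10% target gives 59.01%). How the evaluator turns that deviation into a 0.0000% result is not stated in the report and is not modeled here.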