NC Bench
Run: 2026-04-25-01KQ2G34SFA70W6Z4WJX0YJ8ZQ
Run Details
Model: DeepSeek V4 Pro (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 15m 45s
Cost: $0.111468
Provider: Together
Evaluation summary
Score      Evaluator
100.00%    Matches word count
83.49%     Dialogue to Total Word Ratio
91.7457%   Overall
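The overall figure looks consistent with an unweighted mean of the two evaluator scores. The page does not state how the overall is computed, so the averaging rule below is an assumption, and the function name `overall_score` is hypothetical. A minimal sketch:

```python
# Assumption: the overall score is the plain mean of the evaluator scores.
# The page does not confirm this; it only happens to match the reported value.

def overall_score(scores):
    """Return the unweighted mean of a list of percentage scores."""
    return sum(scores) / len(scores)

# Scores as displayed in the evaluation summary (rounded to two decimals).
summary = {
    "Matches word count": 100.00,
    "Dialogue to Total Word Ratio": 83.49,
}

overall = overall_score(list(summary.values()))
print(f"{overall:.3f}%")
```

The result lands within rounding distance of the reported 91.7457%. The small gap is expected if the page averages unrounded scores while displaying 83.49% to two decimals.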
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread out the map. "Okay, Bill. The party is at noon. All presents on one table." "And we just grab them? With thirty kids watching?" Bill said. "We go as entertainers. I'll be a clown, you be a magician." "I can't do magic. And clown makeup makes me break out." "Then be a pirate. I'll be a fairy. We hand out candy, then grab the goods." "Pirate-fairy heist? That's new. But what about the parents? They'll notice." "We need a diversion. You start a treasure hunt. Kids love maps." "And you? You crawl to the table?" "No, I'll dance. A fairy dance. While they watch, you slip away." "Slip away with a giant bag of presents? Subtle as a sledgehammer." "We'll use a magic trick. A disappearing present trick. They'll applaud." "Until they see us running down the street. This is your worst plan yet." "Backup: hire a clown. He distracts, we steal." "With what money? The last clown stole our masks." "Fine. We walk in, say we're from charity, donate gifts." "Ridiculous. But what's the catch?" "No catch. We are the charity." "Let's try it. If we fail, ice cream's on you." "Deal! Let's steal some birthday joy."
created_at: "2026-04-25T14:18:28.019Z" (4/25/2026, 2:18:28 PM)
model: "deepseek/deepseek-v4-pro-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 391
moderation_latency: (null)
generation_time: 944865
tokens_prompt: 77
tokens_completion: 19641
native_tokens_prompt: 146
native_tokens_completion: 25264
native_tokens_completion_images: (null)
native_tokens_reasoning: 19347
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.1114682
router: (null)
provider_responses:
  0:
    endpoint_id: "d084170b-dd99-4e53-9ae1-b63881f3cfbf"
    is_byok: false
    latency: 264
    model_permaslug: "deepseek/deepseek-v4-pro-20260423"
    provider_name: "DeepSeek"
    status: 429
  1:
    endpoint_id: "13bda129-f8a6-4a60-b1db-98082e5f7ccd"
    id: "ogZbtzv-6z2FuN-9f1dfa26d04aad84"
    is_byok: false
    latency: 391
    model_permaslug: "deepseek/deepseek-v4-pro-20260423"
    provider_name: "Together"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777126708-jN9s8xqmT6xYAG9QWVGj"
session_id: (null)
api_type: "completions"
id: "gen-1777126708-STufJ5fTtZCo7up8tTBN"
upstream_id: "ogZbtzv-6z2FuN-9f1dfa26d04aad84"
total_cost: 0.1114682
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Together"
response_cache_source_id: (null)
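Two of the metadata fields can be cross-checked against the run details. The sketch below assumes `generation_time` is in milliseconds (the page does not give units, but that reading reproduces the 15m 45s duration) and that `native_tokens_reasoning` is counted inside `native_tokens_completion`. The helper names are hypothetical.

```python
# Assumptions (not stated on the page):
#   - generation_time is expressed in milliseconds
#   - reasoning tokens are a subset of native completion tokens

def format_duration(milliseconds):
    """Convert milliseconds to an 'Xm Ys' string, rounded to whole seconds."""
    total_seconds = round(milliseconds / 1000)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"

def reasoning_share(reasoning_tokens, completion_tokens):
    """Fraction of completion tokens that were spent on reasoning."""
    return reasoning_tokens / completion_tokens

generation_time = 944865            # from the metadata above
native_tokens_reasoning = 19347
native_tokens_completion = 25264

print(format_duration(generation_time))  # 15m 45s
print(f"{reasoning_share(native_tokens_reasoning, native_tokens_completion):.1%}")  # 76.6%
```

Under these assumptions, roughly three quarters of the native completion tokens went to reasoning rather than to the 200-word scene itself.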
Evaluation details
Result     Evaluator                      Details                            Meta Data
100.00%    Matches word count             n/a                                neededClean: false, words: 200
83.49%     Dialogue to Total Word Ratio   Ratio: 96.52%, Deviation: 6.52%    neededClean: false, wordsTotal: 201, wordsDialogue: 194
91.7457%   Overall
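The reported ratio and deviation follow from the word counts in the meta data and the 90% target in the scenario. The sketch below assumes the deviation is the absolute difference between the measured ratio and the target, in percentage points, which matches the reported figures. It does not attempt to reproduce the 83.49% score, because the page does not show how deviation maps to a score. The function names are hypothetical.

```python
# Assumption: deviation = |measured ratio - target ratio|, in percentage points.
# The mapping from deviation to the 83.49% score is not shown on the page,
# so it is deliberately left out.

TARGET_RATIO = 0.90  # "90% dialogue" from the scenario

def dialogue_ratio(words_dialogue, words_total):
    """Share of all words that appear inside dialogue."""
    return words_dialogue / words_total

def deviation_from_target(ratio, target=TARGET_RATIO):
    """Absolute distance between the measured ratio and the target."""
    return abs(ratio - target)

ratio = dialogue_ratio(words_dialogue=194, words_total=201)
deviation = deviation_from_target(ratio)

print(f"Ratio: {ratio * 100:.2f}%")          # Ratio: 96.52%
print(f"Deviation: {deviation * 100:.2f}%")  # Deviation: 6.52%
```

Note that this evaluator counts 201 total words while the word count evaluator reports 200, so the two evaluators presumably tokenize the response slightly differently.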