NC Bench
Run: 2025-02-24-01JMX2EJZQ6TEH2TT8DJCDSZNZ
Run Details
Model: Claude 3.7 Sonnet
Test: N-Length Sentences
Scenario: Write sentences with 20 words each
Duration: 4.2s
Cost: $0.003012
Provider: Anthropic
Evaluation summary
Score: 52.81%
Evaluator: Matches word count
Model input
User/Human
Please write the opening scene of a murder mystery where each sentence has exactly 20 words. Please make sure that each sentence has the same length. Only return the opening itself.
Model response
AI
The old mansion stood silent in the rain, its windows like dead eyes staring out over the expansive grounds. Detective Sarah Morgan pulled her coat tighter against the chill as she approached the imposing stone structure deliberately. Inside, the body of Charles Blackwood lay sprawled across his expensive Persian rug, a single bullet hole between his eyes. Blood had pooled beneath his head, soaking into the intricate patterns and forever staining the legacy of his wealth. Sarah knelt beside the corpse, her trained eyes scanning for details others might miss among the disturbed surroundings. The victim's hand still clutched a torn photograph, his fingers rigid in death's uncompromising grip around the mysterious clue. Across the room, a grandfather clock ticked loudly, marking time in a house where someone's had abruptly, violently ended.
Meta
id: "gen-1740437409-Bp39vSIltKAAgenP9w6n"
upstream_id: "msg_01MZhs4yAMJzohj5EJ7V1yrH"
total_cost: 0.003012
cache_discount: (null)
provider_name: "Anthropic"
created_at: "2025-02-24T22:50:13.511482+00:00" (2/24/2025, 10:50:13 PM)
model: "anthropic/claude-3-7-sonnet-20250219:beta"
app_id: 182717
streamed: true
cancelled: false
latency: 387
moderation_latency: (null)
generation_time: 3436
tokens_prompt: 42
tokens_completion: 160
native_tokens_prompt: 44
native_tokens_completion: 192
native_tokens_reasoning: 0
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.003012
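The reported cost can be checked against the native token counts. The sketch below is an illustration only: the per-token prices (3.00 USD per million input tokens, 15.00 USD per million output tokens) are an assumption and are not stated on this page, and `estimate_cost` is a hypothetical helper. Under that assumption, 44 prompt tokens and 192 completion tokens reproduce the reported total.

```python
# Assumed list prices in USD per million tokens (not stated on this page).
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost of one request under the assumed prices."""
    total_micro_usd = (prompt_tokens * INPUT_USD_PER_MTOK
                       + completion_tokens * OUTPUT_USD_PER_MTOK)
    return total_micro_usd / 1_000_000


# native_tokens_prompt = 44, native_tokens_completion = 192 (from the metadata)
cost = estimate_cost(44, 192)
print(f"{cost:.6f}")  # prints 0.003012
```

The native counts (44 and 192) are used here rather than the normalized `tokens_prompt` and `tokens_completion` values (42 and 160), since only the native counts reproduce the reported figure under these assumed prices.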
Evaluation details
Result: 52.81%
Evaluator: Matches word count
Details: n/a
Meta Data:
neededClean: false
sentences: 7
wordCounts:
0: 19
1: 18
2: 20
3: 19
4: 18
5: 19
6: 19
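The per-sentence word counts above can be reproduced with a simple counter. This is a minimal sketch, not the evaluator's actual implementation, which this page does not show. It assumes that a sentence ends at `.`, `!`, or `?` followed by whitespace and that a word is any whitespace-separated token; `sentence_word_counts` is a hypothetical name. How the 52.81% score is derived from these counts is not stated here, so the sketch stops at the counts.

```python
import re

# The model response from this run, copied verbatim.
RESPONSE = (
    "The old mansion stood silent in the rain, its windows like dead eyes "
    "staring out over the expansive grounds. Detective Sarah Morgan pulled "
    "her coat tighter against the chill as she approached the imposing stone "
    "structure deliberately. Inside, the body of Charles Blackwood lay "
    "sprawled across his expensive Persian rug, a single bullet hole between "
    "his eyes. Blood had pooled beneath his head, soaking into the intricate "
    "patterns and forever staining the legacy of his wealth. Sarah knelt "
    "beside the corpse, her trained eyes scanning for details others might "
    "miss among the disturbed surroundings. The victim's hand still clutched "
    "a torn photograph, his fingers rigid in death's uncompromising grip "
    "around the mysterious clue. Across the room, a grandfather clock ticked "
    "loudly, marking time in a house where someone's had abruptly, violently "
    "ended."
)


def sentence_word_counts(text: str) -> list[int]:
    """Split text into sentences and count whitespace-separated words in each."""
    # Assumption: sentences end at '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]


counts = sentence_word_counts(RESPONSE)
print(counts)  # prints [19, 18, 20, 19, 18, 19, 19]
print(sum(1 for c in counts if c == 20), "of", len(counts), "sentences hit 20 words")
```

Only sentence 2 reaches the target of 20 words; the other six fall one or two words short.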