Run: 2025-02-24-01JMX20AWE7TQ4AHXGPDB8SNNT

Run Details

Model: Claude 3.7 Sonnet
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 6.3s
Cost: $0.004770
Provider: Anthropic

Evaluation summary

Score	Evaluator
68.10%	Matches word count
99.69%	Dialogue to Total Word Ratio
83.8948%	Overall
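
The Overall figure looks consistent with an unweighted mean of the two evaluator scores. The sketch below checks that reading. The averaging rule is an assumption, since the report does not say how Overall is aggregated; the small gap from 83.8948% would be explained by the evaluator scores being rounded for display.

```python
# Assumption: "Overall" is the plain mean of the evaluator scores.
# The report itself does not document the aggregation rule.
scores = {
    "Matches word count": 68.10,
    "Dialogue to Total Word Ratio": 99.69,
}


def overall_score(evaluator_scores):
    """Return the unweighted mean of the evaluator scores, in percent."""
    return sum(evaluator_scores.values()) / len(evaluator_scores)


overall = overall_score(scores)
print(f"{overall:.3f}%")  # close to the reported 83.8948%
```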

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

"Molly, I can't believe we're doing this," Bill whispered, adjusting his fake mustache in the side mirror of their van.

"What? Having second thoughts about stealing from children?" Molly snorted while loading party poppers into her jacket pockets.

"Not second thoughts. Third, fourth, and fifth thoughts, actually."

"Look, we've hit banks, museums, and that weird guy's ferret collection. This is just another job."

"But they're kids, Mol. Little humans with their little faces and little expectations of birthday joy."

"Little humans with expensive gaming consoles and designer clothes as gifts. Have you seen the present table through the window? It's like a mini Best Buy in there."

"I feel dirty. And not the good kind of dirty."

"You can feel dirty while counting cash later."

"What's our cover again?"

"We're the backup clowns. The real one called in sick."

"I hate clowns."

"Everyone hates clowns. That's why no one will approach us."

"Smart."

"Always am. Ready?"

"No."

"Perfect. On three, we enter, entertain for exactly seven minutes, then hit the gift table during the cake cutting."

"This is a new low."

"New lows pay surprisingly well."

Evaluation details

Result	Evaluator	Details	Meta Data
68.10%	Matches word count	n/a	neededClean: false; words: 186
99.69%	Dialogue to Total Word Ratio	Ratio: 87.63%, Deviation: 2.37%	neededClean: false; wordsTotal: 186; wordsDialogue: 163
83.8948%	Overall
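
The Ratio and Deviation values in the details follow directly from the reported word counts and the 90% target in the scenario. Below is a minimal sketch of how such a check could be done. The helper names and the quote-matching rule (words between straight double quotes count as dialogue, words are whitespace-separated) are assumptions for illustration; the evaluator's actual tokenization is not documented here, and neither is the mapping from a deviation to a score such as 99.69%.

```python
import re

TARGET_RATIO_PCT = 90.0  # from the scenario: "90% dialogue"


def count_words(text):
    """Count whitespace-separated words (assumed tokenization)."""
    return len(text.split())


def dialogue_stats(scene):
    """Return (words inside straight double quotes, total words)."""
    quoted_spans = re.findall(r'"([^"]*)"', scene)
    words_dialogue = sum(count_words(span) for span in quoted_spans)
    return words_dialogue, count_words(scene)


def ratio_and_deviation(words_dialogue, words_total, target_pct=TARGET_RATIO_PCT):
    """Return the dialogue ratio and its absolute distance from the target, in percent."""
    ratio_pct = 100.0 * words_dialogue / words_total
    return ratio_pct, abs(target_pct - ratio_pct)


# Counts taken from the Meta Data column: wordsDialogue 163, wordsTotal 186.
ratio, deviation = ratio_and_deviation(163, 186)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 87.63%, Deviation: 2.37%"
```

Running `dialogue_stats` on the model response gives counts under this sketch's own rules, which may differ slightly from the evaluator's if it treats punctuation or contractions differently.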

id	"gen-1740436942-gxkBJyxfpqNBAFXWbdn2"
upstream_id	"msg_01J81vBpqrpC17EnacquLrST"
total_cost	0.00477
cache_discount	(null)
provider_name	"Anthropic"
created_at	"2025-02-24T22:42:28.500629+00:00"
model	"anthropic/claude-3-7-sonnet-20250219:beta"
app_id	182717
streamed	true
cancelled	false
latency	321
moderation_latency	(null)
generation_time	5439
tokens_prompt	68
tokens_completion	254
native_tokens_prompt	75
native_tokens_completion	303
native_tokens_reasoning	0
num_media_prompt	(null)
num_media_completion	(null)
num_search_results	(null)
origin	"https://ncbench.com/"
is_byok	false
finish_reason	"stop"
native_finish_reason	"stop"
usage	0.00477
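
The total_cost value can be cross-checked against the native token counts. The sketch below assumes per-token prices of $3 per million input tokens and $15 per million output tokens for this model, and assumes generation_time is in milliseconds; neither is stated in the report, so treat both as assumptions. Under them, the estimate reproduces the reported 0.00477.

```python
# Assumed prices in USD per million tokens; not stated in the report.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost(prompt_tokens, completion_tokens):
    """Estimate the request cost in USD from native token counts."""
    total_micro_usd = (
        prompt_tokens * INPUT_USD_PER_MTOK
        + completion_tokens * OUTPUT_USD_PER_MTOK
    )
    return total_micro_usd / 1_000_000


def tokens_per_second(completion_tokens, generation_time_ms):
    """Output throughput, assuming generation_time is in milliseconds."""
    return completion_tokens / (generation_time_ms / 1000.0)


# native_tokens_prompt = 75, native_tokens_completion = 303
cost = estimate_cost(75, 303)
print(f"${cost:.5f}")  # prints "$0.00477"

# generation_time = 5439
throughput = tokens_per_second(303, 5439)
print(f"{throughput:.1f} tokens/s")
```

The estimate uses the native token counts (75 and 303) rather than tokens_prompt and tokens_completion (68 and 254), since only the native counts reproduce the reported cost under the assumed prices.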