NC Bench
Run: 2024-07-23-01J3G80KGQM95TZS5FY1VH3M7Y
Run Details
Model: Llama 3.1 70B
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 3.3s
Cost: $0.000313
Provider: Fireworks
Evaluation summary
Score | Evaluator
97.63% | Matches word count
36.79% | Dialogue to Total Word Ratio
67.2078% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"I'm telling you, Bill, this is a solid plan. A kids' party, no security to speak of, and a room full of sugar-high obliviousness." "Obviousness, not obliviousness, Molly. And I'm not sure I'm comfortable with this. It's a bunch of kids' birthday presents. It's not exactly the score of a lifetime." "Come on, it's a score, nonetheless. And have you seen the sizes of those gift boxes? Those kids are probably getting iPads and bicycles. We can make a killing on eBay." "But what if we get caught? I don't want to spend my twenty-fifth birthday in a kiddie cellblock." "We won't get caught. We'll be in and out in ten minutes. I've got a disguise for us – who's going to suspect the piñata and the clown?" "Aren't you the one who always says, 'Don't dress as a clown, you're just asking for trouble'?" "Desperate times, Bill. Desperate times. Besides, I've got a great idea for a squirt gun and a whoopee cushion." "You're not taking this seriously, are you?" "I'm taking it very seriously. I'm taking thirty-five percent of the profits, and you're getting the rest. Now, are you in or out?"
id: "gen-chB8paFzGISv7ktcmBKxLclBjMud"
total_cost: 0.0003132
upstream_id: "5aae525c-bc1e-4c74-bd2c-639780c1f1e6"
created_at: "2024-07-23T16:50:26.95566+00:00" (7/23/2024, 4:50:26 PM)
model: "meta-llama/llama-3.1-70b-instruct"
app_id: 182717
streamed: false
cancelled: false
provider_name: "Fireworks"
latency: 3103
moderation_latency: (null)
generation_time: 0
finish_reason: "stop"
tokens_prompt: 102
tokens_completion: 276
native_tokens_prompt: 70
native_tokens_completion: 278
num_media_prompt: (null)
num_media_completion: (null)
origin: "https://ncbench.com/"
usage: 0.0003132
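The reported total_cost is consistent with flat per-token pricing applied to the native token counts. The sketch below reproduces that arithmetic. The rate of $0.90 per million tokens is an assumption (it does not appear anywhere in the run data), and the function name is hypothetical, so treat this as a plausibility check rather than the provider's actual billing logic.

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  usd_per_million_tokens: float) -> float:
    """Estimate request cost under a flat per-token price (hypothetical helper)."""
    total_tokens = prompt_tokens + completion_tokens
    return total_tokens * usd_per_million_tokens / 1_000_000


# Native token counts taken from the run metadata.
NATIVE_TOKENS_PROMPT = 70
NATIVE_TOKENS_COMPLETION = 278

# ASSUMPTION: one flat rate for prompt and completion tokens.
# This value is not stated in the run data.
ASSUMED_USD_PER_MILLION = 0.90

estimated = estimate_cost(
    NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION, ASSUMED_USD_PER_MILLION
)
print(f"{estimated:.7f}")  # prints 0.0003132
```

Under that assumed rate, 348 native tokens come to $0.0003132, which matches the total_cost and usage fields. The normalized counts (tokens_prompt 102, tokens_completion 276) would give a different figure, which suggests billing is based on the native counts.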
Evaluation details
Result | Evaluator | Details | Meta Data
97.63% | Matches word count | n/a | neededClean: false, words: 193
36.79% | Dialogue to Total Word Ratio | Ratio: 100.00%, Deviation: 10.00% | neededClean: false, wordsTotal: 196, wordsDialogue: 196
67.2078% | Overall
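The ratio and deviation figures can be reproduced from the evaluator metadata. The sketch below does that and also checks one plausible reading of the Overall score. The helper names are hypothetical, the 90% target comes from the scenario text, and treating Overall as the unweighted mean of the two evaluator scores is an assumption that happens to fit the numbers. The page does not say how a 10% deviation maps to a 36.79% score, so that step is left out.

```python
from statistics import mean


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of words that fall inside dialogue (hypothetical helper)."""
    return words_dialogue / words_total if words_total else 0.0


def deviation(actual: float, target: float) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(actual - target)


# Values from the evaluator metadata.
WORDS_DIALOGUE = 196
WORDS_TOTAL = 196
TARGET_RATIO = 0.90  # the scenario asks for 90% dialogue

ratio = dialogue_ratio(WORDS_DIALOGUE, WORDS_TOTAL)
dev = deviation(ratio, TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {dev:.2%}")  # Ratio: 100.00%, Deviation: 10.00%

# ASSUMPTION: Overall is the unweighted mean of the two evaluator scores.
# The page does not document the aggregation method.
overall = mean([0.9763, 0.3679])
print(f"Overall: {overall:.2%}")  # Overall: 67.21%
```

The mean of the two rounded scores is 67.21%, which agrees with the reported 67.2078% to two decimal places. The response contained no narration at all, so it overshot the 90% dialogue target rather than falling short of it.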