NC Bench
Model
GPT-4o, May 13th (temp=0)
Test
Dialogue tags
Scenario
Write 200 words with 90% dialogue
Run ID
2024-07-19-01J35SJ4G88FZEG66ECVFRQFZX
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
AI
"Alright, Bill, you got the plan?" "Yeah, Molly, but remind me why we're stealing from a children's party again?" "Because, Bill, no one expects it. Plus, those presents are probably worth a fortune." "Right, right. So, we sneak in dressed as clowns?" "Exactly. You got your costume ready?" "Of course. Red nose, big shoes, the whole shebang. What about you?" "Rainbow wig, oversized pants, and a squirting flower. I'm all set." "Perfect. Now, how do we get past the parents?" "Simple. We distract them with balloon animals. You know how to make a giraffe, right?" "Uh, I can manage a snake. Will that do?" "Fine, just keep them busy. I'll handle the presents." "And what if a kid recognizes us?" "Bill, we're clowns. No one will recognize us. Just keep smiling and honking your horn." "Got it. And the getaway?" "Parked the van around the corner. We load up and split in five minutes." "Five minutes? That's tight." "Trust me, Bill. In and out. No one will even notice." "Alright, Molly. Let's do this. For the kids." "For the kids, Bill. And their presents." "Ready?" "Ready. Let's go make some kids cry." "Best heist ever." "Absolutely."
Meta
id
"gen-Pc3TNjlyPZQ6OwG6ziwqW8X8sEz7"
total_cost
0.0047
upstream_id
"chatcmpl-9mjm1W8aidOlbNbfRHAYk6YKVv9Kf"
created_at
"2024-07-19T15:25:30.911992+00:00"
Friday, July 19th, 2024 at 3:25:30 PM GMT+00:00
model
"openai/gpt-4o-2024-05-13"
app_id
182717
streamed
false
cancelled
false
provider_name
"OpenAI"
latency
5508
moderation_latency
201
generation_time
0
finish_reason
"stop"
tokens_prompt
68
tokens_completion
298
native_tokens_prompt
67
native_tokens_completion
291
num_media_prompt
(null)
num_media_completion
(null)
origin
"https://ncbench.com/"
usage
0.0047
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 99.7443% | Matches word count | n/a | neededClean: false, words: 196 |
| 36.7879% | Dialogue to Total Word Ratio | Ratio: 100.00%, Deviation: 10.00% | neededClean: false, wordsTotal: 196, wordsDialogue: 196 |
68.2661%
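The figures above can be roughly reproduced with a short sketch. It assumes that words are whitespace-separated tokens and that dialogue is any text between straight double quotes; the benchmark's actual tokenization and quote handling are not shown on this page. The names `dialogue_stats` and `target_ratio` are hypothetical. The function that maps a deviation to a percentage result (for example, 10.00% deviation to 36.7879%) is not stated here, so it is not reproduced. The last line only checks that the final figure matches the plain mean of the two evaluator results.

```python
import re


def dialogue_stats(text: str, target_ratio: float = 0.90) -> dict:
    """Count words overall and inside straight double quotes (assumed rule)."""
    words_total = len(text.split())
    # Treat every span between a pair of straight double quotes as dialogue.
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


# Small illustrative sample, not taken from the run above.
sample = '"Ready?" asked Molly. "Ready. Time to go."'
stats = dialogue_stats(sample)

# Mean of the two evaluator results reported in the table.
overall = (99.7443 + 36.7879) / 2
```

A response made up entirely of quoted speech, like the one in this run, gives a ratio of 1.0 under these assumptions, which is 0.10 away from the 0.90 target and agrees with the reported "Ratio: 100.00%, Deviation: 10.00%".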