Model

GPT-4o, Aug. 6th (temp=1)

Test

Dialogue tags

Scenario

Write 200 words with 90% dialogue

Run ID

2024-08-06-01J4MXVP7XHBSBMHG62HWNHXWG

Model input

User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.

Model response

AI
"Alright, Molly, what's the plan?" "Simple, Bill. We blend in, act like clumsy clowns. Kids love clowns." "Clowns? Really? I hate clowns." "Yeah, but you look like one naturally." "Very funny. Okay, what's next?" "Distraction. Got a squeaky nose and spinny bow tie ready?" "Squeaky nose? Where’d you get this stuff?" "Online shopping has everything. Now, focus. We head to the gift table." "What about the cake? A heist ain’t a heist without cake." "The cake is the diversion. I’ll pretend to trip, faceplant right into it." "You know, I always thought you had good aim." "Just do your part, Bill. Sneak the presents out in the chaos." "How am I hauling twenty gift bags out?" "Stash them in giant balloons. Hide in plain sight!" "Okay, but what if a kid wants a balloon?" "Then you distract with balloon animals. Can you make a giraffe?" "Nope, but I can make… something that sort of looks like a snake?" "Perfect. Snakes are in right now." "You’re deluded. And what if adults get suspicious?" "Charm ‘em with your accordion skills." "You know I can’t play." "Exactly why it’s charming, Bill." "So masterfully terrible it’s genius." "Just wait till they see our grand escape." "What's that, a rocket-powered pogo stick?" "You know me so well."
id: "gen-04KWJgeWALWOaYoz2yObyrFyMlAS"
total_cost: 0.0035675
upstream_id: "chatcmpl-9tND9BUZrkMSO59dVZiKLXuq1ABR6"
created_at: "2024-08-06T22:44:54.763219+00:00" (Tuesday, August 6th, 2024 at 10:44:54 PM GMT+00:00)
model: "openai/gpt-4o-2024-08-06"
app_id: 182717
streamed: false
cancelled: false
provider_name: "OpenAI"
latency: 3671
moderation_latency: 286
generation_time: 0
finish_reason: "stop"
tokens_prompt: 68
tokens_completion: 339
native_tokens_prompt: 67
native_tokens_completion: 340
num_media_prompt: (null)
num_media_completion: (null)
origin: "https://ncbench.com/"
usage: 0.0035675
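
The recorded total_cost can be sanity-checked against the native token counts. The sketch below assumes per-token rates of $2.50 per million prompt tokens and $10.00 per million completion tokens; these rates are an assumption and do not appear in the record. The function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

# Assumed rates in USD per one million tokens (not stated in the run record).
PROMPT_RATE_PER_M = Decimal("2.50")
COMPLETION_RATE_PER_M = Decimal("10.00")


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the request cost in USD from native token counts."""
    total = prompt_tokens * PROMPT_RATE_PER_M + completion_tokens * COMPLETION_RATE_PER_M
    return total / Decimal(1_000_000)


# native_tokens_prompt and native_tokens_completion from the record above.
estimated = estimate_cost(67, 340)
recorded = Decimal("0.0035675")  # total_cost from the record
print(estimated, estimated == recorded)
```

Under those assumed rates, the estimate agrees with the recorded total_cost and usage values, which suggests billing was based on the native token counts (67 and 340) rather than the normalized ones (68 and 339).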

Evaluation details

Result | Evaluator | Details | Meta Data
75.1556% | Matches word count | n/a | neededClean: false, words: 213
36.7879% | Dialogue to Total Word Ratio | Ratio: 100.00%, Deviation: 10.00% | neededClean: false, wordsTotal: 213, wordsDialogue: 213
55.9718%
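
The ratio and deviation figures in the evaluation can be illustrated with a minimal sketch. It assumes whitespace tokenization, double quotation marks as the only dialogue delimiters, and deviation measured as the absolute difference from the 90% target. The benchmark's actual tokenizer and scoring function are not documented in this record, so this sketch may not reproduce the recorded word count exactly, and it does not attempt to reproduce the percentage scores. The function name `dialogue_stats` is hypothetical; the output keys mirror the record's field names.

```python
import re


def dialogue_stats(text: str, target_ratio: float = 0.90) -> dict:
    """Count total and quoted words, then compare the ratio to a target.

    Assumptions (not taken from the benchmark): words are whitespace-separated
    tokens, and dialogue is any text between a pair of double quotes.
    """
    # Treat curly double quotes the same as straight ones.
    normalized = text.replace("\u201c", '"').replace("\u201d", '"')

    words_total = len(normalized.split())

    # Collect the text inside each pair of double quotes.
    spoken_segments = re.findall(r'"([^"]*)"', normalized)
    words_dialogue = sum(len(segment.split()) for segment in spoken_segments)

    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


# A scene made only of quoted speech, like the response in this run.
all_dialogue = dialogue_stats('"What\'s the plan?" "We blend in."')

# A scene with a short narration tag between the quoted lines.
mixed = dialogue_stats('"What\'s the plan?" asked Bill. "We blend in."')

print(all_dialogue)
print(mixed)
```

A response consisting entirely of quoted speech yields a ratio of 100%, which is 10 percentage points above the 90% target. This matches the "Ratio: 100.00%, Deviation: 10.00%" detail recorded for this run, where wordsDialogue equals wordsTotal.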