Model

GPT-4o, May 13th (temp=1)

Test

Dialogue tags

Scenario

Write 200 words with 90% dialogue

Run ID

2024-07-19-01J35ST6EQYQEZTR10AG9TW5D7

Model input

User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.

Model response

AI
“Bill, you got the map?” “Got it right here, Molly. Back door's unguarded at 3 p.m.” “Excellent. What about the distraction?” “Clown’s ready. He’s got a water squirter and everything.” “Perfect. Now, the kids?” “All sugared up by 2:30. They won’t know what hit them.” “You sure about this, Bill? Stealing from a children’s party?” “Ah, come on, Molly. Ever heard of Robin Hood?” “Sure, but he didn’t exactly steal toys.” “Well, these are high-end, top-dollar toys. We could sell ‘em off, make a fortune.” “And how do we get out?” “Simple. Crouch down, blend in with the crab walk race.” “Genius. But what if we get caught?” “We won’t. I’ve memorized all the escape routes. And remember, we’ve got our backup plan.” “The piñata offer still stands?” “Filled with smoke bombs and glitter. Kids will love it.” “Alright, Bill. What about the parents?” “Busy with party games. Plus, I've scheduled a balloon animal show at the same time.” “This is starting to sound foolproof.” “Because it is. Just stick to the plan, Molly.” “And you got the costumes?” “Matching dinosaur onesies. No one will suspect a thing.” “Great. Let’s make history, partner.” “High-fives all around then. Tomorrow, 2:45. Don't be late.” “I’ll bring the cake knife.” “For the cake, right?” “Sure, Bill. Sure.”
id: "gen-8gS76lIEGKGRwi8N2870oQmAaMAR"
total_cost: 0.005375
upstream_id: "chatcmpl-9mjqH2u0XSDJ1LQCm3tM0udXJyJ9x"
created_at: "2024-07-19T15:29:57.768285+00:00" (Friday, July 19th, 2024 at 3:29:57 PM GMT+00:00)
model: "openai/gpt-4o-2024-05-13"
app_id: 182717
streamed: false
cancelled: false
provider_name: "OpenAI"
latency: 8263
moderation_latency: 172
generation_time: 0
finish_reason: "stop"
tokens_prompt: 68
tokens_completion: 336
native_tokens_prompt: 67
native_tokens_completion: 336
num_media_prompt: (null)
num_media_completion: (null)
origin: "https://ncbench.com/"
usage: 0.005375
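
As a sanity check on the cost fields, the reported total_cost can be reproduced from the native token counts. The sketch below assumes list prices of 5 USD per million prompt tokens and 15 USD per million completion tokens for this model; those prices are not stated in this report, and the function name is hypothetical.

```python
from decimal import Decimal

# Assumed per-token prices in USD (not given in the report itself).
PROMPT_PRICE = Decimal("5") / Decimal("1000000")
COMPLETION_PRICE = Decimal("15") / Decimal("1000000")


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the request cost from token counts and the assumed prices."""
    return prompt_tokens * PROMPT_PRICE + completion_tokens * COMPLETION_PRICE


# native_tokens_prompt and native_tokens_completion from the run metadata.
cost = estimate_cost(67, 336)
print(cost)
```

Under these assumed prices the estimate agrees with the total_cost and usage values of 0.005375. Using tokens_prompt (68) instead of native_tokens_prompt (67) would not give the same figure.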

Evaluation details

Result | Evaluator | Details | Meta Data
51.9255% | Matches word count | n/a | neededClean: false, words: 216
36.7879% | Dialogue to Total Word Ratio | Ratio: 100.00%, Deviation: 10.00% | neededClean: false, wordsTotal: 216, wordsDialogue: 216

44.3567%
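
The dialogue ratio and deviation in the evaluation can be approximated with a few lines of Python. The evaluator's real implementation is not shown in this report, so everything here is an assumption: dialogue is taken to be any text inside double quotes (curly or straight), words are whitespace-separated tokens, the target ratio is the 90% from the scenario, and the deviation is the absolute difference from that target. All function names are hypothetical.

```python
import re

# Assumption: dialogue is whatever sits between double quotes, curly or straight.
QUOTED = re.compile(r'“([^”]*)”|"([^"]*)"')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def dialogue_ratio(text: str) -> float:
    """Fraction of all words that fall inside quoted dialogue."""
    total = count_words(text)
    if total == 0:
        return 0.0
    spoken = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    return spoken / total


def deviation(ratio: float, target: float = 0.90) -> float:
    """Absolute distance from the target ratio (assumed definition)."""
    return abs(ratio - target)


# A short excerpt in the style of the response: every word is dialogue.
sample = "“Bill, you got the map?” “Got it right here, Molly.”"
ratio = dialogue_ratio(sample)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation(ratio):.2%}")

# The final figure is consistent with an unweighted mean of the two scores.
overall = (51.9255 + 36.7879) / 2
print(f"{overall:.4f}%")
```

A response made up entirely of quoted speech, like the one in this run, gets a ratio of 100% and therefore misses the 90% target by 10 points, which matches the reported details. How the evaluator turns a deviation into a percentage score is not stated in the report and is not modeled here.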