NC Bench
Model: GPT-4o, Aug. 6th (temp=1)
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Run ID: 2024-08-06-01J4MXVP7XHBSBMHG62HWNHXWG
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Alright, Molly, what's the plan?" "Simple, Bill. We blend in, act like clumsy clowns. Kids love clowns." "Clowns? Really? I hate clowns." "Yeah, but you look like one naturally." "Very funny. Okay, what's next?" "Distraction. Got a squeaky nose and spinny bow tie ready?" "Squeaky nose? Where’d you get this stuff?" "Online shopping has everything. Now, focus. We head to the gift table." "What about the cake? A heist ain’t a heist without cake." "The cake is the diversion. I’ll pretend to trip, faceplant right into it." "You know, I always thought you had good aim." "Just do your part, Bill. Sneak the presents out in the chaos." "How am I hauling twenty gift bags out?" "Stash them in giant balloons. Hide in plain sight!" "Okay, but what if a kid wants a balloon?" "Then you distract with balloon animals. Can you make a giraffe?" "Nope, but I can make… something that sort of looks like a snake?" "Perfect. Snakes are in right now." "You’re deluded. And what if adults get suspicious?" "Charm ‘em with your accordion skills." "You know I can’t play." "Exactly why it’s charming, Bill." "So masterfully terrible it’s genius." "Just wait till they see our grand escape." "What's that, a rocket-powered pogo stick?" "You know me so well."
id: "gen-04KWJgeWALWOaYoz2yObyrFyMlAS"
total_cost: 0.0035675
upstream_id: "chatcmpl-9tND9BUZrkMSO59dVZiKLXuq1ABR6"
created_at: "2024-08-06T22:44:54.763219+00:00" (Tuesday, August 6th, 2024 at 10:44:54 PM GMT+00:00)
model: "openai/gpt-4o-2024-08-06"
app_id: 182717
streamed: false
cancelled: false
provider_name: "OpenAI"
latency: 3671
moderation_latency: 286
generation_time: 0
finish_reason: "stop"
tokens_prompt: 68
tokens_completion: 339
native_tokens_prompt: 67
native_tokens_completion: 340
num_media_prompt: (null)
num_media_completion: (null)
origin: "https://ncbench.com/"
usage: 0.0035675
Evaluation details
Result: 75.1556%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean = false, words = 213

Result: 36.7879%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 100.00%, Deviation: 10.00%
Meta Data: neededClean = false, wordsTotal = 213, wordsDialogue = 213

55.9718%
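The figures in the evaluation details can be checked with a little arithmetic. The sketch below is not NC Bench's evaluator code. It only recomputes the reported ratio and deviation from the word counts, using the 90% target from the scenario. It also assumes, because the page does not label the final 55.9718% figure, that this figure is the plain mean of the two evaluator results. The function and variable names are illustrative. How the evaluator turns a 10% deviation into a 36.7879% result is not stated on the page and is not reconstructed here.

```python
# Values copied from the evaluation details above.
WORDS_TOTAL = 213
WORDS_DIALOGUE = 213
TARGET_RATIO = 0.90  # from the scenario: "90% dialogue"


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of words that are dialogue; 0.0 for an empty response."""
    if words_total == 0:
        return 0.0
    return words_dialogue / words_total


ratio = dialogue_ratio(WORDS_DIALOGUE, WORDS_TOTAL)
deviation = abs(ratio - TARGET_RATIO)

# Assumption (not stated on the page): the unlabeled final figure is the
# arithmetic mean of the two evaluator results.
evaluator_scores = [75.1556, 36.7879]
mean_score = sum(evaluator_scores) / len(evaluator_scores)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
print(f"Mean of evaluator scores: {mean_score:.2f}%")
```

Under these assumptions, the ratio and deviation match the reported "Ratio: 100.00%, Deviation: 10.00%", and the mean agrees with 55.9718% to within rounding. The response contains no narration at all, so it overshoots the 90% dialogue target.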