NC Bench
Model: GPT-4o, Aug. 6th (temp=0)
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Run ID: 2024-08-06-01J4MXVV6JW5JBR6F93KBDE2D1
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The dimly lit basement was cluttered with blueprints and half-eaten pizza boxes. Molly leaned over the table, her eyes gleaming with mischief. "Bill, you sure about this? A children's party?" Bill, adjusting his ski mask, grinned. "Think about it, Molly. Who'd suspect a heist at a kid's birthday bash?" Molly chuckled, tapping a finger on the map. "Alright, but how do we get in without raising eyebrows?" Bill pointed to a spot marked 'Clown Entrance.' "We go in as entertainment. You juggle, I do balloon animals." Molly raised an eyebrow. "Balloon animals? You can barely tie your shoelaces." "Details, details," Bill waved dismissively. "We just need to distract them long enough to grab the loot." Molly nodded, her mind racing with possibilities. "And the parents?" Bill shrugged. "We’ll say we’re from the agency. Who questions a clown?" Molly laughed, shaking her head. "This is either genius or the dumbest plan ever." Bill winked. "That's why it's perfect. Now, let's practice our act." Molly picked up a rubber nose, tossing it to Bill. "Alright, partner. Let's make some magic." As they rehearsed their routine, the absurdity of their plan only fueled their excitement. The heist of the century was about to unfold, one balloon animal at a time.
id: "gen-WKq5K8kWqGYtBVqmv9jOuKcgxmia"
total_cost: 0.0031575
upstream_id: "chatcmpl-9tNDG0Tj6Un2ZA44L3bi71Cx8qKfp"
created_at: "2024-08-06T22:45:01.382627+00:00" (Tuesday, August 6th, 2024 at 10:45:01 PM GMT+00:00)
model: "openai/gpt-4o-2024-08-06"
app_id: 182717
streamed: false
cancelled: false
provider_name: "OpenAI"
latency: 3285
moderation_latency: 2219
generation_time: 0
finish_reason: "stop"
tokens_prompt: 68
tokens_completion: 301
native_tokens_prompt: 67
native_tokens_completion: 299
num_media_prompt: (null)
num_media_completion: (null)
origin: "https://ncbench.com/"
usage: 0.0031575
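The reported cost can be cross-checked against the token counts. The sketch below assumes per-token prices of $2.50 per million input tokens and $10.00 per million output tokens for gpt-4o-2024-08-06; those rates are not stated on this page and are an assumption, as is the function name `estimate_cost`. Under that assumption, the native token counts (67 prompt, 299 completion) reproduce the reported 0.0031575, while the non-native counts (68, 301) do not.

```python
# Assumed prices in USD per one million tokens (not given on this page).
PRICE_PER_M_INPUT_USD = 2.50
PRICE_PER_M_OUTPUT_USD = 10.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated request cost in USD for the given token counts."""
    micro_usd = (
        prompt_tokens * PRICE_PER_M_INPUT_USD
        + completion_tokens * PRICE_PER_M_OUTPUT_USD
    )
    return micro_usd / 1_000_000


# native_tokens_prompt and native_tokens_completion from the metadata
native_cost = estimate_cost(67, 299)
# tokens_prompt and tokens_completion from the metadata
normalized_cost = estimate_cost(68, 301)

print(f"native counts:     {native_cost:.7f}")
print(f"normalized counts: {normalized_cost:.7f}")
```

That the native counts line up with `total_cost` suggests billing follows the provider's own tokenizer, though the page itself does not say so.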
Evaluation details

Result: 95.9868%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 208

Result: 0.0000%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 45.19%, Deviation: 35.19%
Meta Data: neededClean: false, wordsTotal: 208, wordsDialogue: 94
47.9934%
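The dialogue figures above follow from simple arithmetic: 94 dialogue words out of 208 total is 45.19%, which is 35.19 percentage points away from the 10% target. The sketch below shows one way such a check could be computed. It is not the benchmark's actual evaluator: counting whitespace-separated tokens as words and treating anything between double quotes as dialogue are assumptions, and the function names are hypothetical. The final 47.9934% also happens to equal the mean of the two evaluator results (95.9868% and 0.0000%), which the last helper illustrates; the page does not confirm that this is how it is derived.

```python
import re


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_pct: float = 10.0) -> tuple[float, float]:
    """Return (dialogue ratio in %, absolute deviation from target in points)."""
    ratio = 100.0 * words_dialogue / words_total if words_total else 0.0
    return round(ratio, 2), round(abs(ratio - target_pct), 2)


def count_words(text: str) -> tuple[int, int]:
    """Return (dialogue words, total words) using naive whitespace splitting."""
    # Normalise curly double quotes so one pattern covers both styles.
    normalised = text.replace("\u201c", '"').replace("\u201d", '"')
    quoted_spans = re.findall(r'"([^"]*)"', normalised)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    return words_dialogue, len(normalised.split())


def mean_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator results, rounded to four decimals."""
    return round(sum(scores) / len(scores), 4)


# Figures reported in the evaluation metadata.
print(ratio_and_deviation(94, 208))

# A tiny illustrative sample (not taken from the model response).
sample = 'Molly smiled. "Ready when you are," she said.'
print(count_words(sample))

print(mean_score([95.9868, 0.0]))
```

A real evaluator would likely need to handle nested or unbalanced quotes and punctuation-only tokens, which this sketch ignores.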