NC Bench
Model
GPT-4o, May 13th (temp=1)
Test
Dialogue tags
Scenario
Write 200 words with 50% dialogue
Run ID
2024-07-19-01J35SS2MVKKDG47PJ35Y8ZRBQ
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill crouched behind the inflatable castle, the distant sound of children’s laughter mixing with the strains of circus music. "We sneak in through the back, grab the gifts, and get out," Molly whispered, her eyes darting around. Bill scratched his head. "Why do we need disguises again?" Molly rolled her eyes. "Because adults don’t belong at a children's party without looking suspicious, Bill." "Right, right. What did you get me?" She yanked a clown wig from her bag. "This. We’ll look like hired entertainment." Bill grimaced. "I hate clowns." "Deal with it. You want the loot, don't you?" He sighed, shoving the wig on his head. "Fine. What's the plan?" She pointed to the pile of presents under a colorful tent. "You distract the kids. I'll swipe the goods." "How do I distract them?" "Make balloon animals or something. You're good with your hands." Bill twiddled his fingers, pondering. "What if they ask for something complicated?" "Then juggle, do a cartwheel, anything! Just keep them busy for five minutes." He nodded reluctantly. "Okay. And your getaway plan?" She grinned, holding up a small device. "Remote control car. We'll load the presents and zoom out unnoticed." Bill chuckled. "You thought of everything, haven’t you?" "Always do. Let's go make some kids cry."
id: "gen-XFwjsKgdG93ipeTxHul0dOPT1U56"
total_cost: 0.004835
upstream_id: "chatcmpl-9mjphMNoftA9pBFxKCQwEPh6Heook"
created_at: "2024-07-19T15:29:21.448577+00:00" (Friday, July 19th, 2024 at 3:29:21 PM GMT+00:00)
model: "openai/gpt-4o-2024-05-13"
app_id: 182717
streamed: false
cancelled: false
provider_name: "OpenAI"
latency: 7992
moderation_latency: 790
generation_time: 0
finish_reason: "stop"
tokens_prompt: 68
tokens_completion: 305
native_tokens_prompt: 67
native_tokens_completion: 300
num_media_prompt: (null)
num_media_completion: (null)
origin: "https://ncbench.com/"
usage: 0.004835
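The recorded total_cost can be cross-checked against the native token counts. The sketch below assumes per-token prices of $5 per million prompt tokens and $15 per million completion tokens; these prices are not stated on this page and are used only because they are consistent with the recorded figure. The function and constant names are hypothetical.

```python
# Assumed prices in USD per token (not stated on this page).
PRICE_PER_PROMPT_TOKEN = 5.00 / 1_000_000
PRICE_PER_COMPLETION_TOKEN = 15.00 / 1_000_000


def estimate_cost(native_prompt_tokens: int, native_completion_tokens: int) -> float:
    """Estimate the request cost in USD from native token counts."""
    return (
        native_prompt_tokens * PRICE_PER_PROMPT_TOKEN
        + native_completion_tokens * PRICE_PER_COMPLETION_TOKEN
    )


# native_tokens_prompt = 67, native_tokens_completion = 300 (from the metadata)
cost = estimate_cost(67, 300)
print(f"{cost:.6f}")
```

Under these assumed prices the estimate agrees with the total_cost and usage fields, which suggests billing is based on the native token counts rather than the tokens_prompt and tokens_completion values.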
Evaluation details
Result | Evaluator | Details | Meta Data
75.1556% | Matches word count | n/a | neededClean: false; words: 213
12.8196% | Dialogue to Total Word Ratio | Ratio: 61.97%, Deviation: 11.97% | neededClean: false; wordsTotal: 213; wordsDialogue: 132
43.9876%
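The reported ratio and deviation follow directly from the word counts above, and the final percentage matches the arithmetic mean of the two evaluator results. The sketch below reproduces that arithmetic. It also includes a simple quote-based dialogue counter, which is only an assumption about how such an evaluator might count words: the actual NC Bench tokenization rules are not shown here, so it may not reproduce the 132 and 213 counts exactly. How each deviation is mapped to a percentage score is not documented on this page and is not modeled.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue delimiter)."""
    return sum(len(span.split()) for span in re.findall(r'"([^"]*)"', text))


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of all words."""
    return 100.0 * words_dialogue / words_total


# Values taken from the evaluation details above.
ratio = dialogue_ratio(132, 213)
deviation = abs(ratio - 50.0)        # target was 50% dialogue
overall = (75.1556 + 12.8196) / 2    # mean of the two evaluator results

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%, Overall: {overall:.4f}%")

# Small illustration of the quote-based counter on a fragment of the response.
sample = 'Bill grimaced. "I hate clowns." "Deal with it."'
print(count_dialogue_words(sample), count_words(sample))
```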