NC Bench
Run: 2026-05-01-01KQHT57DB2B071KKQ0EF6ASCD
Run Details
Model: Grok 4.20 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 1m 52s
Cost: $0.017043
Provider: xAI
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.00%
Overall: 0.0000%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The dimly lit basement served as their makeshift headquarters, maps and photos of the target house pinned to the wall like some deranged party planner's fever dream. Children's drawings and party invitations were scattered everywhere. Molly paced back and forth while Bill sat at the table, twirling a fake mustache between his fingers and looking far too pleased with himself. "Alright Bill, let's go over this one more time. We are actually going to crash a children's birthday party and take every single present. This isn't our usual jewelry store or museum gig. These are kids, for crying out loud. What kind of thieves are we turning into?" Molly said as she stopped pacing and stared at him intently. Bill looked up with a mischievous grin, his eyes sparkling with the thrill of the upcoming caper. "Exactly, Molly. That's what makes it brilliant. No one expects thieves at a kiddie party. The security is nonexistent. While the little ones are playing pin the tail on the donkey, we'll be pinning the presents to our getaway sled. It's like taking candy from a baby, but the candy is expensive electronics and toys worth thousands on the black market." Molly threw her hands up in exasperation. "Taking candy from a baby is exactly what it is. I feel terrible already. What if it's a little girl's dream dollhouse or a boy's first video game console? We're going to be on the naughty list forever. How do we even blend in? We don't look like typical party guests with your beer gut and my resting criminal face." "Simple. We arrive as the hired clown and assistant. I'll juggle and tell bad jokes to keep the adults laughing and the kids entertained while you use your nimble fingers to slip the presents into the large pockets of our oversized costumes. 
Then during the cake cutting when everyone is gathered in the dining room we make our move to the living room where the gifts are piled high like a beautiful shiny mountain," Bill detailed, standing up to demonstrate a silly juggling motion with imaginary balls. She sat down, rubbing her temples as she considered the plan. The photos showed a large pile of wrapped boxes with bright bows that practically screamed steal me. It did look tempting. But the moral implications weighed on her like an anvil made of tiny judgmental eyes. "Okay, say we get the presents. How do we carry them all out without looking suspicious? And the getaway? I hope you don't say in the car with balloons tied to it because that is exactly the kind of half-baked nonsense I'd expect from you right now," Molly inquired, her tone skeptical yet intriguingly hooked despite herself. Bill laughed heartily. "The getaway is the best part. We have the van disguised as a mobile petting zoo truck. We'll have a couple of stuffed animals in the back. If anyone asks, we're delivering a pony for the birthday girl. By the time they realize the presents are missing we'll be halfway to the fence with our haul. No one will suspect the clowns because clowns are supposed to be chaotic." "This is insane. But I suppose if we're doing it we need to make sure we leave a note or something so the kid doesn't think Santa forgot her. Maybe replace one with a cheap toy from the dollar store so we don't completely scar them for life," Molly suggested, starting to warm to the absurdity of their grand scheme. "See? Now you're thinking like a pro. A note saying the presents are on an adventure or something. It's comedic gold. They'll talk about this party for years. The great present heist by the infamous duo, Molly and Bill, legends in the underground comedy criminal circuit," he said, slapping the table in excitement. The clock on the wall ticked loudly, reminding them that the party would start in less than an hour. 
They gathered their costumes and tools, the nervous energy building between them as distant party music began to drift through the small basement window. "Let's just try not to traumatize any children in the process. If a kid catches us, we give him a balloon animal and run like hell," Molly added as they headed for the door. "Deal. Balloon animals for all eyewitnesses. This is going to be our funniest heist yet," Bill concluded with a chuckle.
created_at: "2026-05-01T13:02:58.479Z" (5/1/2026, 1:02:58 PM)
model: "x-ai/grok-4.20-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 103506
moderation_latency: (null)
generation_time: 112189
tokens_prompt: 77
tokens_completion: 2831
native_tokens_prompt: 184
native_tokens_completion: 6752
native_tokens_completion_images: (null)
native_tokens_reasoning: 5904
native_tokens_cached: 64
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0170428
router: (null)
provider_responses
0
endpoint_id: "3ecee37f-b217-4093-87fb-aaf0afe307af"
id: "05f5a4c8-d9f8-9952-900c-830690ed8aec"
is_byok: false
latency: 65
model_permaslug: "x-ai/grok-4.20-20260309"
provider_name: "xAI"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777640578-UJqXIuF9zbU9zd4r4Xdn"
session_id: (null)
api_type: "completions"
id: "gen-1777640578-ENTOsh3yHlxxkisIPiVc"
upstream_id: "05f5a4c8-d9f8-9952-900c-830690ed8aec"
total_cost: 0.0170428
cache_discount: 0.0000672
upstream_inference_cost: 0
provider_name: "xAI"
response_cache_source_id: (null)
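The timing and token fields above can be cross-checked against the run details with a little arithmetic. The sketch below is illustrative only: it assumes that generation_time is reported in milliseconds and that native_tokens_reasoning is counted inside native_tokens_completion, neither of which the page states. The function names format_duration and reasoning_share are hypothetical helpers, not part of any NC Bench or provider API.

```python
def format_duration(ms: int) -> str:
    """Render a millisecond count as 'Xm Ys' (assumes the input is in ms)."""
    total_seconds = ms // 1000  # drop the fractional second
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


def reasoning_share(reasoning_tokens: int, completion_tokens: int) -> float:
    """Percentage of completion tokens spent on reasoning, to one decimal.

    Assumption: reasoning tokens are a subset of completion tokens.
    """
    return round(100 * reasoning_tokens / completion_tokens, 1)


if __name__ == "__main__":
    # Values copied from the metadata of this run.
    print(format_duration(112189))
    print(reasoning_share(5904, 6752))
```

Under the millisecond assumption, a generation_time of 112189 formats to the same 1m 52s shown as the run's Duration.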
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
neededClean: false
words: 737
Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 67.07%, Deviation: 17.07%
neededClean: false
wordsTotal: 738
wordsDialogue: 495
Overall: 0.0000%
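The reported ratio and deviation follow from the two word counts and the 50% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. It is a sketch under stated assumptions, not the benchmark's evaluator: the whitespace tokenization, the straight-double-quote rule for dialogue, and the names ratio_from_counts and dialogue_stats are all hypothetical, and how the deviation maps to the final score is not shown on this page.

```python
import re


def ratio_from_counts(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return (ratio %, deviation %) rounded to two decimals."""
    ratio = words_dialogue / words_total
    return round(ratio * 100, 2), round(abs(ratio - target) * 100, 2)


def dialogue_stats(text: str, target: float = 0.5) -> dict:
    """Count words inside straight double quotes versus all words.

    Assumptions: words are whitespace-separated, and dialogue is any text
    between a pair of straight double quotes.
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target),
    }


if __name__ == "__main__":
    # Counts copied from this run's evaluation details.
    print(ratio_from_counts(495, 738))
    print(dialogue_stats('Molly sighed. "We rob a party," she said.'))
```

With the counts from this run (495 dialogue words out of 738), the first call yields 67.07% and 17.07%, matching the Details line.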