NC Bench
Run: 2026-05-01-01KQHQHVJ4SA78TERBMVG1KQAM
Run Details
Model: Grok 4.20 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 2m 7s
Cost: $0.027198
Provider: xAI
Evaluation summary
Score | Evaluator
0.01% | Matches word count
75.54% | Dialogue to Total Word Ratio
37.7763% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The two thieves huddled in the back of their unmarked van parked a block away from the bustling children's party venue. Brightly colored balloons could be seen tied to the mailbox in the distance, and high-pitched laughter echoed from the backyard where games were already underway. Molly unfolded a brightly colored map of the house layout while Bill adjusted his oversized clown wig, already complaining under his breath about the itchy fabric. The plan they were concocting was bold, ridiculous, and potentially their most memorable heist if they could pull it off without causing a scene filled with tiny tantrums and screaming parents. Years of thievery had led them to this moment, where the ultimate score was a pile of shiny new toys and electronics wrapped in festive paper. They had spent days casing the joint, noting the schedules of the hired entertainers and the layout of the backyard playset that could provide excellent cover for their sneaky maneuvers. "Molly, are we really doing this? Stealing presents from kids?" Bill asked. "Yes, and it's going to be hilarious. Those gifts are ours for the taking," Molly replied. "But the clownsuits? I feel silly," he said. "That's the point. No one suspects the clowns. During the cake cutting we make our move on the table," she explained. "What if a kid sees us?" Bill wondered. "We offer them a balloon animal and tell them it's magic. Now focus on the exit strategy," Molly said. "The van better be ready. This could go wrong fast," he warned. "It won't. Trust me, this heist will be comedy for the ages," she assured. "Remember to act natural. Laugh a lot and honk the nose," Molly added. "I can't believe I'm the one with the red nose. Why can't I be the magician?" Bill complained. "Because you have the perfect sad clown face. It'll distract everyone," Molly chuckled. "Fine but if we get recognized it's your fault. No more kids parties after this," Bill stated. 
With the dialogue out of the way, the pair continued to review their strategy in depth. They discussed how Molly would distract the partygoers with her juggling skills while Bill snuck around the back to load the presents into large gift bags they had prepared. The narrative of their plan included using a remote controlled car to create a diversion on the lawn, drawing all the children away from the gift area. They had even prepared fake invitations in case they needed to bluff their way past any watchful eyes at the door. The absurdity of the situation was not lost on them, and occasional bursts of laughter punctuated their serious planning session. As they donned the rest of their ridiculous costumes, complete with face paint and oversized shoes, they felt a strange mix of nerves and excitement. This wasn't just any heist; it was a caper that would either make them legends or the butt of jokes in the criminal underworld for years to come. Checking their watches, they knew it was almost time to execute. The presents awaited, and with one final nod to each other, they stepped out of the van ready to crash the party in style.
created_at: "2026-05-01T12:17:26.6Z" (5/1/2026, 12:17:26 PM)
model: "x-ai/grok-4.20-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 122291
moderation_latency: (null)
generation_time: 127155
tokens_prompt: 77
tokens_completion: 2383
native_tokens_prompt: 184
native_tokens_completion: 10814
native_tokens_completion_images: (null)
native_tokens_reasoning: 10189
native_tokens_cached: 64
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0271978
router: (null)
provider_responses:
  0:
    endpoint_id: "3ecee37f-b217-4093-87fb-aaf0afe307af"
    id: "a2718654-1250-e172-89df-2889fe3d7b94"
    is_byok: false
    latency: 120
    model_permaslug: "x-ai/grok-4.20-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777637846-brsnGR8gfnHmGwAdRf7M"
session_id: (null)
api_type: "completions"
id: "gen-1777637846-F4zrn5xkBR81mhpVneXw"
upstream_id: "a2718654-1250-e172-89df-2889fe3d7b94"
total_cost: 0.0271978
cache_discount: 0.0000672
upstream_inference_cost: 0
provider_name: "xAI"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.01% | Matches word count | n/a | neededClean: false, words: 531
75.54% | Dialogue to Total Word Ratio | Ratio: 27.63%, Deviation: 2.37% | neededClean: false, wordsTotal: 532, wordsDialogue: 147
37.7763% | Overall
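The ratio and deviation reported for the dialogue evaluator can be checked from the word counts in its metadata. The sketch below is a hypothetical reconstruction and not NC Bench's actual evaluator code: the function names are invented for illustration, the 30% target is taken from the scenario description, and treating the overall score as the plain mean of the two evaluator scores is an assumption that happens to agree with the reported 37.7763% to within rounding. How each evaluator maps its measurement to a score (for example, a 2.37-point deviation to 75.54%) is not stated on the page, so that step is left out.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of words that are dialogue, as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float) -> float:
    """Absolute distance, in percentage points, from the requested ratio."""
    return abs(ratio_pct - target_pct)


def mean_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores (assumed aggregation rule)."""
    return sum(scores) / len(scores)


# Counts reported in the evaluator metadata for this run.
ratio = dialogue_ratio(words_dialogue=147, words_total=532)

# Target of 30% comes from the scenario "Write 500 words with 30% dialogue".
deviation = deviation_from_target(ratio, target_pct=30.0)

# Mean of the two rounded evaluator scores shown on the page.
overall = mean_score([0.01, 75.54])

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
print(f"Overall (mean of rounded scores): {overall:.3f}%")
```

The mean of the rounded scores is 37.775%, slightly below the reported 37.7763%, which is consistent with the page averaging unrounded values. The two evaluators also report different totals (531 and 532 words), so they presumably tokenize the response slightly differently.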