NC Bench
Run: 2026-05-01-01KQHQENWQ2SRYYYBC5DP066Q3
Run Details
Model: Grok 4.20 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 1m 18s
Cost: $0.017108
Provider: xAI
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
58.52% | Dialogue to Total Word Ratio | (details)
29.2584% | Overall |
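The Overall figure looks consistent with an unweighted mean of the two evaluator scores. The sketch below checks that arithmetic. The averaging rule is an assumption inferred from the numbers shown, not something the page states, and the dictionary keys are simply the evaluator names from the table.

```python
from statistics import mean

# Evaluator scores as displayed in the summary (percent).
# Assumption: the overall score is a plain, unweighted mean of these.
scores = {
    "Matches word count": 0.0,
    "Dialogue to Total Word Ratio": 58.52,
}

overall = mean(scores.values())
print(f"{overall:.2f}%")  # 29.26%
```

The page reports 29.2584% rather than 29.26%. If the plain-mean assumption holds, that would imply the ratio score is about 58.5168% before rounding for display.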
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The two thieves sat in a cluttered van parked down the street from the target house, the sounds of laughter and party music drifting through the air. Molly adjusted her colorful wig while Bill checked the large sacks they planned to use for the loot. They spoke in hushed but animated tones as they finalized their outrageous plan. "Bill, this is it. The ultimate heist. Those presents won't know what hit them," Molly whispered excitedly. "Are you kidding me, Molly? We're about to ruin some kid's birthday. I feel like the grinch over here," Bill responded with a sigh. "Oh come on, it's not ruining if we do it with style. We'll blend in, steal the goods, and be gone before the candles are blown out. Now, you remember the signal?" Molly asked while applying more makeup to her face. "Yeah, the signal is when the clown starts his act. But clowns give me the creeps. Why does it have to be a clown party?" Bill said as he fidgeted with his fake mustache. "Because kids love clowns, Bill. Keep your eyes on the prize. The presents are likely piled high next to the cake table. I'll distract the adults with some lame jokes while you swoop in and bag them all," Molly said, her eyes sparkling with mischief. "But how do we carry all that without looking suspicious? There could be fifty presents in there for all we know. And what if there's a pet pony or something? This is a rich kid party," Bill questioned, peering out the window at the balloons tied to the mailbox. "No pony, I checked the invitation. We use the wagon we brought, cover it with a tablecloth and pretend it's part of the decoration. Roll it right out the front door. Simple as pie," Molly replied confidently, checking her bag one last time. "Simple? Nothing about crashing a children's party is simple, Molly. What if one of the kids recognizes us from that mall Santa incident last year?" Bill said, his voice rising slightly in panic as he glanced around nervously. "Relax, we were in full Santa suits then. 
Today we're generic party entertainers. Just stick to the plan. When I yell 'who wants to see a magic trick' that's your cue to start loading," Molly instructed firmly as she hid the last of her tools in her oversized pockets. "Magic trick. Right. I'll pull out a deck of cards or something. But seriously, after this are we really selling teddy bears and action figures on the black market? It seems beneath us," Bill muttered while shaking his head in disbelief. "Beneath us? Bill, this is genius. No security, no alarms, just lots of noise and distraction. The parents will be too busy herding the kids to notice until it's too late. Then we split the profits and maybe buy ourselves a nice dinner," Molly said with a mischievous grin spreading across her face. "Alright, you convinced me. But if we get chased by a bunch of angry six year olds, I'm blaming you entirely," Bill conceded with a reluctant smile as he zipped up his jacket. "That's the spirit. Now let's synchronize our watches. It's almost showtime for the greatest present heist of all time," Molly declared as they prepared to exit the van and embark on their comedic caper.
created_at
"2026-05-01T12:15:42.496Z"
5/1/2026, 12:15:42 PM
model
"x-ai/grok-4.20-20260309"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
72864
moderation_latency
(null)
generation_time
77613
tokens_prompt
77
tokens_completion
2478
native_tokens_prompt
184
native_tokens_completion
6778
native_tokens_completion_images
(null)
native_tokens_reasoning
6105
native_tokens_cached
64
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0171078
router
(null)
provider_responses
0
endpoint_id
"3ecee37f-b217-4093-87fb-aaf0afe307af"
id
"697a2ee7-7eb5-9e0a-a06c-051451d9f2cc"
is_byok
false
latency
120
model_permaslug
"x-ai/grok-4.20-20260309"
provider_name
"xAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777637742-59hAnWEJXFob0OHVWWri"
session_id
(null)
api_type
"completions"
id
"gen-1777637742-6GRQSryZyuMJqKcvYNXh"
upstream_id
"697a2ee7-7eb5-9e0a-a06c-051451d9f2cc"
total_cost
0.0171078
cache_discount
0.0000672
upstream_inference_cost
0
provider_name
"xAI"
response_cache_source_id
(null)
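A few derived figures make the metadata above easier to read. This is a minimal sketch using only values copied from the fields listed. It assumes that generation_time is in milliseconds (which matches the reported duration of 1m 18s) and that native_tokens_completion includes the reasoning tokens. Neither assumption is confirmed by the page.

```python
# Values copied from the metadata fields above.
generation_time_ms = 77613          # assumed to be milliseconds
native_tokens_completion = 6778     # assumed to include reasoning tokens
native_tokens_reasoning = 6105

# Tokens left over for the visible response once reasoning is subtracted.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Share of completion tokens spent on reasoning.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Average completion throughput over the whole generation.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(visible_tokens)                 # 673
print(f"{reasoning_share:.2%}")       # 90.07%
print(f"{tokens_per_second:.1f}")     # 87.3
```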
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 559
58.52% | Dialogue to Total Word Ratio | Ratio: 66.73%, Deviation: 3.27% | neededClean: false, wordsTotal: 559, wordsDialogue: 373
29.2584% | Overall | |
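The reported ratio and deviation follow directly from the word counts: 373 dialogue words out of 559 total, measured against the 70% target in the scenario. The sketch below reproduces that arithmetic and shows one simple way to count dialogue words. The quote-matching rule and whitespace tokenization are assumptions, since the evaluator's actual implementation is not shown, and the function names are hypothetical. The mapping from a 3.27% deviation to the 58.52% score is not stated on the page, so it is not reproduced here.

```python
import re

def count_words(text: str) -> int:
    # Assumption: words are whitespace-separated tokens.
    return len(text.split())

def dialogue_ratio(text: str) -> tuple[int, int, float]:
    # Assumption: dialogue is whatever sits between straight double quotes.
    quoted = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(count_words(q) for q in quoted)
    words_total = count_words(text)
    ratio = words_dialogue / words_total if words_total else 0.0
    return words_dialogue, words_total, ratio

def deviation_from_target(ratio: float, target: float = 0.70) -> float:
    # Absolute distance from the 70% dialogue target in the scenario.
    return abs(ratio - target)

# Check against the counts reported in the evaluation details.
reported_ratio = 373 / 559
print(f"{reported_ratio:.2%}")                         # 66.73%
print(f"{deviation_from_target(reported_ratio):.2%}")  # 3.27%
```

For example, `dialogue_ratio('He sat. "Go now," she said.')` counts 2 dialogue words out of 6 total under these assumptions.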