NC Bench
Run: 2026-03-12-01KKHZRB225MW7K2N9YTZ1Z5XK
Run Details
Model: Grok 4.20 (Beta, Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 31.7s
Cost: $0.043182
Provider: xAI
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 4.10%
Overall: 2.0492%
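The overall score appears consistent with an unweighted mean of the two evaluator scores. The short check below is only a sketch under that assumption; the page does not state how the overall figure is aggregated, and the dialogue score is shown rounded to 4.10%, so the result agrees only to within rounding.

```python
# Evaluator scores as displayed in the summary (percent, rounded on the page).
scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 4.10,
}

# ASSUMPTION: the overall score is the plain average of the evaluator scores.
overall = sum(scores.values()) / len(scores)

reported_overall = 2.0492  # "Overall" value shown on the page
print(f"computed: {overall:.2f}%  reported: {reported_overall}%")
```

An unrounded dialogue score of about 4.0984% would reproduce 2.0492% exactly under this assumption.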
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill crouched low in the dense foliage bordering the lively children's party, their eyes fixed on the mountain of brightly wrapped presents stacked on a table near the bounce house. The air was thick with the scent of cake and the sound of joyful screams from the children playing. "Okay Bill, run the plan by me one more time. How do we break into this kids party without causing a scene?" Molly said while checking her watch nervously. Bill scratched his head and whispered back, his voice barely audible over the party noise. "We disguise as the pony ride guys. No one suspects the pony guys. We take the presents while the kids are riding the pony." Molly turned to him with a look of utter disbelief, her eyebrows raised high. "But we don't have a pony, you idiot. That's a major flaw in your plan." He gave a sheepish grin and shrugged his shoulders in the tight space. "Then we say the pony got sick. We'll use a stick horse or something. Come on, it's brilliant." The bush rustled slightly as she shifted her weight, careful not to alert anyone. "Brilliant? It's dumb. Let's think of something better. How about we crash the party as uninvited guests but say we're from the clown college." Bill shuddered at the thought, recalling past failures. "Clown college? I hate clowns. They freak me out. Remember that time in the circus heist? Disaster." She sighed heavily, pushing a branch out of her face. "That was different. No clowns this time. I'll be the clown and you can be the strong man or something. We lift the present table together and carry it out like it's part of the show." The sound of a child blowing out candles could be heard in the distance as Bill pondered. "The table is too heavy for that. We need to open the gifts there and take only the good stuff. Like the electronics, not the clothes." Molly stared at him in confusion, the party lights reflecting in her eyes. "Electronics at a children's party? It's mostly toys, Bill. 
We're stealing Barbies and Lego sets. What are we going to do with those?" He rubbed his chin thoughtfully, ignoring the absurdity. "Sell them on the black market. There's a huge demand for hot toys this time of year. Trust me." The wind picked up, carrying laughter their way as she crossed her arms. "Hot toys? This is not what I signed up for when I became a thief. I wanted banks and museums, not birthday parties." Bill smiled reassuringly, patting her on the back. "But banks have guns and alarms. This has cake and juice boxes. Much safer." She considered this for a moment, calculating the risks in her head. "Fine, but if we do this, we split the haul 60-40. My favor because it's your stupid idea." He chuckled softly at her demand. "60-40? That's robbery! Wait, it is robbery. Okay, 55-45." They shook hands awkwardly in the bush as the party went on. "Deal. Now, timing is everything. We go in at three o'clock when they sing happy birthday. That's the peak distraction." "Three o'clock it is. And no taking any presents with my name on them. I mean, if there are any." She smiled despite the situation. "There won't be. But if there is a big red truck, it's mine." The laughter from the party continued as they finalized their absurd plan, preparing for the most ridiculous heist of their careers.
created_at
"2026-03-12T21:38:00.39Z"
3/12/2026, 9:38:00 PM
model
"x-ai/grok-4.20-beta-20260309"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
29594
moderation_latency
(null)
generation_time
31497
tokens_prompt
77
tokens_completion
853
native_tokens_prompt
184
native_tokens_completion
7174
native_tokens_completion_images
0
native_tokens_reasoning
6432
native_tokens_cached
128
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0431816
router
(null)
provider_responses
0
endpoint_id
"33f49f24-a0de-4dcf-872d-ca514e83fbae"
id
"0475f169-06f2-cc1e-6a28-97a462707ea8"
is_byok
false
latency
439
model_permaslug
"x-ai/grok-4.20-beta-20260309"
provider_name
"xAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773351480-NdWYTcQBfzx3pIPf8VJG"
upstream_id
"0475f169-06f2-cc1e-6a28-97a462707ea8"
total_cost
0.0431816
cache_discount
0.0002304
upstream_inference_cost
0
provider_name
"xAI"
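The token counts and cost fields above can be cross-checked with a little arithmetic. The per-token rates in this sketch are assumptions, not values shown on this page: they were chosen because they reproduce the reported `total_cost` and `cache_discount` from the native token counts. The sketch also assumes that reasoning tokens are already included in `native_tokens_completion`.

```python
from decimal import Decimal

# ASSUMED rates in USD per token; they do not appear in the run metadata.
PROMPT_RATE = Decimal("2") / 1_000_000        # assumed $2.00 per million prompt tokens
COMPLETION_RATE = Decimal("6") / 1_000_000    # assumed $6.00 per million completion tokens
CACHED_RATE = Decimal("0.2") / 1_000_000      # assumed $0.20 per million cached prompt tokens

# Values copied from the metadata above.
native_tokens_prompt = 184
native_tokens_completion = 7174   # assumed to include the 6432 reasoning tokens
native_tokens_cached = 128

# Cached prompt tokens are billed at the cheaper rate; the saving is the discount.
cache_discount = native_tokens_cached * (PROMPT_RATE - CACHED_RATE)

total_cost = (
    native_tokens_prompt * PROMPT_RATE
    + native_tokens_completion * COMPLETION_RATE
    - cache_discount
)

print("cache_discount:", cache_discount)
print("total_cost:", total_cost)
```

Under these assumed rates the computed values equal the reported `cache_discount` of 0.0002304 and `total_cost` of 0.0431816.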
Evaluation details
Matches word count: 0.00%
Details: n/a
Meta data: neededClean = false, words = 588
Dialogue to Total Word Ratio: 4.10%
Details: Ratio: 57.99%, Deviation: 7.99%
Meta data: neededClean = false, wordsTotal = 588, wordsDialogue = 341
Overall: 2.0492%
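The ratio and deviation reported above follow directly from the two word counts. The sketch below recomputes them and also shows one plausible way to count dialogue words, by treating everything inside straight double quotes as dialogue. The helper names and the quote-matching rule are hypothetical; the benchmark's actual tokenization and scoring formula are not shown on this page.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (hypothetical rule)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Small illustration of the counting helpers.
sample = 'He said, "Hello there friend." Then left.'
print(dialogue_words(sample), "of", count_words(sample), "words are dialogue")

# Counts reported for this run: wordsDialogue = 341, wordsTotal = 588.
ratio, deviation = ratio_and_deviation(341, 588)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

With 588 words against a requested 500, the response also overshoots the length target by 88 words, which matches the 0.00% word-count score.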