NC Bench
Run: 2026-03-12-01KKHYZTNBTFD8W4W232D3Y0W0
Run Details
Model: Grok 4.20 (Beta, Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 47.3s
Cost: $0.062778
Provider: xAI
Evaluation summary
| Score | Evaluator |
| --- | --- |
| 0.00% | Matches word count |
| 0.36% | Dialogue to Total Word Ratio |
| 0.1794% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill sat in their hidden van, staring at the children's party across the street. The place was filled with colorful decorations and excited kids running around with endless energy. Molly had a notebook full of notes and a determined expression on her face while Bill looked nervous and kept biting his nails. "I still think this is a terrible idea," Bill said as he peered out the window at the bouncing balloons and happy faces. "Stealing presents from a children's party could scar them for life and give us bad karma for future jobs." Molly shook her head in disagreement and checked her watch for the timing. "It's not like we're taking their candy. These presents are expensive and the parents will buy more anyway. We need this score Bill to pay off our debts from the last failed attempt." "But how do we get in without being seen by all those people?" Bill asked worriedly while shifting in his seat uncomfortably. "We take out the clown they hired and you take his place. You're about the same size," Molly replied confidently pointing to the costume in the back of the van. "Take his place? I can't do clown stuff. What if they want me to juggle or tell jokes to the birthday boy?" Bill said in panic as sweat beaded on his forehead. "You just stall them with some silly dance. Meanwhile I'll create a distraction by releasing the balloons inside from a hidden spot," Molly explained in detail as she reviewed her notes carefully. "Releasing the balloons? That might work. Then what do we do next?" Bill inquired looking slightly more convinced. "While they're all looking up you sweep the presents into the bag and head out the back door. I'll be waiting with the engine running ready for a quick getaway," she said with a smile. "What if a kid tries to stop me or starts screaming?" Bill questioned with fear in his voice. "Tell them it's part of the magic show and give them a high five or something to keep them happy," Molly suggested trying to ease his concerns. 
"This is going to go wrong I can feel it in my bones," Bill groaned holding his head in his hands. "Come on it will be funny. Imagine the newspaper headline Clown steals birthday joy from innocent tots," Molly laughed picturing the scene. "You have a twisted sense of humor that scares me sometimes," Bill replied with a half smile. "Perhaps but we'll be rich in stolen toys and gadgets we can fence easily. Now let's review the timing again to make sure we don't mess up. Cake is at three so we move at two fifty exactly," Molly stated firmly. "Okay but I get first pick of the gifts for my nephew," Bill negotiated trying to find some upside. "Only if you do the job without complaining anymore about the ethics," Molly countered sharply. "Deal but if we get caught I'm saying you forced me into this crazy scheme," Bill warned with a serious look. "No one will believe that you big softie with a clown nose," Molly teased poking his arm playfully. They gathered their gear and prepared to crash the party in style, hoping their comedic heist would not end in disaster or tears from disappointed children.
created_at: "2026-03-12T21:24:37.168Z" (3/12/2026, 9:24:37 PM)
model: "x-ai/grok-4.20-beta-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 45384
moderation_latency: (null)
generation_time: 47032
tokens_prompt: 77
tokens_completion: 808
native_tokens_prompt: 184
native_tokens_completion: 10440
native_tokens_completion_images: 0
native_tokens_reasoning: 9810
native_tokens_cached: 128
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0627776
router: (null)
provider_responses:
  0:
    endpoint_id: "33f49f24-a0de-4dcf-872d-ca514e83fbae"
    id: "95952736-8091-b03c-faf2-87817330bd9b"
    is_byok: false
    latency: 315
    model_permaslug: "x-ai/grok-4.20-beta-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773350677-AxOWxGFFEdPTp19O4pLr"
upstream_id: "95952736-8091-b03c-faf2-87817330bd9b"
total_cost: 0.0627776
cache_discount: 0.0002304
upstream_inference_cost: 0
provider_name: "xAI"
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 0.00% | Matches word count | n/a | neededClean: false, words: 556 |
| 0.36% | Dialogue to Total Word Ratio | Ratio: 60.61%, Deviation: 10.61% | neededClean: false, wordsTotal: 556, wordsDialogue: 337 |
| 0.1794% | Overall | | |
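The reported ratio and deviation can be reproduced from the word counts above. The following is a minimal sketch, not the benchmark's actual evaluator: the function names are hypothetical, and it assumes that words are whitespace-separated tokens, that dialogue is any text between straight double quotes, and that deviation is the absolute difference from the 50% target. How the deviation is mapped to the 0.36% score is not shown on this page, so that step is left out.

```python
import re


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (words_dialogue, words_total) for a passage.

    Assumption: a word is a whitespace-separated token, and dialogue is
    whatever sits between a pair of straight double quotes.
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    return words_dialogue, words_total


def ratio_and_deviation(
    words_dialogue: int, words_total: int, target: float = 0.5
) -> tuple[float, float]:
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    if words_total == 0:
        return 0.0, target
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts taken from the evaluation details of this run.
ratio, deviation = ratio_and_deviation(337, 556)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 60.61%, Deviation: 10.61%"
```

With 337 dialogue words out of 556, the response overshoots both targets: it is 56 words over the requested 500 and about ten percentage points over the requested dialogue share.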