NC Bench
Run: 2026-05-01-01KQHRRGFMCDK45NCDBBGTE2DX
Run Details
Model: Grok 4.20 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 1m 56s
Cost: $0.026680
Provider: xAI
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 51.28%
Overall: 25.6387%
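The Overall figure is consistent with an unweighted mean of the two evaluator scores. The sketch below checks that reading. The averaging rule and the name `overall_score` are assumptions, not something the page states, and the inputs are the rounded percentages displayed in the summary.

```python
# Evaluator scores as displayed in the summary, in percent.
scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 51.28,
}


def overall_score(evaluator_scores):
    """Unweighted mean of evaluator scores (assumed aggregation rule)."""
    return sum(evaluator_scores.values()) / len(evaluator_scores)


overall = overall_score(scores)
print(f"Overall: {overall:.2f}%")
```

The result differs from the reported 25.6387% only in the third decimal place, which is what rounding the ratio score to 51.28% would produce.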
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill crouched low behind the neatly trimmed hedges that surrounded the lively children's party venue. Bright balloons swayed in the breeze while joyful shouts from excited kids filled the air around the inflatable bouncy castle and game area. The stack of presents was their prize, a colorful heap that promised a decent score for the pair of experienced burglars who had fallen on hard times. They had spent the last hour observing the patterns of the party and now it was time to finalize their strategy for this most unusual heist in their long career of comedic capers. "We strike when the cake is served," Molly said. Bill frowned deeply. "Cake? Why then of all times?" "It is when everyone is sitting down and paying attention to the birthday child. The table will be free for us to clean out," Molly explained. "I still think this is crossing a line. These are children's gifts we are talking about stealing here," Bill protested. "Don't be such a softie. We will leave some dollar store replacements so they are not completely disappointed. It is all part of the plan," Molly responded with a laugh. Bill shook his head but knew better than to argue further with his partner in crime. The suburban setting made him uneasy but the lack of professional security was a huge plus. They had their disguises ready, a pair of generic party clown outfits that would allow them to blend in with any hired entertainers. The van was parked just around the corner for a quick getaway. "Fine but I am not wearing the red nose. It makes me sneeze," Bill added. "Deal. Now get ready because it looks like they are bringing out the cake any minute now," Molly replied. "What if we get caught by a parent?" Bill asked nervously. "Then we say we are with the entertainment company and make a run for it," Molly said calmly. "This will be the funniest heist story ever told," Molly said. 
The two thieves checked their bags one last time to ensure they had all the tools of the trade including the replacement gifts which were poorly wrapped boxes with nothing inside but crumpled newspaper to mimic the weight of real items. The sound of singing could be heard starting up which meant their moment had arrived at last after all the careful planning. With a nod to each other they emerged from their hiding spot and casually walked toward the party as if they belonged there all along. The comedic heist was underway and both Molly and Bill felt the familiar rush of adrenaline mixed with the silliness of their current endeavor. If successful they would have a van full of presents to fence or perhaps even keep some for themselves as a joke. The party guests paid them no mind as they approached the table laden with the coveted items. It seemed their plan was working perfectly so far as the children were all distracted by the large chocolate cake being carried out with seven lit candles flickering in the wind. This was the life they had chosen and today it involved stealing from a children's party in the most humorous way possible.
created_at: "2026-05-01T12:38:33.209Z" (5/1/2026, 12:38:33 PM)
model: "x-ai/grok-4.20-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 112516
moderation_latency: (null)
generation_time: 116263
tokens_prompt: 77
tokens_completion: 3410
native_tokens_prompt: 184
native_tokens_completion: 10607
native_tokens_completion_images: (null)
native_tokens_reasoning: 10002
native_tokens_cached: 64
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0266803
router: (null)
provider_responses:
  0:
    endpoint_id: "3ecee37f-b217-4093-87fb-aaf0afe307af"
    id: "7b464baf-84ce-2946-9f2b-46bae499e7ad"
    is_byok: false
    latency: 100
    model_permaslug: "x-ai/grok-4.20-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777639113-cRpjf58WWZ3oFX8RRGqm"
session_id: (null)
api_type: "completions"
id: "gen-1777639113-ka3RvDCylqGYV68ZmJp2"
upstream_id: "7b464baf-84ce-2946-9f2b-46bae499e7ad"
total_cost: 0.0266803
cache_discount: 0.0000672
upstream_inference_cost: 0
provider_name: "xAI"
response_cache_source_id: (null)
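A few derived figures help when reading the token counts above. The sketch below assumes that generation_time is in milliseconds (116263 ms matches the reported duration of 1m 56s) and that native_tokens_reasoning is counted inside native_tokens_completion. Neither assumption is stated on the page, and the variable names are illustrative.

```python
# Values copied from the run metadata.
generation_time_ms = 116263        # assumed to be milliseconds
native_tokens_completion = 10607
native_tokens_reasoning = 10002    # assumed to be a subset of completion tokens

# Completion throughput over the whole generation window.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Share of completion tokens spent on reasoning rather than visible text.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Tokens left over for the visible response.
visible_tokens = native_tokens_completion - native_tokens_reasoning

print(f"throughput: {tokens_per_second:.1f} tokens/s")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"visible tokens: {visible_tokens}")
```

Under these assumptions, roughly 94% of the completion tokens went to reasoning, leaving 605 tokens for the 539-word scene.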
Evaluation details
Evaluator: Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean false, words 539
Evaluator: Dialogue to Total Word Ratio
Result: 51.28%
Details: Ratio: 26.35%, Deviation: 3.65%
Meta Data: neededClean false, wordsTotal 539, wordsDialogue 142
Overall: 25.6387%
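The reported ratio and deviation can be reproduced from the word counts in the metadata. The sketch below is a minimal reconstruction. It assumes that words are whitespace-separated tokens, that dialogue means text inside straight double quotes, and that the target values (500 words, 30% dialogue) come from the scenario. The function names are hypothetical, and the benchmark's actual counting rules are not shown on the page.

```python
import re

TARGET_WORDS = 500    # from the scenario: "Write 500 words"
TARGET_RATIO = 0.30   # from the scenario: "30% dialogue"


def count_words(text):
    """Count whitespace-separated tokens (assumed definition of a word)."""
    return len(text.split())


def count_dialogue_words(text):
    """Count words inside straight double quotes (assumed dialogue rule)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in spans)


def dialogue_metrics(words_dialogue, words_total, target_ratio=TARGET_RATIO):
    """Return (ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(target_ratio - ratio)


# Counts reported in the evaluation metadata.
ratio, deviation = dialogue_metrics(words_dialogue=142, words_total=539)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Relative overshoot of the 500-word target.
overshoot = (539 - TARGET_WORDS) / TARGET_WORDS
print(f"Word count overshoot: {overshoot:.1%}")

# Small usage example on one line taken from the response.
sample = '"We strike when the cake is served," Molly said.'
print(count_dialogue_words(sample), "of", count_words(sample), "words are dialogue")
```

The page does not show how a 3.65% deviation maps to a 51.28% score, or how a 539-word response maps to 0.00%, so the sketch stops at the deviation and overshoot rather than guessing the scoring curves.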