NC Bench
Run: 2026-05-01-01KQHSPD41P8ADQ5YVW4RKWST2
Run Details
Model: Grok 4.20 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 2m 11s
Cost: $0.025621
Provider: xAI
Evaluation summary
Score | Evaluator
2.01% | Matches word count
0.13% | Dialogue to Total Word Ratio
1.0711% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Bill leaned forward in the passenger seat, peering through the binoculars at the colorful scene. "Molly, you see that pile of presents over by the punch bowl? There must be at least a hundred gifts there. This is going to be our easiest score yet," he said. Molly shook her head in disbelief. "Easy? Bill, nothing about crashing a children's party is easy. What if they have a bouncy castle? Those things are death traps for people like us trying to make a quick exit." Bill chuckled softly. "We'll use the bouncy castle to our advantage. Jump in, cause a distraction by pretending to be part of the fun, then load up the van while the kids are jumping around. It's brilliant." She sighed, checking her watch. "Brilliant like your last idea that got us banned from three supermarkets? I don't know. Stealing toys seems beneath us. We're master thieves, not the present police." "Beneath us? Think of the money, Molly. We can flip those video games and dolls for top dollar. No one will know it was us. We'll be in and out before the candles are blown out." Molly looked at the map they had drawn. "Okay, say we do it your way. How do we get past the parents? They're all standing around like they're guarding the crown jewels." Bill pointed at their costumes in the back. "Simple. We show up in full costume. You as the princess, me as the court jester. I juggle, you sing happy birthday in a funny voice, and during the applause we start cleaning up the gifts as part of the act." Molly groaned. "Sing? You know I can't hold a tune. The last time I tried the dog howled." "Then you do the juggling. I'll sing. Happy birthday to you, now hand over the gifts too. See, it's all in the delivery." "This is insane. What if little Johnny starts crying because we took his dinosaur set?" Bill smiled confidently. "We leave a note. Dear Johnny, your presents are on an adventure. They'll be back as hand me downs in a few years. Kids love adventures." Molly crossed her arms. "Notes? That's your solution? 
I think we should abort this whole thing and go for the bank like normal people." "Normal people don't steal from banks anymore, Molly. It's all digital now. This is hands on. It's fun. Remember fun?" "Fun is not being chased by a mob of sugar high kids and their caffeine addicted parents." Bill put his hand on her shoulder. "Come on, partner. We've faced worse. Remember the opera house heist with the singing security guard? This is nothing." "Fine, you win. But I get to choose the getaway music. No more polka tunes." "Deal. Now let's practice. Hey kids, who's ready for magic? I am, Bill, but don't make the presents vanish for real." "See, we got the routine down. This is going to be legendary." The two thieves shared a laugh as they prepared their costumes in the dim light of the van, the absurdity of their plan only making them more determined to pull off the comedic heist.
created_at: "2026-05-01T12:54:52.806Z" (5/1/2026, 12:54:52 PM)
model: "x-ai/grok-4.20-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 125257
moderation_latency: (null)
generation_time: 130663
tokens_prompt: 77
tokens_completion: 3317
native_tokens_prompt: 184
native_tokens_completion: 10210
native_tokens_completion_images: (null)
native_tokens_reasoning: 9560
native_tokens_cached: 128
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0256206
router: (null)
provider_responses:
  0:
    endpoint_id: "3ecee37f-b217-4093-87fb-aaf0afe307af"
    id: "d0dcbc82-6696-398f-0297-90cd11b116c3"
    is_byok: false
    latency: 114
    model_permaslug: "x-ai/grok-4.20-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777640092-QUdpsQI21fjNyBg9MuOE"
session_id: (null)
api_type: "completions"
id: "gen-1777640092-dgmWEheOjCZF0XVl3rc9"
upstream_id: "d0dcbc82-6696-398f-0297-90cd11b116c3"
total_cost: 0.0256206
cache_discount: 0.0001344
upstream_inference_cost: 0
provider_name: "xAI"
response_cache_source_id: (null)
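
The token and timing fields above can be cross-checked with a little arithmetic. The sketch below is illustrative only and rests on two assumptions that the page does not state: that native_tokens_reasoning is counted inside native_tokens_completion, and that generation_time is in milliseconds (which would agree with the 2m 11s Duration shown in Run Details). The variable names are hypothetical.

```python
# Values copied from the run metadata above.
meta = {
    "native_tokens_completion": 10210,
    "native_tokens_reasoning": 9560,
    "generation_time": 130663,  # assumed to be milliseconds
}

# Assumption: reasoning tokens are a subset of the completion tokens,
# so the remainder approximates the tokens spent on the visible answer.
visible_tokens = meta["native_tokens_completion"] - meta["native_tokens_reasoning"]
reasoning_share = 100.0 * meta["native_tokens_reasoning"] / meta["native_tokens_completion"]

# Convert the assumed-millisecond duration to whole minutes and seconds.
minutes, seconds = divmod(round(meta["generation_time"] / 1000), 60)

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1f}%")
print(f"duration: {minutes}m {seconds}s")
```

Under those assumptions, most of the completion budget went to reasoning rather than to the roughly 500-word scene itself.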
Evaluation details
Result | Evaluator | Details | Meta Data
2.01% | Matches word count | n/a | neededClean: false; words: 525
0.13% | Dialogue to Total Word Ratio | Ratio: 81.52%, Deviation: 11.52% | neededClean: false; wordsTotal: 525; wordsDialogue: 428
1.0711% | Overall | |
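
The ratio and deviation reported above can be reproduced from the word counts. This is a minimal sketch, not the benchmark's actual evaluator: the function name is hypothetical, the 70% target is taken from the scenario description, and the formula that maps a deviation to a percentage score is not shown on this page, so it is not reproduced here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of total words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


TARGET_RATIO = 70.0  # percent, from the scenario "70% dialogue"

# Counts taken from the evaluation metadata.
ratio = dialogue_ratio(words_dialogue=428, words_total=525)
deviation = abs(ratio - TARGET_RATIO)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
```

The displayed Overall value is consistent with a plain mean of the two evaluator scores ((2.01 + 0.13) / 2 ≈ 1.07), though the page does not say how the overall score is aggregated.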