NC Bench
Run: 2026-05-01-01KQHS64T0F8QABR7GN8G4YAYC
Run Details
Model:
Grok 4.20 (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 30% dialogue
Duration:
2m 14s
Cost:
$0.024315
Provider:
xAI
Evaluation summary
Score
Evaluator
0.00%
Matches word count
60.49%
Dialogue to Total Word Ratio
30.2469%
Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill crouched behind a stack of stolen milk crates in their cluttered warehouse hideout, which was filled with various tools of the trade from previous botched jobs. Deep in the heart of the city away from prying eyes, the place smelled of dust and old pizza, a perfect lair for plotting their next move in a series of ever more inventive schemes. The dim sunlight filtered through dirty windows as they reviewed their outrageous plan to infiltrate a children's birthday party and make off with every single present. The idea had come to Molly in a flash of inspiration mixed with desperation, and now she was pitching it hard to her reluctant partner. Bill, ever the worrier, kept interrupting with concerns that ranged from practical to absurd. They both knew that this heist would either make them legends in the world of petty crime or provide endless material for their future comedy routine if they ended up in jail. "Listen, Bill, this children's party is the perfect target. Presents stacked to the ceiling. We go in, we clean up, we get out." Bill rubbed his chin thoughtfully. "I don't know Molly. Stealing from kids? That's pretty low. How do we even blend in with all those rugrats running around?" "Easy. We wear those ridiculous clown costumes we used last Halloween. You handle the juggling to distract them while I slip the gifts into our oversized bags." "We time it perfectly during the cake singing part. Everyone's focused on the birthday boy. That's when we strike fast and silent like ninjas in polka dots." Bill sighed deeply but nodded slowly. "Alright, you've convinced me. But if a kid starts crying because we took his new toy truck, I'm out of there." "You worry too much. This will be our masterpiece. The great present heist. We'll be laughing all the way to the pawn shop." After finalizing the last details of their entry and exit routes, the pair shook hands on the deal. 
The warehouse echoed with their shared laughter as they imagined the chaotic scene that would unfold. Children chasing after escaped balloons, parents chatting obliviously, and two bumbling thieves in full clown regalia making off with armloads of brightly wrapped packages. It was the kind of comedic heist that could only end in one of two ways: success beyond their wildest dreams or a hilarious disaster involving sprinkles and party hats. Either way, they were committed now. Bill began gathering the necessary props while Molly fine-tuned the timing down to the minute. The party was in two days, and they had a lot of preparation to do if they wanted to pull off this caper without a hitch. As they worked, the absurdity of the situation kept bringing smiles to their faces, reminding them why they stuck together as partners in crime because no one else would appreciate the humor in robbing a bunch of six year olds of their birthday joy. The excitement was palpable as they anticipated the looks on the children's faces when they discovered the empty table where the presents had been though they hoped it would spark more giggles than real tears in the end.
created_at
"2026-05-01T12:46:00.005Z"
5/1/2026, 12:46:00 PM
model
"x-ai/grok-4.20-20260309"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
127388
moderation_latency
(null)
generation_time
133798
tokens_prompt
77
tokens_completion
3451
native_tokens_prompt
184
native_tokens_completion
9661
native_tokens_completion_images
(null)
native_tokens_reasoning
9048
native_tokens_cached
64
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0243153
router
(null)
provider_responses
0
endpoint_id
"3ecee37f-b217-4093-87fb-aaf0afe307af"
id
"fbd14d2b-0c77-9811-b2b8-e568c522d5ac"
is_byok
false
latency
335
model_permaslug
"x-ai/grok-4.20-20260309"
provider_name
"xAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777639560-bKZZL0W2QoC1dVpdpz8J"
session_id
(null)
api_type
"completions"
id
"gen-1777639560-uB4MKkdwhivPTrVIyLDU"
upstream_id
"fbd14d2b-0c77-9811-b2b8-e568c522d5ac"
total_cost
0.0243153
cache_discount
0.0000672
upstream_inference_cost
0
provider_name
"xAI"
response_cache_source_id
(null)
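The token and timing fields above can be cross-checked with a little arithmetic. The sketch below is illustrative only and rests on two assumptions that the page does not state: that native_tokens_reasoning is included in native_tokens_completion, and that generation_time is reported in milliseconds. The variable names mirror the metadata keys; nothing here is part of NC Bench itself.

```python
# Values copied from the run metadata above.
native_tokens_completion = 9661
native_tokens_reasoning = 9048
generation_time_ms = 133798

# Assumption: reasoning tokens are counted inside the native completion total,
# so the remainder is the visible (non-reasoning) output.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Assumption: generation_time is in milliseconds.
minutes, seconds = divmod(round(generation_time_ms / 1000), 60)

print(f"visible completion tokens: {visible_tokens}")
print(f"reasoning share of completion: {reasoning_share:.1%}")
print(f"generation time: {minutes}m {seconds}s")
```

Under those assumptions, 613 native tokens went to the visible response, roughly 94% of the completion was reasoning, and the generation time rounds to the 2m 14s shown under Run Details.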
Evaluation details
Result
Evaluator
Details
Meta Data
0.00%
Matches word count
n/a
neededClean
false
words
532
60.49%
Dialogue to Total Word Ratio
Ratio: 26.83%, Deviation: 3.17%
neededClean
false
wordsTotal
533
wordsDialogue
143
30.2469%
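The ratio and deviation reported above can be reproduced from the wordsDialogue and wordsTotal values. The following is a minimal sketch, not the evaluator's actual code: the function names are hypothetical, the 30% target is taken from the scenario, and the unweighted mean used for the overall score is an assumption that happens to be consistent with the numbers on this page. How the evaluator maps a 3.17-point deviation to a 60.49% score is not shown here, so the sketch does not attempt it.

```python
TARGET_RATIO = 0.30  # from the scenario: 30% dialogue


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that are dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation_points(ratio: float, target: float = TARGET_RATIO) -> float:
    """Absolute distance from the target, in percentage points."""
    return abs(ratio - target) * 100


def overall_score(scores: list[float]) -> float:
    """Assumption: the overall score is the unweighted mean of evaluator scores."""
    return sum(scores) / len(scores)


ratio = dialogue_ratio(143, 533)      # wordsDialogue, wordsTotal
deviation = deviation_points(ratio)
overall = overall_score([0.0, 60.49])  # the two evaluator scores as displayed

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2f}%")
print(f"Overall (mean of displayed scores): {overall:.3f}%")
```

The first line of output matches the Details column (26.83% and 3.17%). The mean of the displayed scores is 30.245%, slightly below the 30.2469% on the page, which suggests the page averages an unrounded dialogue score of about 60.4938%.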