NC Bench
Run: 2026-03-12-01KKHYWZ7VNA0H87J5B3BMV1V6
Run Details
Model:
Grok 4.20 (Beta)
Test:
Dialogue tags
Scenario:
Write 500 words with 50% dialogue
Duration:
5.9s
Cost:
$0.004358
Provider:
xAI
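The Duration above can be related to the token counts in the response metadata. The following is a minimal sketch, assuming that `generation_time` (5821) is in milliseconds and that `native_tokens_completion` (704) counts the provider's own output tokens; the page states neither unit explicitly, and the helper name `tokens_per_second` is hypothetical.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Approximate output throughput from a token count and a duration in ms.

    Assumption: generation_time_ms is measured in milliseconds.
    """
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000)


# Values copied from this run's metadata (native_tokens_completion, generation_time).
rate = tokens_per_second(704, 5821)
print(f"{rate:.1f} tokens/s")  # 120.9 tokens/s
```

Under the same assumption, 5821 ms is roughly consistent with the reported 5.9s duration.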
Evaluation summary
Score      Evaluator
0.00%      Matches word count
0.00%      Dialogue to Total Word Ratio
0.0000%    Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her oversized clown wig and glared at Bill across the sticky kitchen table. “This is the dumbest idea you’ve ever had, and you once suggested we rob a police station with kazoos.” Bill grinned, cheeks puffed around a stolen cupcake. “Hear me out, Mol. It’s a genius-level heist. Ten-year-old Timmy’s birthday party at the McMansion on Maple. Whole room full of presents. We’re talking Game Boys, Barbies, whatever tiny monsters want these days. We slip in, we slip out, we’re rich in Polly Pocket futures.” Molly pinched the bridge of her nose. “We’re grown adults, Bill. We’re going to crash a children’s party dressed as what exactly?” “World’s Saddest Magician and his Lovely Assistant Brenda,” he said proudly, holding up a sequined vest two sizes too small. “I’ve got the disappearing dove trick down. Well, mostly. The dove is actually a very confused pigeon named Kevin.” She stared at him. “We are not using Kevin. Last time he pooped on a mark and we had to abort the jewelry store job.” “Fine, no Kevin. But we need a distraction. I say you do that thing where you pretend to cry because your balloon animal burst. Kids love trauma. While they’re all staring at your fake tears I’ll be in the gift pile like a raccoon on payday.” Molly snorted. “And what happens when little Timmy’s dad, who’s six-foot-four and built like a fridge, asks why the magician is stuffing an Xbox into his pants?” Bill waved a dismissive hand. “I’ll tell him it’s part of the act. ‘Watch as the magical Xbox vanishes into the Pants Dimension!’ Kids eat that crap up.” She sighed. “Bill, this is low. Even for us. We stole from the Salvation Army last Christmas and I still feel guilty.” “That was different. Those coats were ugly. These presents are wrapped in shiny paper with bows. Bows, Molly. My moral compass gets very confused around bows.” Molly leaned back, arms crossed. “If we get caught I’m telling the cops you kidnapped me.” “Fair. 
I’ll tell them you’re my emotionally unstable sister. We’ll both get therapy on the state’s dime.” She rubbed her temples. “How do we even get in? You can’t just walk up to a bouncy castle wearing a fake mustache and a top hat.” “Already handled,” Bill said, slapping a glittery invitation on the table. “Found it in the trash. We’re Cousin Larry and his wife from out of town. They won’t know the difference. Kids’ parties are chaos. We’ll blend right in. Like two ugly footnotes in a sea of sugar-high footnotes.” Molly looked at the invitation, then at his ridiculous grinning face. Despite herself, a smile tugged at her lips. “If this goes south I’m stealing the cake and leaving you to explain the pigeon.” “Kevin stays in the van,” Bill promised solemnly. She stood up and adjusted her wig again. “Fine. But I’m picking the getaway music. If I have to listen to you hum the Mission Impossible theme one more time I will actually commit a crime.” Bill clapped his hands together. “That’s my girl! Operation Tiny Tears is a go. Let’s go ruin some childhoods and make some rent money.” Molly groaned. “God help us both.”
created_at
"2026-03-12T21:23:03.553Z"
3/12/2026, 9:23:03 PM
model
"x-ai/grok-4.20-beta-20260309"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
298
moderation_latency
(null)
generation_time
5821
tokens_prompt
77
tokens_completion
855
native_tokens_prompt
182
native_tokens_completion
704
native_tokens_completion_images
0
native_tokens_reasoning
0
native_tokens_cached
128
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0043576
router
(null)
provider_responses
0
endpoint_id
"33f49f24-a0de-4dcf-872d-ca514e83fbae"
id
"d6021997-2ccb-a3d8-d5b1-00cf1deecc3b"
is_byok
false
latency
259
model_permaslug
"x-ai/grok-4.20-beta-20260309"
provider_name
"xAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773350583-GfabmIfNkJccSII1YY34"
upstream_id
"d6021997-2ccb-a3d8-d5b1-00cf1deecc3b"
total_cost
0.0043576
cache_discount
0.0002304
upstream_inference_cost
0
provider_name
"xAI"
Evaluation details
Result     Evaluator                       Details                             Meta Data
0.00%      Matches word count              n/a                                 neededClean: false, words: 539
0.00%      Dialogue to Total Word Ratio    Ratio: 79.82%, Deviation: 29.82%    neededClean: false, wordsTotal: 545, wordsDialogue: 435
0.0000%    Overall
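The reported ratio and deviation follow directly from the word counts: 435 dialogue words out of 545 total gives 79.82%, which is 29.82 percentage points away from the 50% target in the prompt. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. It assumes dialogue is whatever sits between curly double quotes and that words are whitespace-separated; the benchmark's actual counting rules are not shown on this page, and the function names are hypothetical. The two evaluators report slightly different totals (539 and 545), which suggests they do not tokenize identically.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumption: dialogue is any span enclosed in curly double quotes,
    and a word is any whitespace-separated token.
    """
    total = len(text.split())
    quoted_spans = re.findall(r"“([^”]*)”", text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation_from_target(ratio: float, target: float = 0.5) -> float:
    """Absolute distance between the observed ratio and the target ratio."""
    return abs(ratio - target)


# Reproduce the figures reported for this run (wordsDialogue / wordsTotal).
reported_ratio = 435 / 545
print(f"Ratio: {reported_ratio:.2%}")                             # Ratio: 79.82%
print(f"Deviation: {deviation_from_target(reported_ratio):.2%}")  # Deviation: 29.82%
```

For example, `dialogue_ratio("“Hi there,” she said.")` counts 2 dialogue words out of 4 and returns a ratio of 0.5.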