NC Bench
Run: 2026-03-12-01KKHZKZF2G48KMD2P7F4AQ6BX
Run Details
Model: Grok 4.20 (Beta)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 3.8s
Cost: $0.003950
Provider: xAI
Evaluation summary
Matches word count: 97.63%
Dialogue to Total Word Ratio: 0.00%
Overall: 48.8138%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her oversized clown wig and peered through the chain-link fence at the bouncing castle. “This is the dumbest idea you’ve ever had, Bill, and that’s saying something after the laundromat heist.” Bill tugged at his too-tight rainbow suspenders. “It’s genius. Twenty screaming kids, one distracted mom, and a mountain of brightly wrapped presents. We’re in and out before the cake’s even cut.” “We’re stealing from children. I feel like I should be growing a mustache just so I can twirl it evilly.” “Think of it as liberation. Those kids don’t appreciate the pure chaos of new toys yet. We do. Besides, half those gifts are probably socks from Aunt Karen.” Molly snorted. “Socks are not worth a felony.” “These aren’t regular socks. These are Paw Patrol socks. Limited edition. Black market gold.” A burst of giggles erupted from the party twenty yards away. A small boy in a dinosaur costume sprinted past the fence clutching a juice box like a grenade. Bill crouched lower. “Okay, plan. I go in as Giggles the Clown. You’re my lovely assistant, Sprinkles. We do five minutes of balloon animals, then while the birthday boy’s blowing out candles we swap the gift table with an identical empty one I stashed in the van.” “That’s not a plan, that’s a fever dream with balloons.” “It’s a great plan. I spent all morning practicing my squeaky voice.” Bill demonstrated with a horrific honking noise that made a nearby squirrel flee in terror. Molly pinched the bridge of her nose. “We’re going to prison in face paint. My mother will be so proud.” “Come on, Mol. Think of the score. That new gaming console alone is worth two grand on the street. Plus whatever’s in those fancy gift bags. I heard one kid’s getting a drone.” “A drone? At seven years old? When I was seven I got a stick and was told it had character.” “Exactly. These kids are spoiled. We’re just… redistributing wealth. To us. Who need it more.” “Need it for what? Matching clown shoes?” Bill grinned. 
“For dignity, obviously. And maybe that vacation to Vegas we’ve been talking about.” Molly stared at him for a long moment. “If we get caught, I’m telling the cops you kidnapped me and forced me into the wig.” “Fair enough. So you’re in?” She sighed dramatically. “I hate you. Yes, I’m in. But if any of those kids cries, I’m giving the presents back and pinning the whole thing on you.” Bill clapped his hands together with glee, the bells on his wrists jingling. “That’s my girl. Now help me practice my clown laugh. It needs to be terrifying but also whimsical.” Molly muttered under her breath, “I should have stayed in accounting.” The two clowns began rehearsing their routine behind the fence, completely unaware that the birthday boy in the dinosaur suit was already watching them with enormous curious eyes, juice box still clutched in one sticky fist.
created_at: "2026-03-12T21:35:37.447Z" (3/12/2026, 9:35:37 PM)
model: "x-ai/grok-4.20-beta-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 321
moderation_latency: (null)
generation_time: 3681
tokens_prompt: 77
tokens_completion: 795
native_tokens_prompt: 182
native_tokens_completion: 636
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 128
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0039496
router: (null)
provider_responses:
  0:
    endpoint_id: "33f49f24-a0de-4dcf-872d-ca514e83fbae"
    id: "b05666db-e255-8bde-c4ff-2eca1cb2755f"
    is_byok: false
    latency: 284
    model_permaslug: "x-ai/grok-4.20-beta-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773351337-xCUVpDY1kN50HY0xDEvP"
upstream_id: "b05666db-e255-8bde-c4ff-2eca1cb2755f"
total_cost: 0.0039496
cache_discount: 0.0002304
upstream_inference_cost: 0
provider_name: "xAI"
Evaluation details
Matches word count
Result: 97.63%
Details: n/a
Meta Data: neededClean: false, words: 493
Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 70.10%, Deviation: 40.10%
Meta Data: neededClean: false, wordsTotal: 495, wordsDialogue: 347
Overall: 48.8138%
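The reported ratio and deviation can be checked against the wordsDialogue and wordsTotal values with a few lines of Python. This is a minimal sketch, not the benchmark's evaluator: the function names are hypothetical, the curly-quote rule for spotting dialogue is an assumption, and treating the overall score as an unweighted mean of the two evaluator scores is also an assumption. How the evaluator maps a 40.10% deviation to a 0.00% score is not stated on this page, so that step is left out.

```python
import re

# Assumption: dialogue is any text between curly double quotes
# (U+201C ... U+201D), and a "word" is a whitespace-separated token.
# The real evaluator may tokenize differently.
DIALOGUE_RE = re.compile("\u201c([^\u201d]*)\u201d")


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count tokens that fall inside curly-quoted spans."""
    return sum(count_words(span) for span in DIALOGUE_RE.findall(text))


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of all words that are dialogue, as a fraction in [0, 1]."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation(ratio: float, target: float) -> float:
    """Absolute distance between the measured and requested ratio."""
    return abs(ratio - target)


def overall(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores (assumed aggregation)."""
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Values taken from the Meta Data above; 0.30 is the 30% target in the prompt.
    ratio = dialogue_ratio(347, 495)
    print(f"Ratio: {ratio:.2%}, Deviation: {deviation(ratio, 0.30):.2%}")
    print(f"Overall (mean of displayed scores): {overall([97.63, 0.00]):.4f}%")
```

With wordsDialogue = 347 and wordsTotal = 495 this reproduces the 70.10% ratio and 40.10% deviation shown above. The mean of the two displayed scores is 48.815%, slightly above the reported 48.8138%, which would be consistent with the word-count score being rounded for display.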