NC Bench
Run: 2026-05-01-01KQHSSE0S8X50HTJB053VEFT8
Run Details
Model: Grok 4.20
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 11.6s
Cost: $0.001870
Provider: xAI
Evaluation summary
Score | Evaluator
0.08% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0438% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her oversized sunglasses and leaned across the wobbly diner table. “Okay, Bill, hear me out. We hit the birthday party at exactly three-fifteen, right when they’re doing the piñata. Chaos is our friend.”
Bill slurped his coffee and grinned like a man who’d just discovered fire. “Piñata, eh? I like it. We dress as clowns?”
“Absolutely not. Last time you wore that rainbow wig you cried because the kids said you looked ‘creepy.’ We go as disgruntled parents. I’ve got the perfect soccer-mom yoga pants and everything.”
“Disgruntled parents,” Bill repeated, tasting the words. “Brilliant. I can complain about the gluten-free cake. Nobody questions a dad yelling about carbs.”
Molly tapped her notebook. “Focus. There’s a mountain of presents under the balloon arch. Target is the big silver one with the bow the size of a satellite dish. That’s little Timmy’s new gaming console. We nab that, plus the gift cards taped to the smaller boxes. Easy two grand minimum on the resale market.”
Bill leaned in, eyes sparkling. “How do we get past the bouncy castle guard? I heard they hired Kevin from down the street. Man’s built like a refrigerator with legs.”
“We create a diversion. You ‘accidentally’ knock over the punch bowl. While every adult within fifty yards is mopping up Hawaiian fruit punch, I slip behind the gift table like a ninja. I can fit eight presents in my giant mom purse. Tested it yesterday.”
“You practiced purse theft on your own mother again, didn’t you?”
“She’s fine. She thinks it’s cute. Anyway, once I’ve cleared the table you hit the backyard and start the car. We’ll be gone before they even sing Happy Birthday.”
Bill rubbed his chin. “What if the kids notice? They’ve got eyes like hawks. One of them points and yells ‘That lady stole my Pokémon cards!’ and suddenly we’re on the six o’clock news.”
Molly waved a dismissive hand. “Kids that age have the attention span of goldfish on espresso. By the time they realize their presents are missing they’ll be hopped up on cake and juice boxes. We’ll be halfway to the fence by then.”
“What about the parents? Karen-types with their ring cameras and neighborhood watch apps?”
“That’s why we park the getaway car behind the neighbor’s hedge. Mrs. Abernathy always leaves her gate unlocked because she trusts the universe. Poor woman. We’ll be in and out in four minutes flat. Five if you stop to eat the birthday cake.”
Bill chuckled. “I do love a good sheet cake. All right, I’m in. But if this goes south I’m blaming your yoga pants. Nobody’s going to buy me as a stressed-out dad in cargo shorts.”
Molly slid a fresh napkin across the table and sketched a crude map. “Relax. We hit them where it hurts: right in the birthday feels. By the time they unwrap their disappointment we’ll be counting cash and eating stolen cupcakes. You ready to become the Bonnie and Clyde of children’s parties?”
Bill raised his coffee mug in salute. “To stealing joy, one glittery gift bag at a time.”
They clinked mugs, grinning like idiots who’d just discovered the world’s easiest crime.
created_at
"2026-05-01T12:56:32.043Z"
5/1/2026, 12:56:32 PM
model
"x-ai/grok-4.20-20260309"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
407
moderation_latency
(null)
generation_time
11245
tokens_prompt
77
tokens_completion
847
native_tokens_prompt
182
native_tokens_completion
684
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
64
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0018703
router
(null)
provider_responses
0
endpoint_id
"3ecee37f-b217-4093-87fb-aaf0afe307af"
id
"9caff6c8-f8b0-9385-90d7-26e2d2c48b92"
is_byok
false
latency
76
model_permaslug
"x-ai/grok-4.20-20260309"
provider_name
"xAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777640192-sewXWWbTLuUXGtn2uM3G"
session_id
(null)
api_type
"completions"
id
"gen-1777640192-5ifINiiPdVDg09Xx8uFt"
upstream_id
"9caff6c8-f8b0-9385-90d7-26e2d2c48b92"
total_cost
0.0018703
cache_discount
0.0000672
upstream_inference_cost
0
provider_name
"xAI"
response_cache_source_id
(null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.08% | Matches word count | n/a | neededClean: false, words: 529
0.00% | Dialogue to Total Word Ratio | Ratio: 84.46%, Deviation: 14.46% | neededClean: false, wordsTotal: 534, wordsDialogue: 451
0.0438% | Overall
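The reported ratio and deviation can be reproduced from the metadata counts (wordsDialogue 451, wordsTotal 534) and the 70% target in the scenario. The sketch below is illustrative only and is not the benchmark's evaluator. The function names, the whitespace-based word definition, the use of curly double quotes as dialogue delimiters, and the absolute-difference deviation are all assumptions chosen because they agree with the figures shown above.

```python
import re

# Target taken from the scenario text ("70% dialogue").
TARGET_RATIO = 0.70


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside curly double quotes (assumed dialogue delimiter)."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_RATIO) -> tuple[float, float]:
    """Return the dialogue share and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts copied from the evaluation metadata for this run.
ratio, deviation = ratio_and_deviation(451, 534)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 84.46%, Deviation: 14.46%"
```

Applied to the full response text, `dialogue_words` and `count_words` may not return exactly 451 and 534, since the evaluator's tokenization rules are not documented on this page.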