NC Bench
Run: 2026-05-01-01KQHS6NKP6PYH0DMN8CS2W5VR
Run Details
Model: Grok 4.20
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 12.3s
Cost: $0.001783
Provider: xAI
Evaluation summary
| Score | Evaluator | Details Link |
|---|---|---|
| 100.00% | Matches word count | (details) |
| 0.00% | Dialogue to Total Word Ratio | (details) |
| 50.0000% | Overall | |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the ivy-choked fence, peering at the bouncy castle that wobbled like a drunk jellyfish. “This is the stupidest idea you’ve ever had, Bill. And that includes the time you tried to rob a mime.” Bill adjusted his tiny party hat, the elastic cutting off the circulation to his brain. “It’s foolproof, Mol. Kids are distracted by sugar and clowns. We slip in, we nab the gift table, we’re out before the piñata even knows it’s been betrayed. Plus, I brought walkie-talkies shaped like rubber duckies. Very inconspicuous.” Molly stared at him. “You’re wearing a bow tie with rockets on it.” “Character work,” he said, tugging the ridiculous thing proudly. “I’m ‘Uncle Bill.’ You’re ‘Aunt Molly, who’s definitely not casing the joint.’ We blend.” She pinched the bridge of her nose. “We’re thirty-two and thirty-four. We look like we rob places for a living because we rob places for a living.” “Exactly. No one suspects the obvious. Now, according to my intel—” “Your ‘intel’ is a crumpled invitation you fished out of a bin.” “—the presents are stacked on a table shaped like a dinosaur. T-Rex, I think. Arms too small to stop us. We hit it at peak chaos, right after the magician pulls a rabbit out of a hat that’s clearly got a false bottom. That’s our window.” Molly sighed. “Bill, if we get caught stealing Barbie Dreamhouses from six-year-olds, I’m telling the cops you kidnapped me.” “You’d never,” he grinned. “Who else would appreciate your impeccable taste in getaway vehicles? I borrowed a van with glitter murals. It’s practically invisible at a kiddie party.” “It says ‘Gary’s Clown Academy’ in foot-high letters.” “Details.” A burst of screaming laughter drifted over the fence. Someone had apparently released a dozen balloons and at least three sugar-crazed toddlers. Molly checked her watch. “Fine. But if we get swarmed by sticky fingers and questions about where babies come from, I’m throwing you to the wolves. Or the birthday boy. 
Same thing.” Bill offered her a fist bump. “For the haul?” She bumped it reluctantly. “For the haul. But I want the big stuffed unicorn. The one with the rainbow fart cloud on its butt.” “Done,” he said. “I’ll create a diversion with the whoopee cushion I hid in the cake. You grab the goods. Classic switcheroo.” Molly stared at the bouncy castle again. A small pirate hat sailed over the fence and landed on Bill’s head like destiny had a sense of humor. He adjusted the new hat over the party one. “See? Even the universe wants us rich.” “You’re an idiot.” “Your favorite idiot.” She allowed herself half a smirk. “Unfortunately. Let’s go ruin some childhoods, Uncle Bill.” They slipped through the gate just as a fresh wave of ear-piercing cheers erupted. The dinosaur table waited, piled high with brightly wrapped temptation. Bill gave one last tug to his rocket bow tie. “Showtime,” he whispered. Molly rolled her eyes so hard it was audible. “God help us all.”
created_at: "2026-05-01T12:46:17.21Z" (5/1/2026, 12:46:17 PM)
model: "x-ai/grok-4.20-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 328
moderation_latency: (null)
generation_time: 12237
tokens_prompt: 77
tokens_completion: 837
native_tokens_prompt: 182
native_tokens_completion: 676
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 128
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0017831
router: (null)
provider_responses:
  0:
    endpoint_id: "3ecee37f-b217-4093-87fb-aaf0afe307af"
    id: "70b8eae4-9e1c-9003-b3a2-0fd152af74cc"
    is_byok: false
    latency: 63
    model_permaslug: "x-ai/grok-4.20-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777639577-OlJh6IX2OrxO2PMUxoEA"
session_id: (null)
api_type: "completions"
id: "gen-1777639577-75g0Kirm6mmvpmOmZZpn"
upstream_id: "70b8eae4-9e1c-9003-b3a2-0fd152af74cc"
total_cost: 0.0017831
cache_discount: 0.0001344
upstream_inference_cost: 0
provider_name: "xAI"
response_cache_source_id: (null)
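The metadata above is enough for a rough throughput estimate. The sketch below is not part of the benchmark; it assumes that generation_time is reported in milliseconds and that native_tokens_completion is the relevant token count, neither of which the page states explicitly. The function name is illustrative.

```python
# Values copied from the run metadata above.
GENERATION_TIME_MS = 12237        # assumption: unit is milliseconds
NATIVE_TOKENS_COMPLETION = 676    # provider-side completion token count


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Average completion throughput over the whole generation."""
    return tokens / (elapsed_ms / 1000.0)


rate = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
print(f"{rate:.1f} tokens/s")
```

Under those assumptions the run averaged roughly 55 completion tokens per second.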
Evaluation details
| Result | Evaluator | Details | Meta Data |
|---|---|---|---|
| 100.00% | Matches word count | n/a | neededClean: false; words: 500 |
| 0.00% | Dialogue to Total Word Ratio | Ratio: 64.24%, Deviation: 34.24% | neededClean: false; wordsTotal: 509; wordsDialogue: 327 |
| 50.0000% | Overall | | |
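The reported ratio and deviation can be reproduced from the word counts in the evaluation details. The sketch below is an assumption about how such an evaluator might work, not NC Bench's actual implementation: it counts words by splitting on whitespace, treats text inside curly quotes as dialogue, and measures deviation as the absolute distance from the 30% target. All function names are hypothetical.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside curly-quoted spans (assumed to mark dialogue)."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_pct: float = 30.0) -> tuple[float, float]:
    """Return (dialogue ratio in %, absolute deviation from target in %)."""
    ratio_pct = 100.0 * words_dialogue / words_total
    return ratio_pct, abs(ratio_pct - target_pct)


# Counts taken from the evaluation details of this run.
ratio, deviation = ratio_and_deviation(words_dialogue=327, words_total=509)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 64.24%, Deviation: 34.24%"

# Tiny illustration of the counting step on a made-up sample.
sample = "Molly sighed. “You’re an idiot.” “Your favorite idiot.” She smirked."
print(count_dialogue_words(sample), count_words(sample))  # prints "6 10"
```

The 50% overall score is consistent with an unweighted mean of the two evaluator scores, (100 + 0) / 2, though the page does not state the weighting. How the 34.24% deviation maps to a score of 0.00% is likewise not shown.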