NC Bench
Run: 2026-05-01-01KQHRQ5TR8E4FDSZG3XBPEVV2
Run Details
Model: Grok 4.20 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 1m 58s
Cost: $0.023585
Provider: xAI
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.00%
Overall: 0.0000%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly leaned over the table with a mischievous grin. "Okay Bill, let's talk about this children's party heist. We need to steal all the presents without anyone being the wiser. I think we go in as the magicians. I'll pull rabbits out of hats and you can do the disappearing act with the gifts while the kids clap and the parents sip punch oblivious to the whole thing." Bill rubbed his temples. "Molly, you always have the wild ideas. A children's party? Those kids will be everywhere crawling under tables and asking questions. How do we explain our presence? Hi we're here to steal your stuff but first watch this trick. It won't fly and magic? I don't know the first thing about it plus what if a parent recognizes me from the grocery store last week." "It will fly because we'll have fake business cards and everything," Molly insisted. "The parents are so busy socializing they won't pay attention to two new entertainers. We set up a small stage do a few tricks to draw the crowd and while they're watching the show the presents vanish like magic. It's poetic. The disappearing presents trick. What do you think about parking the van behind the big oak tree for quick loading?" "I think it's going to end in tears and a police chase," Bill answered. "But let's say we try. What's the getaway plan? We can't exactly run with arms full of boxes and if one of the kids wants to help with the magic what then? Can I see the rabbit mister? Then we're exposed and the whole neighborhood joins the pursuit screaming for their stolen toys." "We have a wagon that looks like a party favor cart," Molly said. "We load it up and say we're taking them for a special game called present parade. The wagon goes straight into the van which is parked close by with music playing to cover any noise. For the kids who want to help we give them a small toy to distract them. It's all about misdirection like any good heist or magic act and we can use the bouncy castle as an emergency hideout if needed." 
"Misdirection I like that," Bill said. "Okay I'm starting to see the vision. We need code phrases though. Something innocent like the cake looks delicious means we're good to go and the cake is too sweet means there's trouble and we need to leave immediately. Also I want to wear a fake beard so no one recognizes me from the neighborhood and you should have that bright wig to complete the look." "Fake beard is good but make sure it doesn't fall off during the act," Molly replied. "The last thing we need is the beard dropping into the punch bowl causing more chaos than we planned. And yes to the code phrases. This is why we make a great team. You think of the practical stuff and I come up with the big idea. Now after we steal the presents we should send a clown to the house later to apologize with a new present or something to ease the guilt because I know you'll feel bad otherwise." "Guilt is inevitable but the money will help," Bill said. "Let's outline the whole sequence. We arrive at three set up by three fifteen start the show at three thirty have all presents by four and disappear into the sunset or at least the highway. No lingering around for cake or goodie bags no matter how tempting." "No lingering is key," Molly agreed. "We strike fast and funny. The parents will tell stories about the weird magicians for years. It's the perfect comedic heist. Are you ready to practice some magic lines like abracadabra your bicycle is now in our van?" "Ready as I'll ever be," Bill said. "Abracadabra the presents are gone. See I'm a natural but if this ends with angry parents chasing us in minivans it's on you Molly." They both laughed as they continued to refine their plan.
created_at
"2026-05-01T12:37:49.532Z"
5/1/2026, 12:37:49 PM
model
"x-ai/grok-4.20-20260309"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
111297
moderation_latency
(null)
generation_time
117557
tokens_prompt
77
tokens_completion
2011
native_tokens_prompt
184
native_tokens_completion
9369
native_tokens_completion_images
(null)
native_tokens_reasoning
8594
native_tokens_cached
64
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0235853
router
(null)
provider_responses
0
endpoint_id
"3ecee37f-b217-4093-87fb-aaf0afe307af"
id
"18fe4149-9ca8-9481-bca4-f01ff1aa5848"
is_byok
false
latency
90
model_permaslug
"x-ai/grok-4.20-20260309"
provider_name
"xAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777639069-4W55QtYUaAYAuhBrZGh7"
session_id
(null)
api_type
"completions"
id
"gen-1777639069-UM94gT6G6egQLqBVGRdb"
upstream_id
"18fe4149-9ca8-9481-bca4-f01ff1aa5848"
total_cost
0.0235853
cache_discount
0.0000672
upstream_inference_cost
0
provider_name
"xAI"
response_cache_source_id
(null)
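The metadata above can be cross-checked with a little arithmetic. The sketch below is not part of the benchmark. It assumes that `native_tokens_completion` includes the reasoning tokens and that `generation_time` is in milliseconds; the page states neither. Under those assumptions, it derives the number of visible (non-reasoning) completion tokens, the share of the completion spent on reasoning, and a duration that can be compared with the "1m 58s" shown in the run details.

```python
# Values copied from the run metadata above.
native_tokens_completion = 9369
native_tokens_reasoning = 8594
generation_time_ms = 117557  # assumption: the unit is milliseconds

# Assumption: the completion count includes the reasoning tokens.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Convert the generation time to minutes and rounded seconds.
minutes, remainder_ms = divmod(generation_time_ms, 60_000)
seconds = round(remainder_ms / 1000)
duration = f"{minutes}m {seconds}s"

print(visible_tokens)            # 775
print(f"{reasoning_share:.2%}")  # 91.73%
print(duration)                  # 1m 58s
```

If the assumptions hold, roughly nine in ten completion tokens went to reasoning rather than to the 677-word scene itself.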
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean = false, words = 677
Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 94.24%, Deviation: 24.24%
Meta Data: neededClean = false, wordsTotal = 677, wordsDialogue = 638
Overall: 0.0000%
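The reported ratio and deviation follow directly from the word counts in the evaluation details. The sketch below is only an illustration, not the benchmark's evaluator: the `dialogue_ratio` helper is hypothetical, splits on whitespace, and treats text between straight double quotes as dialogue, so it may not reproduce the reported counts exactly. The 500-word and 70% targets are taken from the scenario text.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Hypothetical counter: (dialogue_words, total_words, ratio).

    Assumes whitespace-separated words and straight double quotes.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# Figures reported in the evaluation details.
words_total = 677
words_dialogue = 638
target_words = 500   # from the scenario: "Write 500 words"
target_ratio = 0.70  # from the scenario: "70% dialogue"

ratio = words_dialogue / words_total
ratio_deviation = abs(ratio - target_ratio)
word_overshoot = (words_total - target_words) / target_words

print(f"{ratio:.2%}")            # 94.24%
print(f"{ratio_deviation:.2%}")  # 24.24%
print(f"{word_overshoot:.1%}")   # 35.4%
```

As a usage example, `dialogue_ratio('Molly grinned. "We go in as magicians," she said. "Trust me."')` counts 7 dialogue words out of 11. The thresholds that turn these deviations into a 0.00% score are not shown on the page, so the sketch stops at the deviations.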