NC Bench
Run: 2026-03-12-01KKHYZSSG0T6A9WRN7TPRECFX
Run Details
Model: Grok 4.20 (Beta, Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 1m 33s
Cost: $0.128946
Provider: xAI
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 3.06%
Overall: 1.5279%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill were in the van going over the plan for the umpteenth time as they sipped coffee to calm their nerves. The children's party was in full swing across the street with screams of delight and party games underway. Balloons floated in the air and the cake was probably already being served. "Okay Bill, we are going in as the entertainment. You keep the kids and parents busy with your hilarious clown act and I will swipe all the presents from the table." "My clown act? I thought I was the lookout," Bill responded, his voice filled with surprise and a bit of fear as he looked out the window at the happy scene. "Plans change. This is better. The presents are the target and this way we get them all without anyone being the wiser." "But what if a kid wants to play with me the whole time? I can't steal and entertain at the same time." "You can and you will. It's only for fifteen minutes. Then we are out of there with the bags full of toys and games." "Fifteen minutes is a long time to keep a bunch of kids entertained. What if I run out of ideas?" "You won't. Just do the classic routines. Make balloon animals and tell bad jokes. The laughter will cover the noise of me taking the presents." "Bad jokes are my specialty. Okay, I am convinced. But our getaway needs to be quick. The van better be ready." "It is ready with the engine on and the doors unlocked. We throw the bags in and drive away calmly like nothing happened." "Calmly driving away with a van full of stolen kids presents. This is the best and worst idea you have ever had Molly." "It is brilliant and you know it. Now let's get those red noses on and become the clowns that stole Christmas or in this case the birthday." "Christmas in July. I like it. Let's go." They stepped out of the van and walked toward the party, their hearts pounding but their faces showing excitement. The plan was simple yet filled with potential for disaster but that was what made it so funny to them. 
As authors of their own comedic heist story they laughed at the absurdity. The presents were as good as theirs. The sun was beginning to set, casting a golden glow over the neighborhood. They felt a mixture of guilt and thrill as they approached the gate, knowing that their actions would create a story that would be told for years to come in their circle of friends.
created_at: "2026-03-12T21:24:36.277Z" (3/12/2026, 9:24:36 PM)
model: "x-ai/grok-4.20-beta-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 90923
moderation_latency: (null)
generation_time: 92456
tokens_prompt: 77
tokens_completion: 605
native_tokens_prompt: 184
native_tokens_completion: 21468
native_tokens_completion_images: 0
native_tokens_reasoning: 20972
native_tokens_cached: 128
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.1289456
router: (null)
provider_responses:
  0:
    endpoint_id: "33f49f24-a0de-4dcf-872d-ca514e83fbae"
    id: "378f61e5-71f6-45ba-d3c8-0469c943bc34"
    is_byok: false
    latency: 271
    model_permaslug: "x-ai/grok-4.20-beta-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773350676-d6aK2Tx9z9Bqw987cdry"
upstream_id: "378f61e5-71f6-45ba-d3c8-0469c943bc34"
total_cost: 0.1289456
cache_discount: 0.0002304
upstream_inference_cost: 0
provider_name: "xAI"
Evaluation details
Matches word count
  Result: 0.00%
  Details: n/a
  Meta Data: neededClean = false, words = 437
Dialogue to Total Word Ratio
  Result: 3.06%
  Details: Ratio: 58.35%, Deviation: 8.35%
  Meta Data: neededClean = false, wordsTotal = 437, wordsDialogue = 255
Overall
  Result: 1.5279%
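The reported ratio and deviation can be reproduced from the word counts in the evaluation metadata (255 dialogue words out of 437 total, against the 50% target in the scenario). The sketch below is an illustration only, not the benchmark's actual evaluator: the function names are hypothetical, the word-splitting and quote-matching rules are assumptions, and it does not attempt to reproduce how the 3.06% score is derived from the deviation, since the page does not state that formula.

```python
import re


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


def dialogue_stats(text: str, target: float = 0.5) -> dict:
    """Count whitespace-separated words inside straight double quotes.

    Assumption: dialogue is whatever sits between pairs of straight double
    quotes, and a "word" is any whitespace-separated token. The real
    evaluator may tokenize differently.
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    if words_total == 0:
        return {"words_total": 0, "words_dialogue": 0, "ratio": 0.0, "deviation": target}
    ratio, deviation = ratio_and_deviation(words_dialogue, words_total, target)
    return {
        "words_total": words_total,
        "words_dialogue": words_dialogue,
        "ratio": ratio,
        "deviation": deviation,
    }


if __name__ == "__main__":
    # Counts taken from the evaluation metadata on this page.
    ratio, deviation = ratio_and_deviation(255, 437)
    print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
    # prints "Ratio: 58.35%, Deviation: 8.35%"
```

Calling `dialogue_stats` on the model response text would give a comparable count only if the evaluator uses the same splitting rules, so small differences from 437 and 255 would not be surprising.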