NC Bench
Run: 2026-05-01-01KQHSV7F4JV1WVXKVG0DQV8KR
Run Details
Model: Grok 4.20 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 2m 9s
Cost: $0.030113
Provider: xAI
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 95.21%
Overall: 47.6041%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Deep within an abandoned factory that served as their secret lair, Molly and Bill were surrounded by the tools of their trade. Lockpicks, ropes, and various disguises lay scattered across a large wooden table. The air was thick with the scent of oil and dust, but the two thieves paid little attention to their dingy surroundings. Instead, they focused intently on the task at hand, which involved a radical departure from their typical targets of banks and museums. This time, the target was a joyous children's birthday party taking place in a local hall, complete with balloons, streamers, and a massive pile of wrapped presents. Molly had been observing the family for weeks, noting their routines and the extravagance of their celebrations. Bill had been skeptical from the start but was slowly warming to the idea as the potential profits became clear. The comedic nature of the crime appealed to their sense of humor, even if it stretched the boundaries of their moral code as career criminals. "Bill, I've got the perfect plan for our next heist. We're going to break into that big children's party down the street and steal all the presents. It's brilliant because no one will expect it," Molly said. Bill rubbed his chin thoughtfully before answering. "Are you serious Molly? That's a new low. Stealing toys from kids? What kind of thieves are we?" Molly waved her hands excitedly as she continued to explain. "The kind that need cash. Plus it's comedic gold. We'll go as magicians and clowns. I'll distract the children with card tricks while you bag the gifts from the table." Bill paced back and forth across the concrete floor, his footsteps echoing in the large space. He stopped occasionally to look at the map Molly had drawn. "What about the adults? They will notice if the gift table is suddenly empty." Not to be deterred, Molly pulled out a small device from her pocket that would create the necessary distraction. 
"We'll time it during the cake cutting when all eyes are on the birthday boy. A little smoke from our fake candles and we are in the clear. Then it's straight to the fence with bags full of new toys." A smile slowly spread across Bill's face as he began to see the humor in the situation. He sat down again and leaned in closer to his partner in crime. "I guess if you think it will work I'm with you. But I get first pick of any video games we find in there." With the main points of the plan agreed upon, the duo spent the remaining hours of the evening going over every possible contingency. They practiced their clown walks and magic gestures until they were perfect. The warehouse rang with their shared laughter as they pictured the pandemonium that would ensue when the theft was discovered. Although it was a ridiculous scheme, it had all the hallmarks of their most successful past operations careful planning, unexpected disguises, and a quick escape. As the moon rose high in the sky outside, they felt confident that this heist would not only fill their pockets but also provide stories to tell for years to come. The children's party would be missing its presents, but Molly and Bill would be celebrating their own twisted version of success.
created_at: "2026-05-01T12:57:30.86Z" (5/1/2026, 12:57:30 PM)
model: "x-ai/grok-4.20-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 123965
moderation_latency: (null)
generation_time: 128453
tokens_prompt: 77
tokens_completion: 4231
native_tokens_prompt: 184
native_tokens_completion: 11980
native_tokens_completion_images: (null)
native_tokens_reasoning: 11343
native_tokens_cached: 64
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0301128
router: (null)
provider_responses:
  0:
    endpoint_id: "3ecee37f-b217-4093-87fb-aaf0afe307af"
    id: "de61218e-d7b1-9ba7-9bd1-93afd18a9a0c"
    is_byok: false
    latency: 93
    model_permaslug: "x-ai/grok-4.20-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777640250-0s4fwpdXqZUpzJ2WUxfj"
session_id: (null)
api_type: "completions"
id: "gen-1777640250-a27P0JZLXwhDT6UXMra5"
upstream_id: "de61218e-d7b1-9ba7-9bd1-93afd18a9a0c"
total_cost: 0.0301128
cache_discount: 0.0000672
upstream_inference_cost: 0
provider_name: "xAI"
response_cache_source_id: (null)
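The token and cost fields above can be cross-checked with a little arithmetic. The sketch below is illustrative only. It assumes that native_tokens_completion includes the reasoning tokens and that the Cost shown under Run Details is total_cost rounded to six decimal places; neither assumption is stated on this page.

```python
# Values copied from the metadata of this run.
native_tokens_completion = 11980
native_tokens_reasoning = 11343
total_cost = 0.0301128

# Assumption: completion tokens include reasoning tokens, so the
# difference approximates the tokens spent on the visible response.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Share of the completion budget spent on reasoning.
reasoning_share = native_tokens_reasoning / native_tokens_completion

print(f"visible output tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.2%}")
# Assumption: the displayed Cost is total_cost rounded to 6 decimals.
print(f"cost: ${total_cost:.6f}")
```

Under these assumptions, most of the completion tokens went to reasoning rather than to the scene itself.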
Evaluation details
Matches word count: 0.00%
  Details: n/a
  Meta Data: neededClean = false, words = 555
Dialogue to Total Word Ratio: 95.21%
  Details: Ratio: 29.01%, Deviation: 0.99%
  Meta Data: neededClean = false, wordsTotal = 555, wordsDialogue = 161
Overall: 47.6041%
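The reported ratio and deviation follow from the word counts above, and the overall figure appears to be the plain average of the two evaluator scores. The sketch below reproduces that arithmetic. The helper names are hypothetical, and the quote-based counting rule (words inside straight double quotes count as dialogue) is an assumption, not the benchmark's documented method. How the 0.99% deviation maps to the 95.21% score is not shown on this page, so it is not modeled here.

```python
import re

def dialogue_word_counts(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words).

    Assumption: dialogue is whatever sits between straight double quotes.
    """
    total = len(text.split())
    spoken = sum(len(span.split()) for span in re.findall(r'"([^"]*)"', text))
    return spoken, total

def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_pct: float = 30.0) -> tuple[float, float]:
    """Return (dialogue ratio in percent, absolute deviation from target)."""
    ratio_pct = 100.0 * words_dialogue / words_total
    return ratio_pct, abs(target_pct - ratio_pct)

# Counts reported for this run: wordsDialogue = 161, wordsTotal = 555.
ratio_pct, deviation = ratio_and_deviation(161, 555)
print(f"Ratio: {ratio_pct:.2f}%, Deviation: {deviation:.2f}%")

# Assumption: Overall is the unweighted mean of the evaluator scores.
# Using the rounded scores shown on the page gives a value within
# rounding error of the reported 47.6041%.
overall = (0.00 + 95.21) / 2
print(f"Overall (from rounded scores): {overall:.4f}%")
```

The word count evaluator scored 0.00% because the response ran to 555 words against a 500-word request, which pulls the average down even though the dialogue ratio was close to the 30% target.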