NC Bench
Run: 2026-05-01-01KQHR96XMQZKCRF5W7PJK86XT
Run Details
Model: Grok 4.3 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 4m 18s
Cost: $0.054203
Provider: xAI
Evaluation summary
Matches word count: 100.00%
Dialogue to Total Word Ratio: 98.86%
Overall: 99.4318%
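The overall figure is consistent with a plain average of the two evaluator scores. The sketch below checks that arithmetic; the averaging rule and the function name `overall_score` are assumptions made for illustration, not something the page states.

```python
from statistics import mean


def overall_score(evaluator_scores: dict[str, float]) -> float:
    """Average the per-evaluator scores (assumed aggregation rule)."""
    return mean(evaluator_scores.values())


# Scores as displayed in the summary (rounded to two decimals on the page).
scores = {
    "Matches word count": 100.00,
    "Dialogue to Total Word Ratio": 98.86,
}

print(f"{overall_score(scores):.2f}%")
```

With the displayed values this gives 99.43%, which agrees with the reported 99.4318% to two decimals. The extra digits in the reported figure suggest the ratio score is held at higher precision than the 98.86% shown.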
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The evening air was filled with the scent of cotton candy and popcorn as Molly and Bill surveyed the extravagant children's birthday party from their hidden vantage point in the shadowy alleyway beside the venue. The backyard had been converted into a miniature amusement park complete with inflatable slides, a clown-free zone ironically avoided by their disguises, and a towering gift table laden with colorful boxes that glittered under the fairy lights. Bill's plan was bold yet ridiculous, fitting for their line of work in comedic heists. They had cased the party for days, noting the loose security from distracted parents and the opportunity during the height of the revelry. Molly clutched her fake birthday card as a prop, ready to act the part of an aunt if spotted. The heist required precision timing to avoid the pitfalls of screaming children, territorial moms, and the ever-present threat of a piñata stick swinging wildly. Bill wiped sweat from his brow, his nerves showing despite the thrill. Everything hinged on their quick entry and exit, turning the festive chaos into their advantage. They dressed in casual attire that could pass for party guests, complete with name tags they had forged from the invitation list stolen earlier. Molly reviewed the blueprint she had sketched on a napkin, marking the paths through the food tables that could provide cover if things went south. Parents were too busy taking photos and managing tantrums to notice two outsiders in the mix. The music blared loudly enough to cover any noises they might make while moving the presents to the van. Bill made sure their tools were hidden under their jackets including the special sacks for quick loading of the stolen goods. This heist will be one for the books if they pulled it off without a hitch or an ice cream spill ruining their outfits. The party featured a magician and clowns besides the gifts making it a prime target for the pair. They were now ready for anything very much. 
Bill turned and said, "This gig is perfect, Molly. We dress as parents blend in then forget our gift and grab the whole lot while they are distracted by the magician and his amazing tricks." Molly replied, "Not that simple. The magician is a retired cop. One wrong move and he will handcuff us before the cake. Sneaking with truckload of gifts is not stealthy with toddlers grabbing at shiny things." Bill said, "We need a diversion like releasing the balloons or a food fight with desserts." Molly said, "That will bring every parent running and blow our cover completely. Our best shot is during the piñata bashing when the blindfolded kids swing around wildly. Chaos city no one notices us slipping in and out with the loot in pillowcases we brought." Bill said, "Pillowcases it is. But what if a kid spots us stealing Santa's stash and we end up stuck in the timeout corner with angry parents calling the cops on us all?"
created_at: "2026-05-01T12:30:11.897Z" (5/1/2026, 12:30:11 PM)
model: "x-ai/grok-4.3-20260430"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 251360
moderation_latency: (null)
generation_time: 256605
tokens_prompt: 77
tokens_completion: 1052940
native_tokens_prompt: 184
native_tokens_completion: 21643
native_tokens_completion_images: (null)
native_tokens_reasoning: 21064
native_tokens_cached: 128
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0542031
router: (null)
provider_responses:
    0:
        endpoint_id: "45623cb8-18e5-4b32-8fc4-85d05439982d"
        id: "f0b5257a-6cdc-9ae7-8b5d-42b2d01232d8"
        is_byok: false
        latency: 157
        model_permaslug: "x-ai/grok-4.3-20260430"
        provider_name: "xAI"
        status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777638611-J1uxHaFZ27zmBW3OJKN9"
session_id: (null)
api_type: "completions"
id: "gen-1777638611-X3gT1SCOWLkAJCwx4qCe"
upstream_id: "f0b5257a-6cdc-9ae7-8b5d-42b2d01232d8"
total_cost: 0.0542031
cache_discount: 0.0001344
upstream_inference_cost: 0
provider_name: "xAI"
response_cache_source_id: (null)
Evaluation details
Matches word count: 100.00%
    Details: n/a
    neededClean: false
    words: 500
Dialogue to Total Word Ratio: 98.86%
    Details: Ratio: 30.48%, Deviation: 0.48%
    neededClean: false
    wordsTotal: 502
    wordsDialogue: 153
Overall: 99.4318%
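The reported ratio and deviation follow directly from the two word counts: 153 dialogue words out of 502 total, measured against the 30% target in the scenario. The sketch below reproduces that arithmetic and shows one possible way to count dialogue words. The function `dialogue_ratio`, the whitespace tokenization, and the straight-double-quote rule are all assumptions for illustration; the benchmark's actual counting method is not described on this page, and neither is the formula that turns the deviation into the 98.86% score.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (total_words, dialogue_words, ratio) for a prose passage."""
    # Assumption: a word is any whitespace-separated token.
    total_words = len(text.split())
    # Assumption: dialogue is the text between straight double quotes.
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return total_words, dialogue_words, ratio


# Check the reported figures against the counts listed in the evaluation details.
TARGET_RATIO = 0.30  # "30% dialogue" from the scenario
reported_ratio = 153 / 502  # wordsDialogue / wordsTotal
deviation = abs(reported_ratio - TARGET_RATIO)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 30.48%, Deviation: 0.48%"

# Small usage example on a hypothetical sentence.
total, spoken, ratio = dialogue_ratio('Bill said, "Run now." Molly nodded.')
print(total, spoken, f"{ratio:.2%}")
```

Different tokenization choices (for example, how hyphenated words or curly quotes are handled) would shift the counts slightly, which may explain why one evaluator reports 500 words and the other 502 for the same response.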