NC Bench
Run: 2026-05-10-01KR91ACPB2KD3A75VSKSQZZ7P
Run Details
Model: Grok 4.3
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 18.2s
Cost: $0.001796
Provider: xAI
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly shoved her way into the empty storage room behind the balloon vendor’s table, kicking the door shut with the heel of her boot. Bill was already crouched by the service ladder, running a finger over a scratched map he’d drawn on the back of a pizza box. “We hit the cake table at exactly four-fifteen,” she whispered. “Every kid will be too busy licking frosting to notice two extra clowns slipping past with empty gift bags.” Bill tilted the pizza box so the overhead fluorescents hit the diagram. “Clowns are a fifty-fifty risk. Half these parents already hate clowns. One scream and we’re ankle-deep in screaming rugrats.” He tapped an X near the gift table. “Back door behind the bouncy castle—more direct, less birthday hats on our skulls.” Molly yanked a black-and-white striped shirt from her duffel. “The shirts scream circus, not creepy. Besides, I like the honk-honk.” She slipped the shirt over her head, then adjusted the red foam nose in her fist. “You worry too much about noise. We’re not stealing Weasley’s wand set; we’re taking three dozen gift bags stuffed with cheap plastic dinosaurs. Parents will thank us by Tuesday when the kids forget whose present was supposed to go to whom.” A string of party poppers popped somewhere on the other side of the door and the music hiccupped. Bill flinched but kept studying the map. “Timer says four-ten for us to roll up the gifts and bolt. If the magician starts his disappearing act early, our window shrinks like cheap balloons.” He looked up, eyes narrowed. “You sure about the giant bow on the bags? Makes them look like actual presents, not loot.” Molly grinned, snapping the bow around the bag’s neck. “That’s the charm, Bill. Walk out like we’re delivering more gifts instead of nabbing them. One confused grandma asks where we’re taking the loot, we smile, mutter something about gift exchange, and keep walking. 
Worst case, we’re faster than any parent chasing us while holding a plate of half-eaten cake.” Bill shook his head, folding the pizza box flat. “You keep banking on adult confusion. One dad with a phone camera and we’re trending as ‘Party Crashers of the Year.’” He stuffed a coil of rope into his cargo pocket. “I say we take the utility stairs instead. Two flights straight to the loading dock. Less foot traffic than racing past the magician’s table.” Molly glanced at the glowing digital clock on her wrist. “Four-oh-five. We’ve got ten minutes to look festive and ten more to make it downstairs. You in or you backing out over a clown phobia?” She offered him the second red nose, the foam slightly squished from travel. Bill accepted the nose but flipped it between his fingers like he was debating one last protest. “Clown shoes squeak. One wrong squeak and we broadcast the whole escape.” He pocketed it anyway, then shouldered his bag. “Fine. Utility route, coordinated honking only if absolutely necessary.” Molly cracked the door an inch. Laughter and the smell of frosting drifted in like war paint. She looked back with a quick smirk. “Then let’s go steal a birthday or three. Fifteen minutes, no survivors—except the presents.”
created_at: "2026-05-10T13:30:39.698Z" (5/10/2026, 1:30:39 PM)
model: "x-ai/grok-4.3-20260430"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 462
moderation_latency: (null)
generation_time: 18098
tokens_prompt: 77
tokens_completion: 833
native_tokens_prompt: 182
native_tokens_completion: 681
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 128
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
service_tier: "default"
usage: 0.0017956
router: (null)
provider_responses:
  0:
    endpoint_id: "45623cb8-18e5-4b32-8fc4-85d05439982d"
    id: "f1f23956-65da-9649-b294-08a8c8c2c6fc"
    is_byok: false
    latency: 98
    model_permaslug: "x-ai/grok-4.3-20260430"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778419839-29Gct2q0b2AgFhGEyIly"
session_id: (null)
api_type: "completions"
id: "gen-1778419839-GKnjebmnYCWITQyIAKlP"
upstream_id: "f1f23956-65da-9649-b294-08a8c8c2c6fc"
total_cost: 0.0017956
cache_discount: 0.0001344
upstream_inference_cost: 0
provider_name: "xAI"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 536
0.00% | Dialogue to Total Word Ratio | Ratio: 58.27%, Deviation: 28.27% | neededClean: false, wordsTotal: 544, wordsDialogue: 317
0.0000% | Overall
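The ratio and deviation reported above can be reproduced from the wordsDialogue and wordsTotal counts. The sketch below is only an illustration, not the benchmark's actual evaluator. It assumes that dialogue is whatever sits between curly double quotes, that words are whitespace-separated tokens, and that the deviation is the absolute gap in percentage points from the 30% target named in the scenario. The function names `dialogue_ratio` and `deviation_points` are hypothetical.

```python
import re


def dialogue_ratio(text, open_q="\u201c", close_q="\u201d"):
    """Return (dialogue_words, total_words, ratio) for a passage.

    Assumption: dialogue is the text between curly double quotes and
    words are whitespace-separated tokens. The real evaluator may
    tokenize differently.
    """
    total = len(text.split())
    pattern = re.escape(open_q) + r"(.*?)" + re.escape(close_q)
    # DOTALL lets a quoted span continue across a line break.
    spans = re.findall(pattern, text, flags=re.DOTALL)
    dialogue = sum(len(span.split()) for span in spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation_points(ratio, target=0.30):
    """Absolute gap from the target ratio, in percentage points (assumed definition)."""
    return abs(ratio - target) * 100


# Reproduce the reported figures from the counts in the Meta Data column.
reported_ratio = 317 / 544
print(f"Ratio: {reported_ratio * 100:.2f}%")                     # Ratio: 58.27%
print(f"Deviation: {deviation_points(reported_ratio):.2f}%")     # Deviation: 28.27%
```

Under these assumptions, 317 dialogue words out of 544 is close to double the requested share. How the evaluator turns that deviation into the 0.00% score is not shown on this page, so the sketch stops at the deviation.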