NC Bench
Run: 2026-05-10-01KR91NV173M385KM70TPBPX5D
Run Details
Model: Grok 4.3
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 7.2s
Cost: $0.001608
Provider: xAI
Evaluation summary
| Score | Evaluator | Details Link |
| --- | --- | --- |
| 0.00% | Matches word count | (details) |
| 0.00% | Dialogue to Total Word Ratio | (details) |
| 0.0000% | Overall | |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her ski mask in the rearview mirror of the getaway van. “This is either the dumbest idea we’ve ever had or the easiest score of our lives.” Bill snorted, checking the lock-picking kit for the dozenth time. “A children’s party, Molly? We’re stealing presents from kids who think balloons are the height of luxury. Feels less Ocean’s Eleven and more ‘Ocean’s… middle school.’” “Hey, those parents drop serious cash on gift cards. We’ll clear out faster than the clown can twist a balloon animal. You got the rope?” “Got it. And the Santa suit. In case we need a quick disguise or, you know, to traumatize some toddlers permanently.” Molly rolled her eyes. “We’re not traumatizing anyone. We’re liberating gifts. Think of it as last-minute re-gifting. The kids will forget by next week and the parents will assume Great-Aunt Brenda pocketed everything. Everyone wins except Brenda.” Bill gestured to the map taped to the dashboard. “Front door’s wide open during cake. We slip in, fill the bags while the magician’s faking mind-reading tricks, and vanish before the magician realizes mind-reading doesn’t cover adult criminals. How’s the timing?” “Piñata hits at two-fifteen. That’s our window. Sugar-high chaos, everyone blindfolded and swinging sticks. We’ll be loading the van before anyone notices the loot’s gone.” “And if a parent spots us?” “Simple. Smile, offer them a juice box, act like volunteer elves. No one questions elves at a kids’ party. They’re too busy chasing sugar addicts wearing juice-box capes.” Bill chuckled, tapping the schematic of the living room. “I still can’t believe the birthday kid invited thirty screaming five-year-olds. His parents must hate themselves. Makes our job easier though. More witnesses, less attention. Everyone’s watching who’s about to puke, not the masked adults raiding the gift table.” Molly nodded, pulling on gloves. “Exactly. We smile, nod, and take everything with batteries included. 
Leave the cheap wrapping paper; too much bulk, not enough resale value. Focus on tablets, gaming consoles, and whatever’s still in the original box with a receipt taped to it.” Bill smiled grimly. “Receipts. I like a woman who thinks ahead. You sure about the escape route? Back alley past the bounce house?” “Guaranteed. Clowns and parents never check there. Too busy arguing over who owns the stray rubber chicken that just flew into the neighbors’ yard.” They sat in silence for a moment, listening to distant party noise already spilling from the house. Molly cracked her knuckles and whispered, “Ready when you are, Santa.” Bill adjusted the fake beard under his chin. “Ho, ho, here we come, kids. Merry felony to all, and to all a good heist.”
created_at
"2026-05-10T13:36:54.831Z"
5/10/2026, 1:36:54 PM
model
"x-ai/grok-4.3-20260430"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
429
moderation_latency
(null)
generation_time
6483
tokens_prompt
77
tokens_completion
747
native_tokens_prompt
182
native_tokens_completion
606
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
128
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
service_tier
"default"
usage
0.0016081
router
(null)
provider_responses
0
endpoint_id
"45623cb8-18e5-4b32-8fc4-85d05439982d"
id
"b10533bc-65a2-9e16-91d1-d9aa0cb6af99"
is_byok
false
latency
95
model_permaslug
"x-ai/grok-4.3-20260430"
provider_name
"xAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778420214-XXTO2pb6yHJTkzrrvvWN"
session_id
(null)
api_type
"completions"
id
"gen-1778420214-gN9ZNA7h1yloAAvXgEIv"
upstream_id
"b10533bc-65a2-9e16-91d1-d9aa0cb6af99"
total_cost
0.0016081
cache_discount
0.0001344
upstream_inference_cost
0
provider_name
"xAI"
response_cache_source_id
(null)
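The timing and token fields above can be combined into a rough throughput figure. The sketch below assumes that `latency` and `generation_time` are reported in milliseconds and that `native_tokens_completion` is the relevant token count; the page states neither, and the helper names are hypothetical.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Completion tokens emitted per second of generation time.

    Assumes generation_time_ms is in milliseconds (not confirmed by the page).
    """
    return completion_tokens / (generation_time_ms / 1000.0)


def elapsed_seconds(latency_ms: float, generation_time_ms: float) -> float:
    """Latency plus generation time, converted to seconds (same unit assumption)."""
    return (latency_ms + generation_time_ms) / 1000.0


# Values copied from the metadata above.
throughput = tokens_per_second(606, 6483)
elapsed = elapsed_seconds(429, 6483)
print(f"{throughput:.1f} tokens/s over {elapsed:.3f} s")
```

Under these assumptions the sum of latency and generation time comes out slightly below the 7.2s duration shown in the run details, so the reported duration presumably includes overhead not captured in these two fields.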
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 0.00% | Matches word count | n/a | neededClean: false; words: 440 |
| 0.00% | Dialogue to Total Word Ratio | Ratio: 81.11%, Deviation: 31.11% | neededClean: false; wordsTotal: 450; wordsDialogue: 365 |
| 0.0000% | | | |
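The ratio and deviation reported for the dialogue evaluator follow from the two word counts in its metadata: 365 dialogue words out of 450 total is 81.11%, which is 31.11 percentage points from the 50% target. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words, by treating text inside curly quotes as dialogue. The function names and the quote-matching rule are assumptions for illustration, not the benchmark's actual implementation. The two evaluators also report different totals (440 versus 450 words), so any simple tokenizer is unlikely to match both.

```python
import re

# Assumption: dialogue is whatever sits between curly double quotes.
QUOTED = re.compile(r"“([^”]*)”")


def count_words(text: str) -> int:
    """Whitespace-delimited word count (a simplification)."""
    return len(text.split())


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_ratio: float = 0.5) -> tuple[float, float]:
    """Return (ratio %, absolute deviation from target in percentage points)."""
    ratio = words_dialogue / words_total
    deviation = abs(ratio - target_ratio)
    return round(ratio * 100, 2), round(deviation * 100, 2)


def dialogue_stats(text: str, target_ratio: float = 0.5) -> tuple[float, float]:
    """Count quoted words in text and compare the share to the target."""
    total = count_words(text)
    dialogue = sum(count_words(span) for span in QUOTED.findall(text))
    return ratio_and_deviation(dialogue, total, target_ratio)


# Word counts copied from the evaluator metadata above.
print(ratio_and_deviation(365, 450))  # (81.11, 31.11)
```

How the evaluator maps a 31.11-point deviation to a 0.00% score is not shown on this page, so the sketch stops at the deviation.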