NC Bench
Run: 2026-05-10-01KR916KWSFTZESPNNPTTVQ2RR
Run Details
Model: Grok 4.3
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 13.0s
Cost: $0.001560
Provider: xAI
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill crouched behind the rhododendron bushes in Mrs. Pritchard’s back garden, the sound of squealing five-year-olds drifting through the open patio doors. Paper lanterns bobbed above a castle-shaped bouncy castle, and the table was piled with brightly-wrapped boxes that winked under the fairy lights. Molly adjusted her black ski mask and whispered, “I still think this is mental.” “Which part exactly?” Bill replied, tugging at the too-small clown costume he’d liberated from a charity shop earlier that afternoon. “Stealing from six-year-olds on a Saturday afternoon. I can practically feel karma lining up behind us with handcuffs.” Bill shrugged and peered around the bush. “Karma doesn’t bring snack platters or pass-the-parcel prizes. That table’s worth at least two months’ rent if we fence the good ones quick.” “Exactly. Good ones. Half of them are just boxes with socks inside. We’ll be knee-deep in glitter and disappointment.” “So we skip the socks,” Bill said, as though this solved the moral dilemma. “We go for anything that rattles or has a decent bow count. Easy in, easy out. I’ve timed the music game. Every time the grown-ups belt out ‘Happy Birthday,’ everyone’s distracted. That’s our window.” Molly rolled her eyes. “And what if some little terror recognises you from last week’s balloon delivery? You promised you’d retire after the bouncy-house gig.” Bill grinned, teeth white against the painted red smile. “I promised I’d retire once we could afford real retirement. This score might do it, if you stop worrying about the ethics of toy theft.” From the house came the first notes of the birthday song. Molly exhaled sharply and muttered, “Fine. Go for the big red dinosaur first; it looks heavy and expensive. I’ll sweep the Lego and the battery toys while you cause a diversion with the leftover cake.” Bill’s grin widened. “Distraction by dessert. Classic.” Molly pulled her mask lower and checked the stopwatch on her phone. 
“Thirty seconds until they finish singing. After that the kids stampede back to the bouncy castle. We move then, grab what we can carry, and leg it through the side gate before anyone notices the pile has shrunk.” “Or before Mrs. Pritchard realises she hired a pair of forty-year-old clowns who can’t even juggle,” Bill added cheerfully. Molly shot him a look that could have ignited the paper hats. “Let’s just get this done without traumatising anyone under four foot.” “Deal,” he said, and together they rose from the bushes as the final off-key note of “you” drifted across the lawn.
created_at: "2026-05-10T13:28:36.002Z"
model: "x-ai/grok-4.3-20260430"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 562
moderation_latency: (null)
generation_time: 12642
tokens_prompt: 77
tokens_completion: 705
native_tokens_prompt: 182
native_tokens_completion: 560
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 64
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
service_tier: "default"
usage: 0.0015603
router: (null)
provider_responses:
  0:
    endpoint_id: "45623cb8-18e5-4b32-8fc4-85d05439982d"
    id: "a771c076-f87e-9aff-87a7-084948649880"
    is_byok: false
    latency: 131
    model_permaslug: "x-ai/grok-4.3-20260430"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778419716-F11CRtVteXZ8RJgGWa1J"
session_id: (null)
api_type: "completions"
id: "gen-1778419716-bJ9F9OfeVAi85JhiFSqa"
upstream_id: "a771c076-f87e-9aff-87a7-084948649880"
total_cost: 0.0015603
cache_discount: 0.0000672
upstream_inference_cost: 0
provider_name: "xAI"
response_cache_source_id: (null)
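The metadata above is enough for a rough throughput estimate. The sketch below assumes that generation_time is reported in milliseconds (the page does not state units) and uses the native_tokens_completion value; the function name tokens_per_second is hypothetical and not part of NC Bench or the provider API.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Estimate generation throughput from a token count and a duration.

    Assumption: the duration is in milliseconds, as the magnitude of the
    reported generation_time (12642) suggests for a roughly 13 s run.
    """
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000)


# Values copied from the run metadata: native_tokens_completion, generation_time.
rate = tokens_per_second(560, 12642)
print(f"{rate:.1f} tokens/s")
```

Under that unit assumption, this run works out to roughly 44 native completion tokens per second.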
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 420
0.00% | Dialogue to Total Word Ratio | Ratio: 60.69%, Deviation: 30.69% | neededClean: false, wordsTotal: 435, wordsDialogue: 264
0.0000%
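The reported ratio and deviation can be reproduced from the counts in the details above. The sketch below is an illustration only: the evaluator's actual tokenisation and scoring rules are not shown on this page, so dialogue_ratio (a hypothetical helper) simply assumes that whitespace-separated tokens are words and that anything inside curly double quotes is dialogue. Note that the two evaluators report different totals (420 and 435), so they evidently count words differently.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (total_words, dialogue_words, ratio) for a passage.

    Assumptions: words are whitespace-separated tokens, and dialogue is
    any text between a curly opening quote and a curly closing quote.
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r"“([^”]*)”", text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = words_dialogue / words_total if words_total else 0.0
    return words_total, words_dialogue, ratio


# Counts copied from the evaluation details; target taken from the scenario.
REPORTED_TOTAL = 435
REPORTED_DIALOGUE = 264
TARGET_RATIO = 0.30

reported_ratio = REPORTED_DIALOGUE / REPORTED_TOTAL
deviation = abs(reported_ratio - TARGET_RATIO)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {deviation:.2%}")

# Word-count evaluator: 420 words reported against a 500-word request.
WORD_TARGET = 500
REPORTED_WORDS = 420
shortfall = (WORD_TARGET - REPORTED_WORDS) / WORD_TARGET
print(f"Word-count shortfall: {shortfall:.0%}")
```

A quoted non-dialogue word, such as the final “you” in the response, would be counted as dialogue by this simple rule, which is one reason a real evaluator may differ.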