NC Bench
Run: 2026-05-01-01KQHSKN3D0ZHDH9NATQBNNF1X
Run Details
Model: Grok 4.3 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 9m 35s
Cost: $0.056320
Provider: xAI
Evaluation summary
Score      Evaluator
99.74%     Matches word count
99.34%     Dialogue to Total Word Ratio
99.5440%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill hunkered down in their hideout, the plans for the party heist spread before them. The room was filled with maps and sketches of the birthday venue, complete with guest lists and gift estimates. They needed to be careful because one wrong move could lead to disaster involving angry parents and crying children. "Bill, this is going to be tricky but hilarious," Molly said. "We sneak in as special guests. You handle the entertainment while I bag the gifts during the chaos. The presents are the prize, all wrapped up and ready for us to claim." Bill replied, "Entertainment? Like juggling or what? I could do magic but my tricks always fail. How about a puppet show to keep the kids busy?" "That's lame," Molly laughed. "Better: we start a food fight with the cake. Everyone's distracted, we load the presents into garbage bags labeled 'donations'." Bill snickered. "Donations? The parents might buy it. But what about cameras? Some parties have those." "No cameras," Molly assured. "It's old school. No phones allowed, supposedly. We time it for when the clown arrives. We replace him." Bill objected, "But the kids expect the clown. If I dress up, I'll have to make balloon animals fast." "You'll manage," Molly encouraged. "We bring a big sack. While Bill the clown distracts, Molly the helper collects the 'unwanted' toys." Bill grinned. "Unwanted toys? Good line. Then we escape through the kitchen to the van. I've got the engine running low." "Perfect," Molly agreed. "If a parent asks, say we're the gift wrappers cleaning up. Practice that." Bill said, "'We're the gift wrappers cleaning up, ma'am. Taking these to the car for safety.' Sounds good? Maybe add a bow or two to make it believable." "Yes," Molly nodded. "Add 'the kids are having so much fun we hid some extra for later.' It buys time." Bill continued, "And if they follow, we hit the emergency: pull the fire alarm disguised as a party popper. The noise will send everyone running and we blend into the panic." 
Molly added, "Great idea. No time for that though. The whole operation in five minutes flat. Ready to go? We don't want to linger too long and risk getting caught." After they finished practicing their alibis, Molly checked the clock and confirmed the timing for the party which was scheduled to start soon. They gathered their supplies including glue for disguises and bags for the haul, making sure everything was in place for a smooth operation without any hiccups from the neighborhood watch or unexpected guests arriving early. Bill said, "Born ready. Let's steal those presents and make this the funniest heist yet. I can already see the look on those kids' faces when they find out later. One more thing, what if the kids recognize us from the neighborhood? We might need fake names like Mr. Clownster and Ms. Toytaker to throw them off the scent completely." They shook hands firmly and smiled.
created_at
"2026-05-01T12:53:22.674Z"
5/1/2026, 12:53:22 PM
model
"x-ai/grok-4.3-20260430"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
559644
moderation_latency
(null)
generation_time
574614
tokens_prompt
77
tokens_completion
5699
native_tokens_prompt
184
native_tokens_completion
22463
native_tokens_completion_images
(null)
native_tokens_reasoning
21830
native_tokens_cached
64
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0563203
router
(null)
provider_responses
0
endpoint_id
"45623cb8-18e5-4b32-8fc4-85d05439982d"
id
"223382ad-dd9e-0697-45f9-13b376a99c28"
is_byok
false
latency
64
model_permaslug
"x-ai/grok-4.3-20260430"
provider_name
"xAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777640002-JbI4kWBoygA6hjRIfGVj"
session_id
(null)
api_type
"completions"
id
"gen-1777640002-7bG9YWc20zMyVdJkY2FB"
upstream_id
"223382ad-dd9e-0697-45f9-13b376a99c28"
total_cost
0.0563203
cache_discount
0.0000672
upstream_inference_cost
0
provider_name
"xAI"
response_cache_source_id
(null)
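The headline figures under Run Details can be cross-checked against the raw metadata listed above. The following is a minimal sketch, not part of the benchmark itself. It assumes that generation_time is reported in milliseconds and that native_tokens_reasoning is counted inside native_tokens_completion; neither is stated on this page. The helper name format_duration and the variable names are hypothetical.

```python
# Values copied from the metadata listing above.
generation_time_ms = 574_614        # assumed to be milliseconds
native_tokens_completion = 22_463
native_tokens_reasoning = 21_830    # assumed to be a subset of the completion tokens
total_cost_usd = 0.0563203


def format_duration(ms: int) -> str:
    """Render a millisecond duration as 'Xm Ys', rounded to the nearest second."""
    total_seconds = round(ms / 1000)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


# Tokens left over for the visible scene once reasoning tokens are subtracted.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Fraction of the completion budget spent on reasoning.
reasoning_share = native_tokens_reasoning / native_tokens_completion

print(format_duration(generation_time_ms))   # 9m 35s
print(visible_tokens)                        # 633
print(f"{reasoning_share:.1%}")              # 97.2%
print(f"${total_cost_usd:.6f}")              # $0.056320
```

Under these assumptions the duration and cost match the values shown under Run Details, and roughly 97% of the completion tokens went to reasoning rather than to the scene itself.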
Evaluation details
Result     Evaluator                      Details                           Meta Data
99.74%     Matches word count             n/a                               neededClean: false, words: 496
99.34%     Dialogue to Total Word Ratio   Ratio: 70.36%, Deviation: 0.36%   neededClean: false, wordsTotal: 496, wordsDialogue: 349
99.5440%   Overall
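The ratio and deviation reported in the evaluation details follow directly from the two word counts. The sketch below reproduces that arithmetic; it is not the benchmark's evaluator, and the variable names are hypothetical. The 500-word and 70% targets come from the scenario. The page does not state how each evaluator turns a deviation into a score, so that step is not reproduced. Treating Overall as the unweighted mean of the two evaluator scores is an assumption inferred from the displayed numbers.

```python
# Counts and scores copied from the evaluation details above.
words_total = 496
words_dialogue = 349
score_word_count = 99.74   # displayed score, already rounded
score_ratio = 99.34        # displayed score, already rounded

# Targets taken from the scenario: 500 words, 70% dialogue.
target_words = 500
target_ratio_pct = 70.0

# Dialogue share of the scene, as a percentage.
ratio_pct = round(100 * words_dialogue / words_total, 2)

# Absolute distance from the 70% target, in percentage points.
ratio_deviation = round(abs(ratio_pct - target_ratio_pct), 2)

# Relative shortfall against the 500-word target, as a percentage.
word_count_deviation = round(100 * abs(words_total - target_words) / target_words, 2)

# ASSUMPTION: Overall is the unweighted mean of the two evaluator scores.
overall_estimate = round((score_word_count + score_ratio) / 2, 2)

print(ratio_pct)             # 70.36
print(ratio_deviation)       # 0.36
print(word_count_deviation)  # 0.8
print(overall_estimate)      # 99.54
```

The estimate of 99.54 agrees with the displayed Overall of 99.5440% to within the rounding of the two input scores, which is consistent with a simple mean but does not confirm it.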