NC Bench
Run: 2026-05-01-01KQHRCCE9KHR9YNE57RVGDT4D
Run Details
Model: Grok 4.3 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 2m 21s
Cost: $0.029260
Provider: xAI
Evaluation summary
Score | Evaluator
99.74% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.8722% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly leaned over the dashboard of their candy-striped van, binoculars glued to her face as she scoped out the chaotic children's birthday party in the community hall across the street. Balloons bobbed in the wind, and the sound of laughter and music floated through the open windows. Bill sat beside her, polishing a pair of fake glasses that were part of his disguise. "Molly, this job gives me the heebie-jeebies more than the bank vault gig we pulled last month," Bill muttered, adjusting his fake mustache. "Stealing from kids feels wrong, even if the presents are from their rich uncles who won't miss them." Molly lowered the binoculars and turned to him with a sly smile. "Wrong? Bill, those presents are loaded with the good stuff – latest gadgets, toys that cost a fortune. Think of it as liberating them from a life of clutter. Besides, we need the cash for our next big score in the city. You in or out?" Bill shifted uncomfortably in his seat. "I'm in, but the plan needs polishing. How are we getting past the front door without looking like creepers? Those bouncy houses are everywhere, and parents are eagle-eyed." "Here's the beauty of it," Molly replied enthusiastically. "We pose as the hired help for the entertainment. You be the balloon sculptor who just happens to start creating a ladder from twisted balloons to reach the 'magic' upper shelf where the big presents are stored. I distract with my ventriloquist dummy that 'talks' the kids into playing a game away from the gifts." Bill chuckled at that. "A ventriloquist dummy? That could work if the kids are young enough to believe the dummy is real. But what if one of them pulls on your strings and reveals our secret walkie-talkies for communication?" Molly waved that concern away. "Easy fix. Pretend it's all part of the show. Say the dummy is malfunctioning and needs calibration, then retreat to bag the loot while everyone watches the show. 
Once we have the bags, we signal each other with the squirting flower codes." "Okay, but there's another snag: the piñata. When those kids start swinging wildly for the candy, we'll be in the crossfire of flying candies and sticks. I nearly lost an eye on a similar job at the carnivals." "Timing is everything, Bill. We wait until the piñata is smashed and everyone is scrambling on the floor for treats. That's our window to slip in, load up, and slip out. The parents will be too busy playing cleanup crew to notice two 'helpers' making off with the extra supplies." Bill nodded slowly, starting to get into the spirit. "And for the escape, we use the van's hidden compartment under the seats for the smaller items, tossing the big ones in the back. If chased, we can always claim it's a joke prank on the birthday kid arranged by the parents themselves." They both chuckled. "It's going to work this time." Molly said. "Stay focused and we'll be out very soon enough."
created_at: "2026-05-01T12:31:55.853Z"
model: "x-ai/grok-4.3-20260430"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 134683
moderation_latency: (null)
generation_time: 140423
tokens_prompt: 77
tokens_completion: 3120
native_tokens_prompt: 184
native_tokens_completion: 11639
native_tokens_completion_images: (null)
native_tokens_reasoning: 11017
native_tokens_cached: 64
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0292603
router: (null)
provider_responses:
  0:
    endpoint_id: "45623cb8-18e5-4b32-8fc4-85d05439982d"
    id: "54f3bbe7-ecb0-c5d2-9e37-5977b6e5b486"
    is_byok: false
    latency: 60
    model_permaslug: "x-ai/grok-4.3-20260430"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777638715-RKKfJCNIZYWOToVRSJ8B"
session_id: (null)
api_type: "completions"
id: "gen-1777638715-EbUd7B6Xi53ZeSIF2g9u"
upstream_id: "54f3bbe7-ecb0-c5d2-9e37-5977b6e5b486"
total_cost: 0.0292603
cache_discount: 0.0000672
upstream_inference_cost: 0
provider_name: "xAI"
response_cache_source_id: (null)
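The token and timing fields above can be turned into a few derived figures. The sketch below is only an illustration and rests on two assumptions that the page does not state: that generation_time is in milliseconds (it roughly matches the 2m 21s duration), and that native_tokens_reasoning is counted inside native_tokens_completion. The variable names are just the field names reused.

```python
# Figures copied from the Meta block of this run.
native_tokens_completion = 11639
native_tokens_reasoning = 11017
generation_time_ms = 140423  # assumption: value is in milliseconds

# Assumption: reasoning tokens are a subset of the completion tokens.
reasoning_share = native_tokens_reasoning / native_tokens_completion
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Overall native-token throughput across the whole generation.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"Reasoning share of completion tokens: {reasoning_share:.2%}")
print(f"Non-reasoning completion tokens: {visible_tokens}")
print(f"Throughput: {tokens_per_second:.1f} native tokens/s")
```

If those assumptions hold, almost all of the completion budget went to reasoning rather than to the 500-word scene itself.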
Evaluation details
Result | Evaluator | Details | Meta Data
99.74% | Matches word count | n/a | neededClean: false, words: 504
0.00% | Dialogue to Total Word Ratio | Ratio: 77.56%, Deviation: 27.56% | neededClean: false, wordsTotal: 508, wordsDialogue: 394
49.8722% | Overall
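The reported ratio and deviation can be reproduced from the word counts. The sketch below is a minimal illustration, not the evaluator's actual code: the helper names are hypothetical, the whitespace tokenizer and the straight-double-quote rule for dialogue are assumptions, and the two evaluators evidently count words differently (504 versus 508). The scoring formula that maps a deviation to a score is not shown on this page, so it is not reconstructed here.

```python
import re

def count_words(text: str) -> int:
    # Assumption: words are whitespace-delimited tokens.
    return len(text.split())

def dialogue_word_count(text: str) -> int:
    # Assumption: dialogue is any span between straight double quotes.
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)

def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = 0.5) -> tuple[float, float]:
    # Deviation is the absolute distance from the 50% dialogue target.
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)

# Counts taken from the Meta Data column of the evaluation details.
ratio, deviation = ratio_and_deviation(394, 508)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# The overall score is consistent with an unweighted mean of the two
# evaluator scores; this is an inference from the numbers shown.
overall = (99.74 + 0.00) / 2
print(f"Overall (from rounded scores): {overall:.2f}%")
```

On a short sample such as `Bill sighed. "We go at noon," he said. "Fine."`, `dialogue_word_count` returns 5 of the 9 words counted by `count_words`.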