NC Bench
Run: 2026-05-10-01KR8Z5TWHVWFW7YSAY3PCDW8J
Run Details
Model: Grok 4.3
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 5.7s
Cost: $0.001740
Provider: xAI
Evaluation summary
Score | Evaluator
100.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.9995% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
As the drizzle pattered against the van’s windshield, Molly squinted at the crumpled party map in her lap. “Tonight’s the big one, Bill,” she muttered, her eyes flicking between parked cars and the distant glow of fairy lights on the front lawn. Bill chuckled from the driver’s seat without taking his foot off the brake, the engine still idling while they both watched the house through cracked sunglasses. “Think of it, Mol,” he said softly. “All those gift bags lined up under the piñata like unattended treasure chests. No alarms, no lasers, just some balloons and confused parents refilling punch.” Molly shook her head, tapping a rhythm on the steering wheel that wasn’t hers. “We’re not here for candy. We take the big ones first—the ones wrapped in silver paper—then hit the PlayStation box before the magician shows up. Got the sacks ready?” Bill nodded toward the backseat where two pillowcases already reeked of stale popcorn. “Checked them twice. One for the haul, one for distractions.” He paused and grinned sideways at her. “You sure you want to wear the clown wig again? Last time the balloon animal kid tried to adopt you.” Molly snorted but didn’t answer, instead tracing a route on the map with her finger. “Front door stays open because some genius mom thinks everyone’s her best friend. We slip around the side, grab three bags each, and be back before anyone notices the cake’s missing a slice. You create the diversion with the fake lost phone call. I do the grab. Simple.” “Simple,” Bill echoed, but his fingers drummed the wheel faster. “And if the guard dog decides the clown looks tasty?” Molly shrugged. “Then you owe me a new wig and an alibi. Either way, we still get the loot.” She folded the map with a crisp snap and tucked it into her jacket. Outside, children’s laughter floated through the rain, sounding innocent and expensive at once. Bill exhaled, finally killing the engine. “Timer starts when the first balloon pops,” he said. “We move on the second pop. 
Count them. No heroics.” Molly’s grin was wide and brief. “Heroics are for people without pillowcases. We’re professionals.” They sat another full minute watching the driveway. A parent rushed by with a tray of cupcakes, oblivious. Then Molly reached back, grabbed one sack, and handed the other to Bill without looking at him. “Last chance to back out,” she offered, voice low. “Stealing toys from tiny people feels like kicking a pinata that already paid admission.” Bill slipped the sack onto his shoulder anyway. “Only difference is the pinata fights back,” he answered. “These presents won’t. Doors open in three minutes. You ready?” Molly checked the small digital watch on her wrist, already counting down to the first balloon. “Born ready,” she said. “Just don’t laugh when I trip over the bouncy castle wires again.” Bill started the engine once more, waiting for the perfect pop. The wait felt shorter than it should have.
created_at: "2026-05-10T12:53:13.242Z" (5/10/2026, 12:53:13 PM)
model: "x-ai/grok-4.3-20260430"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 390
moderation_latency: (null)
generation_time: 5531
tokens_prompt: 77
tokens_completion: 778
native_tokens_prompt: 182
native_tokens_completion: 632
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 64
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
service_tier: "default"
usage: 0.0017403
router: (null)
provider_responses:
  0:
    endpoint_id: "45623cb8-18e5-4b32-8fc4-85d05439982d"
    id: "ff87e938-a7d0-9055-a722-1d333b171bdd"
    is_byok: false
    latency: 152
    model_permaslug: "x-ai/grok-4.3-20260430"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778417593-bg7xmLgTgXZcrGeZRsOY"
session_id: (null)
api_type: "completions"
id: "gen-1778417593-379qod6bWtOuYpAEHITp"
upstream_id: "ff87e938-a7d0-9055-a722-1d333b171bdd"
total_cost: 0.0017403
cache_discount: 0.0000672
upstream_inference_cost: 0
provider_name: "xAI"
response_cache_source_id: (null)
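The timing and token fields in the metadata can be combined into a rough throughput figure. The sketch below is illustrative only: it assumes generation_time is reported in milliseconds (the page does not state the unit) and uses native_tokens_completion rather than tokens_completion, which is a choice made here, not something the report prescribes. The helper name tokens_per_second is hypothetical.

```python
# Values copied from the run metadata above.
generation_time_ms = 5531        # ASSUMPTION: unit is milliseconds
native_tokens_completion = 632   # provider-native completion token count


def tokens_per_second(tokens: int, elapsed_ms: float) -> float:
    """Average completion throughput over the generation window."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return tokens / (elapsed_ms / 1000.0)


throughput = tokens_per_second(native_tokens_completion, generation_time_ms)
print(f"{throughput:.1f} tokens/s")  # prints "114.3 tokens/s"
```

Using tokens_completion (778) instead would give a higher figure, so any comparison across runs should fix one of the two counts.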
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false; words: 499
0.00% | Dialogue to Total Word Ratio | Ratio: 49.30%, Deviation: 19.30% | neededClean: false; wordsTotal: 499; wordsDialogue: 246
49.9995% | Overall
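The reported ratio and deviation follow directly from the word counts: 246 dialogue words out of 499 total, measured against the 30% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. It is not the benchmark's evaluator: the function names, the whitespace-based definition of a word, and the rule that dialogue is whatever sits inside curly double quotes are all assumptions made for illustration. How a 19.30% deviation maps to a 0.00% score is not stated in the report and is not modeled here.

```python
import re

TARGET_RATIO = 0.30  # from the scenario: 30% dialogue


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (ASSUMED definition of a word)."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside curly double quotes (ASSUMED dialogue rule)."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_RATIO) -> tuple[float, float]:
    """Return the dialogue share and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Reproduce the figures from the evaluation details.
ratio, deviation = ratio_and_deviation(246, 499)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 49.30%, Deviation: 19.30%"

# Small usage example on a made-up sentence pair.
sample = "“Ready?” Molly asked. “Born ready,” Bill said."
print(dialogue_words(sample), count_words(sample))  # prints "3 7"
```

For a 500-word scene, the 30% target corresponds to roughly 150 words of dialogue, so this response carries about 96 more dialogue words than requested.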