NC Bench
Run: 2026-05-10-01KR8XDT86NZ2YJRKC4N66TSPV
Run Details
Model: Grok 4.3
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 13.5s
Cost: $0.001885
Provider: xAI
Evaluation summary
Score      Evaluator
86.38%     Matches word count
0.00%      Dialogue to Total Word Ratio
43.1909%   Overall
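The overall figure appears consistent with an unweighted mean of the two evaluator scores. The sketch below checks that arithmetic. The function name `overall_score` and the averaging rule are assumptions inferred from the numbers on this page, not a documented NC Bench formula.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores, in percent (assumed aggregation rule)."""
    return sum(scores) / len(scores)


# Scores as displayed in the summary: word count, then dialogue ratio.
displayed = [86.38, 0.00]
print(round(overall_score(displayed), 2))
```

With the displayed, rounded scores this gives 43.19. The page shows 43.1909%, which would correspond to an unrounded word-count score of about 86.3818% under the same assumption.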
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Bill elbowed the microwave. “Molly, for the love of shiny things, this job’s a joke. We’re not thieves—we’re deranged elves on a sugar bender. Stealing from a birthday shindig? That’s robbery with extra sprinkles.” Molly set down her clipboard and tapped it like a metronome. “Bill, listen. The place has a bouncy castle. A bouncy castle. That means two dozen distracted parents, zero security cameras, and exactly forty-three present-shaped distractions all wrapped in Zoo Friends wrapping paper. We go in dressed as balloon animals and we leave richer than Santa.” “Balloon animals?” Bill snorted. “I’m six-feet-four. The last balloon animal I tried popped and took my dignity with it.” “Wear the giraffe suit. Tall neck hides the crowbar. I’ll be the unicorn—my horn already doubles as a pick. We hit the gift table during the cake smash. Parents’ll be too busy crying over smashed frosting to notice us swapping loot for decoys.” Bill rubbed his temples. “Decoys of what? Empty boxes? We’re gonna leave the kids with disappointment and one very confused piñata?” “Exactly. Disappointment is part of growing up. Consider it tough love.” Molly flicked a page. “Exit strategy: the chocolate fountain. We dunk the presents in a wheelbarrow we painted like a dinosaur, wheel it straight out the back gate while the DJ plays the conga. No one questions a dancing dinosaur full of mystery bubbles.” “And the alarms?” “Party alarms? It’s a smoke machine and someone’s aunt singing ‘Happy Birthday.’ We’ll set off the fog, slip out in the haze. By the time anyone notices the presents have vanished, we’re two tables away at Chili’s celebrating.” Bill paced. “What if the kids mob us? Tiny hands are sticky and relentless. One kid latches on, the whole herd follows. We could get trampled by Velcro sneakers.” Molly grinned. “That’s why I bought the industrial-sized bubble machine. Release bubbles, kids chase bubbles, we chase presents. Classic misdirection. 
Plus bubbles photograph terribly—parents won’t even post evidence.” Bill stopped pacing. “You’ve thought of everything except the moral weight of traumatizing thirty preschoolers on their special day.” “Bill, they’re already destined for therapy. I’m just giving them a story to tell: ‘Remember the dinosaur that ate the presents?’ It’ll make them stronger. Think of us as character builders.” He stared at her. “Character builders with felony charges.” “Only if we get caught. We won’t. The plan is flawless—until one of us trips over a pool noodle.” Molly raised an eyebrow. “Just don’t trip. The giraffe suit has Velcro knees; you’ll bounce right back into the bouncy castle and no one will suspect a thing.” Bill sighed, defeated. “Fine. Unicorn and giraffe it is. But if a four-year-old bites my ankle, I’m blaming you and keeping the toy trucks for therapy bills.” “Deal,” Molly said, already sketching the wheelbarrow dinosaur’s nostrils. “Now help me paint the claws. We want this thing fierce, not festive.” They both stared at the half-finished mural, the faint scent of spray paint curling like birthday candle smoke, already tasting the loot and the inevitable regret.
created_at: "2026-05-10T12:22:37.585Z"
model: "x-ai/grok-4.3-20260430"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1003
moderation_latency: (null)
generation_time: 13408
tokens_prompt: 77
tokens_completion: 869
native_tokens_prompt: 182
native_tokens_completion: 690
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 64
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
service_tier: "default"
usage: 0.0018853
router: (null)
provider_responses:
  0:
    endpoint_id: "45623cb8-18e5-4b32-8fc4-85d05439982d"
    id: "d135cbb5-b717-9818-b1f3-61411b46e071"
    is_byok: false
    latency: 110
    model_permaslug: "x-ai/grok-4.3-20260430"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778415757-yUD7mJSLvDCfjbEYzGuc"
session_id: (null)
api_type: "completions"
id: "gen-1778415757-AMWBkA2O84Yr9EkAS09G"
upstream_id: "d135cbb5-b717-9818-b1f3-61411b46e071"
total_cost: 0.0018853
cache_discount: 0.0000672
upstream_inference_cost: 0
provider_name: "xAI"
response_cache_source_id: (null)
Evaluation details
Result: 86.38%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 511

Result: 0.00%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 84.97%, Deviation: 14.97%
Meta Data: neededClean: false, wordsTotal: 519, wordsDialogue: 441

Overall: 43.1909%
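The reported ratio and deviation can be reproduced from the two word counts and the 70% target in the scenario. The sketch below shows that arithmetic, plus one hypothetical way to count dialogue words. The helper names, the whitespace-based word definition, and the use of curly double quotes as the dialogue marker are all assumptions. The evaluator's actual counting rules are not shown on this page, and the two evaluators report different totals (511 and 519), so they may not share a tokenizer.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: "Write 500 words with 70% dialogue"


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of a word)."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside curly double quotes (an assumed dialogue marker)."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_RATIO):
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts taken from the Meta Data above: wordsDialogue and wordsTotal.
ratio, deviation = ratio_and_deviation(441, 519)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 84.97%, Deviation: 14.97%"
```

This reproduces only the ratio and deviation. How the evaluator maps a 14.97-point deviation to a 0.00% score is not stated on the page, so that step is left out.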