NC Bench
Run: 2026-05-10-01KR8Z3XYSZ06418S9T3CDZ2BW
Run Details
Model: Grok 4.3
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 6.8s
Cost: $0.001570
Provider: xAI
Evaluation summary
Score    Evaluator
0.00%    Matches word count
6.41%    Dialogue to Total Word Ratio
3.2069%  Overall
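The Overall figure appears to be the plain average of the two evaluator scores. This is an inference from the numbers on this page, not a documented rule. A minimal Python sketch under that assumption (the `scores` dictionary is a hypothetical container for the displayed values):

```python
from statistics import mean

# Evaluator scores as displayed on this page, in percent.
# Assumption: "Overall" is the unweighted mean of these scores.
scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 6.41,
}

overall = mean(scores.values())
print(f"Overall: {overall:.4f}%")  # prints "Overall: 3.2050%"
```

The result differs slightly from the displayed 3.2069% because 6.41% is itself a rounded value. An unrounded score near 6.4138% would reproduce the displayed figure.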
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly pulled Bill aside behind the birthday clown's storage truck. "Bill, you got the layout yet?" she hissed, eyes darting across the inflatables. Bill nodded, sliding a crumpled map between two cake boxes. "Back gate’s open till six. Pinata goes up at five thirty. We hit it right when the parents do the group photo—every kid facing the camera, all eyes off the gift table." Molly tapped the paper. "Fine, but we need less than ninety seconds. You pull the sheet under the table, I load the bag. If one kid turns around, we abort. I’m not wrestling a kindergarten class over iPads." Bill smirked. "You’re still sore about last year’s juggling fiasco, huh? Told you, never trust trained poodles." Molly glared. "Focus. The wrapping paper disguises the bag handles. You remember the code phrase if security shows?" Bill straightened his bow tie. "Yeah—'Balloon animals smell suspicious today.' If I say it twice, you vanish with the loot and I loop the playground until they forget me." Molly glanced at her watch. "Timer starts when the magician fumbles the first card trick. Parents film everything then. You cut the string on the pony ride banner, I sweep the east stack." Bill chuckled. "The pony? Really? Those kids’ll scream for hours." Molly grinned despite herself. "Exactly. Chaos buys us seconds. You got the exit vehicle prepped?" Bill jerked his thumb toward the parking lot. "Ice-cream truck’s unlocked, keys in the visor. Nobody checks behind the sprinkles bin." Molly exhaled. "And if a toddler bursts into tears when we grab their dinosaur wrapping—" Bill shrugged. "We toss him a spare balloon. Instant distraction. Kids fall for that every time." Molly rolled her eyes. "You sound way too practiced at this. Last time you handed out bubbles and the parents started filming you like a hero." Bill’s grin widened. "That’s called market research. Besides, the real trick is not looking guilty. Smile like we belong." Molly checked the sparkler fuses again. 
"Belong long enough to reach the back door. After that it’s pure sprint. You locked the van doors?" Bill nodded. "Triple check. Last thing we need is some four-year-old joyriding our getaway ride." Molly sighed, checking the time once more. "Alright. On my signal—three whistles—go. And Bill?" She paused, lowering her voice. "If anything feels off, we walk. Presents aren’t worth orange jumpsuits. You copy?" Bill raised two fingers in salute. "Loud and clear, partner. Now stop worrying—we’re not stealing from banks, just future spoiled adults. Piece of birthday cake."
created_at
"2026-05-10T12:52:10.848Z"
5/10/2026, 12:52:10 PM
model
"x-ai/grok-4.3-20260430"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
544
moderation_latency
(null)
generation_time
6718
tokens_prompt
77
tokens_completion
649
native_tokens_prompt
182
native_tokens_completion
564
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
64
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
service_tier
"default"
usage
0.0015703
router
(null)
provider_responses
0
endpoint_id
"45623cb8-18e5-4b32-8fc4-85d05439982d"
id
"de3028a9-40fc-9d4e-bd8f-1469b995cb3a"
is_byok
false
latency
141
model_permaslug
"x-ai/grok-4.3-20260430"
provider_name
"xAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778417530-quxNPFky1mQcxY2ZJXmM"
session_id
(null)
api_type
"completions"
id
"gen-1778417530-dMZ3802IvOVrDkBjvsoy"
upstream_id
"de3028a9-40fc-9d4e-bd8f-1469b995cb3a"
total_cost
0.0015703
cache_discount
0.0000672
upstream_inference_cost
0
provider_name
"xAI"
response_cache_source_id
(null)
Evaluation details
Result   Evaluator                     Details                          Meta Data
0.00%    Matches word count            n/a                              neededClean: false, words: 422
6.41%    Dialogue to Total Word Ratio  Ratio: 77.41%, Deviation: 7.41%  neededClean: false, wordsTotal: 425, wordsDialogue: 329
3.2069%  Overall
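The ratio and deviation reported for the dialogue evaluator follow directly from its word counts. The sketch below assumes the target is the 70% named in the scenario and that deviation is the absolute difference from that target. The function names are hypothetical, and the quote-based counter is only a naive illustration: the evaluator's actual tokenization is not documented here. The page does not show how the deviation is converted into the 6.41% score, so that step is not reproduced.

```python
import re

TARGET_RATIO = 0.70  # assumption: the "70% dialogue" target from the scenario


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words) using a naive quote-based split.

    Hypothetical helper, not the benchmark's real counter.
    """
    # Capture text between straight or curly double quotes.
    quoted = re.findall(r'["\u201c]([^"\u201d]*)["\u201d]', text)
    dialogue = sum(len(span.split()) for span in quoted)
    total = len(text.split())
    return dialogue, total


def ratio_and_deviation(words_dialogue: int, words_total: int) -> tuple[float, float]:
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - TARGET_RATIO)


# Word counts taken from the evaluator metadata on this page.
ratio, deviation = ratio_and_deviation(329, 425)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 77.41%, Deviation: 7.41%"

sample = 'Molly glared. "Focus." Bill shrugged.'
print(count_dialogue_words(sample))  # prints (1, 5)
```

The two evaluators report different totals (422 and 425 words) for the same response, which suggests they count words differently. A naive counter like this one should not be expected to match either exactly.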