NC Bench
Run: 2026-05-10-01KR8VM8R520EYDCT7JAG3QSFR
Run Details
Model: Grok 4.3
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 6.1s
Cost: $0.001280
Provider: xAI
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.00%
Overall: 0.0000%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Did you hear that, Molly? They’re setting up the bouncy castle right outside the back door," Bill whispered, peeking through the hedge. "You're sure about this, Bill? It’s a bunch of five-year-olds hyped up on sugar," Molly hissed back, crouched beside him. "Exactly! The grown-ups will be chasing screaming kids while we slip inside. The presents are all stacked in the spare bedroom. Easy in, easy out." "Easy? What if one of those little monsters catches us with the loot?" "They’ll think it’s part of a clown act. We bring balloons, remember?" "Balloons? That’s your grand plan?" "Keeps it thematic. Every kid loves a balloon. We grab the presents, tie them to the strings, disappear." "Or we get tackled by tiny wrestlers in party hats. Again." "That was one time, Molly, and the hat barely fit me." "Still hurt my pride. Right, so we wait for the cake-cutting hour. Lights low, everyone distracted. You boost me through the window." "Got the crowbar?" "Inside the donut box. Looks innocent enough, just another sugar delivery." "You thinking ahead, I like it. But what if the dog starts barking?" "Mrs. Henshaw doesn’t allow pets. She says they shed on the cake. We’re clear." "Unless one of the dads is secretly a security guard moonlighting as party magician. Those guys have eagle eyes." "Then you distract him with your unbeatable balloon-animal skills. Make him a poodle. He’ll forget everything else." "You overestimate my artistry, Mol. Last time I tried wings on a snake." "Whatever. I just want to be out before they sing Happy Birthday. That song gives me hives." "We’ll hit the window at exactly 3:10. Cake’s sliced, chaos peaks, we’re golden." "And if the loot is heavier than we planned?" "Improvised sled. We ride the presents down the drive on the wrapping paper." "You’ve been watching too many cartoons." "Research, Molly. Pure research."
created_at
"2026-05-10T11:51:11.888Z"
5/10/2026, 11:51:11 AM
model
"x-ai/grok-4.3-20260430"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
585
moderation_latency
(null)
generation_time
5946
tokens_prompt
77
tokens_completion
519
native_tokens_prompt
182
native_tokens_completion
448
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
64
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
service_tier
"default"
usage
0.0012803
router
(null)
provider_responses
0
endpoint_id
"45623cb8-18e5-4b32-8fc4-85d05439982d"
id
"924f16bf-7f22-053f-b83b-d7a7eb927162"
is_byok
false
latency
108
model_permaslug
"x-ai/grok-4.3-20260430"
provider_name
"xAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778413871-uW0AvciCTNwDzZ6Pmo5R"
session_id
(null)
api_type
"completions"
id
"gen-1778413871-UJmKFVZOCzDBg1sEJigh"
upstream_id
"924f16bf-7f22-053f-b83b-d7a7eb927162"
total_cost
0.0012803
cache_discount
0.0000672
upstream_inference_cost
0
provider_name
"xAI"
response_cache_source_id
(null)
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false, words: 311

Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 96.20%, Deviation: 26.20%
Meta Data: neededClean: false, wordsTotal: 316, wordsDialogue: 304

Overall: 0.0000%
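
The reported ratio and deviation can be reproduced from the wordsDialogue and wordsTotal values above. The following is a minimal sketch, not the benchmark's actual evaluator: the function names `dialogue_ratio` and `deviation` are hypothetical, the whitespace tokenization and the rule that dialogue is whatever sits between straight double quotes are assumptions, and the 0.70 target is taken from the scenario text. Because the real tokenizer is unknown, running `dialogue_ratio` on the response may give slightly different counts than 304 and 316.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumption: words are whitespace-separated tokens, and dialogue is
    any text enclosed in straight double quotes.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation(ratio: float, target: float = 0.70) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Reproduce the reported figures from the metadata in this run.
reported_ratio = 304 / 316               # wordsDialogue / wordsTotal
print(f"{reported_ratio:.2%}")           # 96.20%
print(f"{deviation(reported_ratio):.2%}")  # 26.20%

# Word count relative to the 500-word request (words = 311).
print(f"{311 / 500:.2%}")                # 62.20%
```

The response overshoots the 70% dialogue target by about 26 percentage points and reaches roughly 62% of the requested length, which is consistent with both evaluators reporting 0.00%. How each evaluator maps a deviation to a score is not shown in this run.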