NC Bench
Run: 2026-03-12-01KKHZ4MKE5SN5S5HV172JB5S3
Run Details
Model: Grok 4.20 (Beta)
Test: Dialogue tags
Scenario: Write unattributed dialogue
Duration: 2.2s
Cost: $0.001516
Provider: xAI
Evaluation summary
Score: 100.00%
Evaluator: Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
AI
The bakery smelled of warm cinnamon and fresh bread, sunlight spilling through lace curtains onto the worn oak counter. "I still can't believe she found that note tucked in the flour sack. Who even hides clues in a bakery?" "Mrs. Hargrove always did have a flair for the dramatic. Remember her prize-winning pie at the fair last summer? She practically accused the judges of sabotage." "True, but this feels different. That handwriting looked shaky, like someone was in a hurry. Or scared." "Scared in Willow Creek? The biggest crime we've had was Mr. Peterson's missing cat. Though come to think of it, that cat did have a rather suspicious expression." "You're not suggesting Mr. Whiskers is involved in a conspiracy, are you?" "Stranger things have happened between these flour-dusted walls. Pass me that danish, will you? Thinking requires proper sugar levels." The bell above the door tinkled softly as a customer entered, but neither woman turned. They stood shoulder to shoulder, two pairs of eyes fixed on the crumpled paper between them, the sweet scent of vanilla suddenly feeling just a touch too cloying.
Meta
created_at: "2026-03-12T21:27:14.804Z" (3/12/2026, 9:27:14 PM)
model: "x-ai/grok-4.20-beta-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 395
moderation_latency: (null)
generation_time: 2043
tokens_prompt: 81
tokens_completion: 292
native_tokens_prompt: 183
native_tokens_completion: 230
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 128
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0015156
router: (null)
provider_responses:
  0:
    endpoint_id: "33f49f24-a0de-4dcf-872d-ca514e83fbae"
    id: "fd622743-fdb8-da1d-44b4-be11e4ad8118"
    is_byok: false
    latency: 359
    model_permaslug: "x-ai/grok-4.20-beta-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773350834-uwt4lv6pPyVODmn7l1ek"
upstream_id: "fd622743-fdb8-da1d-44b4-be11e4ad8118"
total_cost: 0.0015156
cache_discount: 0.0002304
upstream_inference_cost: 0
provider_name: "xAI"
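The metadata above can be turned into a couple of derived figures. The following is a minimal sketch, not part of the benchmark itself: the function name `derived_metrics` is hypothetical, and it assumes that `generation_time` is reported in milliseconds (which is consistent with the 2.2s duration shown in the run details, but is not stated on the page).

```python
# Values copied from the metadata fields of this run.
meta = {
    "generation_time": 2043,          # assumed to be milliseconds
    "native_tokens_prompt": 183,
    "native_tokens_completion": 230,
    "native_tokens_cached": 128,
}


def derived_metrics(meta: dict) -> dict:
    """Compute throughput and cache-hit share from run metadata (illustrative helper)."""
    seconds = meta["generation_time"] / 1000  # assumption: the field is in ms
    return {
        # Completion tokens produced per second of generation time.
        "completion_tokens_per_second": meta["native_tokens_completion"] / seconds,
        # Fraction of native prompt tokens that were served from cache.
        "cached_prompt_share": meta["native_tokens_cached"] / meta["native_tokens_prompt"],
    }


if __name__ == "__main__":
    for name, value in derived_metrics(meta).items():
        print(f"{name}: {value:.3f}")
```

Under the millisecond assumption, this run works out to roughly 112.6 completion tokens per second, with about 70% of the native prompt tokens served from cache.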
Evaluation details
Result: 100.00%
Evaluator: Count dialogue tags
Details: n/a
Meta Data:
neededClean: false
dialogueTags: (empty)
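For readers unfamiliar with what counting dialogue tags involves, here is a rough sketch of one possible heuristic. It is not the NC Bench evaluator, whose implementation is not shown on this page. The function name `count_dialogue_tags`, the verb list, and the regular expressions are all assumptions made for illustration; the sketch only looks for a speech verb directly after a closing quote or directly before an opening quote.

```python
import re

# Hypothetical verb list; a real evaluator may use a much larger one.
SPEECH_VERBS = ("said", "asked", "replied", "answered", "whispered", "muttered", "shouted")
_VERBS = "|".join(SPEECH_VERBS)

# Tag after a closing quote, e.g.  "Hello," she said   or   "Why?" asked Mary
TAG_AFTER_QUOTE = re.compile(
    rf'[,?!.]"\s+(?:[\w.]+\s+){{0,3}}(?P<verb>{_VERBS})\b', re.IGNORECASE
)
# Tag before an opening quote, e.g.  She said, "Hello."
TAG_BEFORE_QUOTE = re.compile(rf'\b(?P<verb>{_VERBS}),?\s+"', re.IGNORECASE)


def count_dialogue_tags(text: str) -> int:
    """Count likely attributions such as 'she said' next to quoted speech."""
    # Normalise curly quotes so both patterns only need to handle straight ones.
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    verb_positions = set()
    for pattern in (TAG_AFTER_QUOTE, TAG_BEFORE_QUOTE):
        for match in pattern.finditer(text):
            # Key on the verb's position so a tag sitting between two quotes
            # is counted once, not once per pattern.
            verb_positions.add(match.start("verb"))
    return len(verb_positions)


if __name__ == "__main__":
    print(count_dialogue_tags('"Pass me that danish, will you?" "Of course."'))
    print(count_dialogue_tags('"Hi," she said, "there."'))
```

A heuristic like this can produce false positives, for example when narration that happens to contain a speech verb follows a closing quote, so it should be read as an illustration of the idea rather than a scoring rule.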