NC Bench
Run: 2026-03-12-01KKHYYVYVD0BXH01RT0SW9S5J
Run Details
Model:
Grok 4.20 (Beta)
Test:
Dialogue tags
Scenario:
Write unattributed dialogue
Duration:
2.4s
Cost:
$0.001504
Provider:
xAI
Evaluation summary
Score | Evaluator | Details Link
100.00% | Count dialogue tags | (details)
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
Messages
Meta
AI
The bell above the bakery door jingled softly as lavender-scented steam curled from the ovens. Warmth wrapped around them like a favorite cardigan.
"These scones have too much lemon today. I swear Ethel changed the recipe just to spite me."
"You say that every Tuesday, and every Tuesday you buy three anyway."
A long pause filled only by the soft clink of a ceramic plate being set down.
"Well, maybe I like knowing what to expect from my enemies. Unlike some people who keep secrets about where they were last Thursday night."
"I was at the library."
"Until midnight? With the lights off?"
Another pause, longer this time. The smell of cinnamon rolls suddenly seemed sharper.
"Have you been following me again?"
"I prefer the term 'casually observing my neighbors.' It's a small town, Margaret. Things get noticed."
"Like how Mr. Hargrove's prize-winning roses all died the same week his wife started wearing new pearl earrings?"
The coffeemaker hissed in the corner. Outside, rain began tapping gently against the windowpanes.
"Another scone?"
"Make it two this time."
created_at
"2026-03-12T21:24:05.727Z"
3/12/2026, 9:24:05 PM
model
"x-ai/grok-4.20-beta-20260309"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
293
moderation_latency
(null)
generation_time
2266
tokens_prompt
81
tokens_completion
291
native_tokens_prompt
183
native_tokens_completion
228
native_tokens_completion_images
0
native_tokens_reasoning
0
native_tokens_cached
128
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0015036
router
(null)
provider_responses
0
endpoint_id
"33f49f24-a0de-4dcf-872d-ca514e83fbae"
id
"d64d4597-2a63-d510-a9ec-195e085909ca"
is_byok
false
latency
238
model_permaslug
"x-ai/grok-4.20-beta-20260309"
provider_name
"xAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773350645-sBk8zlbxrEJnb70MgtQ3"
upstream_id
"d64d4597-2a63-d510-a9ec-195e085909ca"
total_cost
0.0015036
cache_discount
0.0002304
upstream_inference_cost
0
provider_name
"xAI"
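The cost fields above can be cross-checked against the native token counts. The sketch below is only an illustration: the per-million-token rates are assumptions (the page does not state any pricing) chosen because they reproduce the reported `total_cost` and `cache_discount`, and the function names are hypothetical.

```python
# Assumed USD rates per one million tokens. These are NOT given on the page;
# they are illustrative values that happen to reproduce the reported figures.
PROMPT_RATE = 2.00       # uncached prompt tokens (assumption)
CACHED_RATE = 0.20       # cached prompt tokens (assumption)
COMPLETION_RATE = 6.00   # completion tokens (assumption)


def estimate_cost(native_prompt: int, native_cached: int, native_completion: int) -> float:
    """Estimate the run cost in USD from native token counts."""
    uncached = native_prompt - native_cached
    micro_usd = (
        uncached * PROMPT_RATE
        + native_cached * CACHED_RATE
        + native_completion * COMPLETION_RATE
    )
    return micro_usd / 1_000_000


def cache_discount(native_cached: int) -> float:
    """Savings from cached tokens being billed at the lower cached rate."""
    return native_cached * (PROMPT_RATE - CACHED_RATE) / 1_000_000


# Values taken from the metadata above:
# native_tokens_prompt=183, native_tokens_cached=128, native_tokens_completion=228
print(f"{estimate_cost(183, 128, 228):.7f}")
print(f"{cache_discount(128):.7f}")
```

Under these assumed rates, 55 uncached prompt tokens, 128 cached prompt tokens, and 228 completion tokens add up to the reported 0.0015036, and the cached tokens account for the reported 0.0002304 discount.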
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Count dialogue tags | n/a | neededClean: false; dialogueTags: (empty)
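The page does not show how the "Count dialogue tags" evaluator works. As a rough illustration of the idea only, a dialogue-tag check could be sketched with regular expressions like the ones below. The verb list, the patterns, and the function name `find_dialogue_tags` are all assumptions, not the benchmark's implementation, and the heuristic will miss many real-world tag forms.

```python
import re

# Hypothetical, deliberately short list of speech verbs (assumption).
SPEECH_VERBS = ("said", "asked", "replied", "whispered", "muttered", "answered", "shouted")
_VERBS = "|".join(SPEECH_VERBS)

# Tag directly after a closing quote: `" she asked` or `" said Margaret`.
AFTER_QUOTE = re.compile(
    rf'["\u201d]\s+(?:(?:[A-Z][a-z]+|he|she|they|I)\s+(?:{_VERBS})'
    rf'|(?:{_VERBS})\s+[A-Z][a-z]+)\b'
)

# Tag directly before an opening quote: `Margaret said, "`.
BEFORE_QUOTE = re.compile(
    rf'\b(?:[A-Z][a-z]+|[Hh]e|[Ss]he|[Tt]hey|I)\s+(?:{_VERBS}),?\s+["\u201c]'
)


def find_dialogue_tags(text: str) -> list[str]:
    """Return the substrings that look like speaker attributions."""
    return [m.group(0) for p in (AFTER_QUOTE, BEFORE_QUOTE) for m in p.finditer(text)]


untagged = '"I was at the library." "Until midnight? With the lights off?" Another pause, longer this time.'
tagged = '"Another scone?" she asked. "Make it two," said Margaret.'

print(find_dialogue_tags(untagged))     # no attributions in the excerpt from this run
print(len(find_dialogue_tags(tagged)))  # two attributions in the made-up counterexample
```

An empty result for a response would correspond to the `dialogueTags: (empty)` outcome reported above; how the real evaluator turns its findings into a percentage score is not documented on this page.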