NC Bench
Run: 2026-03-12-01KKHZ1DWTRSBJSW2QNBHYJQZZ
Run Details
Model: Grok 4.20 (Beta, Reasoning)
Test: Dialogue tags
Scenario: Write unattributed dialogue
Duration: 14.9s
Cost: $0.020420
Provider: xAI
Evaluation summary
Score: 100.00%
Evaluator: Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
Messages
Meta
AI
The small town bakery was a haven of warmth and sweetness, with the scent of vanilla extract and melting chocolate wafting from the kitchen. Sunbeams illuminated the vintage posters on the walls and the fresh flowers on the tables. Two women stood near the window seat, sipping tea and pondering recent events in their quiet community. "Have you heard the whispers about the librarian's sudden illness?" "Yes, and it's strange because she only ate a slice of lemon cake before it happened." "Lemon cake from this very bakery? That can't be right." "Exactly. I baked it myself and everything was perfectly normal." "Do you suspect foul play? Maybe someone is trying to ruin your reputation." "It seems likely. Especially after the argument she had with the mayor last week." "The mayor? He's been acting odd lately, avoiding everyone." "Perhaps the cake was switched. We should look for witnesses at the library event." "I already asked the boy who delivers our goods. He mentioned seeing a shadowy figure near the back door." "That settles it. This is no accident. Let's gather more information discreetly." "Discreetly, yes. In a town this small, one wrong word and the culprit will disappear." They nodded in unison, the cozy atmosphere of the bakery now buzzing with the underlying current of mystery and intrigue.
created_at
"2026-03-12T21:25:29.643Z"
3/12/2026, 9:25:29 PM
model
"x-ai/grok-4.20-beta-20260309"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
14137
moderation_latency
(null)
generation_time
14753
tokens_prompt
81
tokens_completion
351
native_tokens_prompt
185
native_tokens_completion
3380
native_tokens_completion_images
0
native_tokens_reasoning
3116
native_tokens_cached
128
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0204196
router
(null)
provider_responses
0
endpoint_id
"33f49f24-a0de-4dcf-872d-ca514e83fbae"
id
"60f7108a-ffcd-752e-e5c0-142688f677a8"
is_byok
false
latency
268
model_permaslug
"x-ai/grok-4.20-beta-20260309"
provider_name
"xAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773350729-3D1n7nlOdfPk4qjiV50W"
upstream_id
"60f7108a-ffcd-752e-e5c0-142688f677a8"
total_cost
0.0204196
cache_discount
0.0002304
upstream_inference_cost
0
provider_name
"xAI"
Evaluation details
Result: 100.00%
Evaluator: Count dialogue tags
Details: n/a
Meta Data:
neededClean: false
dialogueTags: (empty)
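The report does not show how the "Count dialogue tags" evaluator works. As a rough illustration only, a check of this kind could be sketched as below. The verb list, the function name find_dialogue_tags, and the matching rule are all assumptions made for this sketch and are not taken from NC Bench.

```python
import re

# Hypothetical attribution verbs; the real evaluator's list is unknown.
TAG_VERBS = ("said", "asked", "replied", "whispered", "answered", "added")
VERB_PATTERN = re.compile(r"\b(" + "|".join(TAG_VERBS) + r")\b", re.IGNORECASE)

# A span of text enclosed in straight double quotes.
QUOTE_PATTERN = re.compile(r'"[^"]*"')


def find_dialogue_tags(text: str) -> list[str]:
    """Return narration segments that contain an attribution verb.

    Assumption: everything outside straight double quotes is narration,
    and any narration segment containing a verb from TAG_VERBS counts
    as a dialogue tag. Curly quotes are not handled.
    """
    tags = []
    # Splitting on quoted spans leaves only the narration between them.
    for segment in QUOTE_PATTERN.split(text):
        if VERB_PATTERN.search(segment):
            tags.append(segment.strip())
    return tags


if __name__ == "__main__":
    sample = '"Have you heard the whispers?" "Yes, and it\'s strange."'
    print(find_dialogue_tags(sample))
```

Under these assumptions, a response with no attribution verbs outside the quotes yields an empty list, which would correspond to the empty dialogueTags field above. A verb list this simple would also flag ordinary narration that happens to use one of the verbs, so a real evaluator would likely need a stricter rule.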