NC Bench
Run: 2026-03-12-01KKHZAAMBMP1HY5WCQDZT4PJT
Run Details
Model: Grok 4.20 (Beta, Reasoning)
Test: Dialogue tags
Scenario: Write unattributed dialogue
Duration: 13.1s
Cost: $0.018038
Provider: xAI
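The Cost figure can be reproduced from the native token counts reported in the response metadata further down (native_tokens_prompt 185, native_tokens_cached 128, native_tokens_completion 2983, of which 2719 are reasoning tokens). The sketch below is only a consistency check, not the provider's billing logic. The per-million-token rates are assumptions: they are not stated anywhere on this page and were chosen because they reproduce the reported total_cost and cache_discount exactly. The function names are hypothetical.

```python
from decimal import Decimal

# Token counts copied from the run metadata on this page.
NATIVE_PROMPT_TOKENS = 185
NATIVE_CACHED_TOKENS = 128
NATIVE_COMPLETION_TOKENS = 2983  # includes the reasoning tokens
NATIVE_REASONING_TOKENS = 2719

# ASSUMPTION: USD rates per million tokens. Not stated in the source;
# picked because they reproduce the reported figures.
INPUT_RATE = Decimal("2.00")
CACHED_INPUT_RATE = Decimal("0.20")
OUTPUT_RATE = Decimal("6.00")
MILLION = Decimal(1_000_000)


def estimate_cost(prompt: int, cached: int, completion: int) -> Decimal:
    """Bill uncached prompt, cached prompt, and completion tokens separately."""
    uncached = prompt - cached
    total = (
        uncached * INPUT_RATE
        + cached * CACHED_INPUT_RATE
        + completion * OUTPUT_RATE
    )
    return total / MILLION


def cache_discount(cached: int) -> Decimal:
    """Savings from billing cached tokens at the cached rate."""
    return cached * (INPUT_RATE - CACHED_INPUT_RATE) / MILLION


cost = estimate_cost(
    NATIVE_PROMPT_TOKENS, NATIVE_CACHED_TOKENS, NATIVE_COMPLETION_TOKENS
)
discount = cache_discount(NATIVE_CACHED_TOKENS)
# Completion tokens that are not reasoning tokens.
visible_tokens = NATIVE_COMPLETION_TOKENS - NATIVE_REASONING_TOKENS
```

Under these assumed rates, `cost` equals the reported total_cost of 0.0180376 and `discount` equals the reported cache_discount of 0.0002304. Most of the cost comes from the 2719 reasoning tokens, which leaves 264 non-reasoning completion tokens.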
Evaluation summary
Score: 100.00%
Evaluator: Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
Messages
Meta
AI
The small town bakery was alive with the scent of fresh pastries and warm bread. Sunlight streamed in through the large windows, casting a golden glow on the checkered tablecloths. Two friends sat at a small table in the corner, sharing a pot of tea and a plate of cookies as they discussed the latest town scandal. "Did you hear what happened to old Mr. Fletcher's prized recipe book?" "No, what? Is it missing?" "Vanished from his kitchen last night. He thinks someone broke in, but nothing else was touched." "That sounds like an inside job. Who would want an old recipe book?" "Perhaps it's not about the recipes. Rumors say there's a map hidden inside the cover, leading to buried treasure from the gold rush days." "You're joking. In our little town? But it would explain why the new neighbor has been asking so many questions about local history." "Exactly. We need to find that book before it falls into the wrong hands." "Let's start by talking to the bakery delivery guy. He was at Mr. Fletcher's yesterday." "Good idea. These lemon bars are delicious, by the way. Want another?" "Yes, please. Nothing like sugar to fuel our detective work." They laughed softly, careful not to draw attention from the other patrons enjoying their morning treats.
created_at: "2026-03-12T21:30:21.2Z" (3/12/2026, 9:30:21 PM)
model: "x-ai/grok-4.20-beta-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 12705
moderation_latency: (null)
generation_time: 12994
tokens_prompt: 81
tokens_completion: 334
native_tokens_prompt: 185
native_tokens_completion: 2983
native_tokens_completion_images: 0
native_tokens_reasoning: 2719
native_tokens_cached: 128
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0180376
router: (null)
provider_responses:
  0:
    endpoint_id: "33f49f24-a0de-4dcf-872d-ca514e83fbae"
    id: "2b877bbb-7b5e-5723-2135-b1822155f0b8"
    is_byok: false
    latency: 357
    model_permaslug: "x-ai/grok-4.20-beta-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773351021-TciKzAjghwXyrTONOyxr"
upstream_id: "2b877bbb-7b5e-5723-2135-b1822155f0b8"
total_cost: 0.0180376
cache_discount: 0.0002304
upstream_inference_cost: 0
provider_name: "xAI"
Evaluation details
Result: 100.00%
Evaluator: Count dialogue tags
Details: n/a
Meta Data:
  neededClean: false
  dialogueTags: (empty)
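The page does not show how the "Count dialogue tags" evaluator works, only that it found no tags in this response. As a rough illustration of what such a check might look like, the sketch below splits the text on double quotes and looks for a speech verb in the narration clause directly next to a quotation. Everything in it is an assumption: the function name `find_dialogue_tags`, the verb list, and the clause heuristic are hypothetical and are not taken from NC Bench.

```python
import re

# ASSUMPTION: a small, hand-picked list of speech verbs. The real
# evaluator's criteria are not shown on this page.
SPEECH_VERB = re.compile(
    r"\b(?:said|asked|replied|answered|whispered|shouted|muttered|exclaimed)\b",
    re.IGNORECASE,
)


def find_dialogue_tags(text: str) -> list[str]:
    """Return narration clauses next to a quotation that contain a speech verb."""
    # Treat curly quotes like straight quotes so one split handles both.
    normalized = text.replace("\u201c", '"').replace("\u201d", '"')
    # With balanced quotes, even indices are narration, odd indices are speech.
    segments = normalized.split('"')
    tags = []
    for i in range(0, len(segments), 2):
        clauses = re.split(r"[.!?]", segments[i])
        candidates = []
        if i > 0:
            # Clause that directly follows a closing quote.
            candidates.append(clauses[0])
        if i < len(segments) - 1 and (len(clauses) > 1 or i == 0):
            # Clause that directly precedes an opening quote.
            candidates.append(clauses[-1])
        for clause in candidates:
            if SPEECH_VERB.search(clause):
                tags.append(clause.strip(" ,"))
    return tags


attributed = '"Hello," said Mary. "How are you?"'
unattributed = '"Did you hear?" "No, what? Is it missing?"'
tags_attributed = find_dialogue_tags(attributed)
tags_unattributed = find_dialogue_tags(unattributed)
```

Here `tags_attributed` holds one entry and `tags_unattributed` is empty, which mirrors the empty dialogueTags value recorded for this run. Speech verbs inside the quoted speech itself are ignored, because only narration segments are scanned.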