NC Bench
Run: 2026-03-12-01KKHZS33HTS4Y49NPCBW35Z55
Run Details
Model:
Grok 4.20 (Beta)
Test:
Dialogue tags
Scenario:
Write unattributed dialogue
Duration:
1.7s
Cost:
$0.001720
Provider:
xAI
Evaluation summary
Score:
100.00%
Evaluator:
Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
Messages
Meta
AI
The bell above the bakery door gave a cheerful jingle as Harriet stepped inside, the scent of warm cinnamon rolls wrapping around her like an old friend. “Marjorie, have you seen the new flyers? Someone’s been tacking them up all over town.” “I saw one this morning, stuck right on the lamppost outside. ‘Justice for Mr. Peabody.’ As if that cranky old cat didn’t live to seventeen on nothing but stolen scones.” Harriet chuckled, sliding onto a stool at the counter. “Seventeen and mean as a hornet. Still, someone poisoned him. That’s not right.” “Poisoned? In Willow Creek? Next you’ll tell me Mrs. Langley’s prize-winning pie was tampered with too.” A warm tray of blueberry muffins slid across the glass case. “Try one before you start accusing half the town. And tell me again why you’re poking around in this at all. You’re retired, Harriet.” “Because Mr. Peabody used to sit on my windowsill every Tuesday, and because nobody else seems to care that a cat was murdered in our sleepy little town.” The baker wiped her hands on her apron, eyes twinkling. “Then I suppose we’d better figure out who hated that cat enough to kill him. Coffee?”
created_at
"2026-03-12T21:38:25.019Z"
3/12/2026, 9:38:25 PM
model
"x-ai/grok-4.20-beta-20260309"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
296
moderation_latency
(null)
generation_time
1615
tokens_prompt
81
tokens_completion
322
native_tokens_prompt
183
native_tokens_completion
264
native_tokens_completion_images
0
native_tokens_reasoning
0
native_tokens_cached
128
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0017196
router
(null)
provider_responses
0
endpoint_id
"33f49f24-a0de-4dcf-872d-ca514e83fbae"
id
"5134e9d7-962c-c9a7-2ca1-22b17ebf8c9c"
is_byok
false
latency
259
model_permaslug
"x-ai/grok-4.20-beta-20260309"
provider_name
"xAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773351505-KDNe2b8LHMC1OPe8UwtW"
upstream_id
"5134e9d7-962c-c9a7-2ca1-22b17ebf8c9c"
total_cost
0.0017196
cache_discount
0.0002304
upstream_inference_cost
0
provider_name
"xAI"
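The cost fields above can be cross-checked against the native token counts. The sketch below is only a consistency check, not the provider's billing logic. It assumes per-million-token rates of $2.00 (input), $0.20 (cached input), and $6.00 (output). These rates are not stated on this page; they are an assumption chosen because they reproduce the reported total_cost and cache_discount. It also assumes that native_tokens_cached counts a subset of native_tokens_prompt.

```python
# Token counts copied from the run metadata.
NATIVE_PROMPT_TOKENS = 183
NATIVE_CACHED_TOKENS = 128
NATIVE_COMPLETION_TOKENS = 264

# ASSUMED rates in USD per million tokens (not given on the page).
ASSUMED_INPUT_RATE = 2.00
ASSUMED_CACHED_INPUT_RATE = 0.20
ASSUMED_OUTPUT_RATE = 6.00


def estimate_cost(prompt_tokens: int, cached_tokens: int, completion_tokens: int):
    """Return (total_cost, cache_discount) in USD under the assumed rates."""
    uncached_tokens = prompt_tokens - cached_tokens
    total_cost = (
        uncached_tokens * ASSUMED_INPUT_RATE
        + cached_tokens * ASSUMED_CACHED_INPUT_RATE
        + completion_tokens * ASSUMED_OUTPUT_RATE
    ) / 1_000_000
    # Savings from billing cached tokens at the cached rate instead of the full input rate.
    cache_discount = (
        cached_tokens * (ASSUMED_INPUT_RATE - ASSUMED_CACHED_INPUT_RATE) / 1_000_000
    )
    return total_cost, cache_discount


total_cost, cache_discount = estimate_cost(
    NATIVE_PROMPT_TOKENS, NATIVE_CACHED_TOKENS, NATIVE_COMPLETION_TOKENS
)
print(f"total_cost ~ {total_cost:.7f}, cache_discount ~ {cache_discount:.7f}")
```

Under these assumed rates the estimate agrees with the reported total_cost (0.0017196) and cache_discount (0.0002304) to within floating-point rounding.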
Evaluation details
Result:
100.00%
Evaluator:
Count dialogue tags
Details:
n/a
Meta Data:
neededClean
false
dialogueTags
(empty)
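The page does not describe how the "Count dialogue tags" evaluator works. As a rough illustration of the idea, the sketch below counts attributions such as `” she asked` or `,” said Marjorie` that directly follow a closing quotation mark. The verb list, the regular expression, and the function name `count_dialogue_tags` are all hypothetical and are not taken from NC Bench.

```python
import re

# Hypothetical verb list; a real evaluator would likely use a broader one.
SPEECH_VERBS = ("said", "asked", "replied", "whispered", "muttered", "shouted", "answered")
_VERBS = "|".join(SPEECH_VERBS)

# A closing quote, whitespace, then either "<1-3 words> <verb>" or "<verb> <word>".
# Opening quotes are skipped because they are followed directly by text, not whitespace.
TAG_AFTER_QUOTE = re.compile(
    rf'[”"]\s+(?:(?:\w+\s+){{1,3}}(?:{_VERBS})\b|(?:{_VERBS})\s+\w+)',
    re.IGNORECASE,
)


def count_dialogue_tags(text: str) -> int:
    """Count speech attributions that immediately follow a closing quote."""
    return len(TAG_AFTER_QUOTE.findall(text))


tagged = "“Coffee?” she asked. “Yes,” said Marjorie."
untagged = "“Coffee?” “Yes, please.” The baker wiped her hands on her apron."
print(count_dialogue_tags(tagged), count_dialogue_tags(untagged))
```

Action beats such as "The baker wiped her hands on her apron" are not counted by this heuristic because they contain no speech verb, which fits an empty dialogueTags result for a scene that identifies speakers only through action. Tags that precede the quotation (for example, `She said, “...”`) are outside what this sketch detects.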