NC Bench
Run: 2026-03-12-01KKHZ31BAWFETCZ0FB4KA2WJW
Run Details
Model:
Grok 4.20 (Beta, Reasoning)
Test:
Dialogue tags
Scenario:
Write unattributed dialogue
Duration:
18.6s
Cost:
$0.024908
Provider:
xAI
Evaluation summary
Score:
100.00%
Evaluator:
Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
AI
The aroma of freshly baked bread and sweet pastries filled the air in the small town bakery. Morning light streamed through the large windows, illuminating the cozy interior with its wooden tables and colorful cushions. Two women stood behind the counter, one kneading dough while the other polished the glass display case. "Have you seen the sheriff this morning?" "He came in early, asking about yesterday's customers." "All of them? That must have taken a while." "Particularly interested in who bought the chocolate eclairs." "The ones Mr. Higgins ate before he collapsed?" "Yes, the very same. The doctor says it wasn't natural causes." "So it's true then, it was murder." "Looks like it. And the poison was in the pastry." "But who would do such a thing in our quiet town?" "Someone with a grudge against the old man. He knew everyone's secrets." "Like the affair between the teacher and the mechanic." "Or the missing town funds." "We need to figure this out before more people get hurt." "Or before the bakery gets a bad name." "Exactly. Let's start by making a list of everyone who was here that day." "And see who had the opportunity to tamper with the food." "While we enjoy a cup of tea ourselves. Solving mysteries is thirsty work."
created_at
"2026-03-12T21:26:22.319Z"
model
"x-ai/grok-4.20-beta-20260309"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
17879
moderation_latency
(null)
generation_time
18503
tokens_prompt
81
tokens_completion
341
native_tokens_prompt
185
native_tokens_completion
4128
native_tokens_completion_images
0
native_tokens_reasoning
3869
native_tokens_cached
128
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0249076
router
(null)
provider_responses
0
endpoint_id
"33f49f24-a0de-4dcf-872d-ca514e83fbae"
id
"a0327e1c-e483-eab7-05f9-f7ed6a1bd2d5"
is_byok
false
latency
429
model_permaslug
"x-ai/grok-4.20-beta-20260309"
provider_name
"xAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773350782-1opP8T54TbsaB7hgENdz"
upstream_id
"a0327e1c-e483-eab7-05f9-f7ed6a1bd2d5"
total_cost
0.0249076
cache_discount
0.0002304
upstream_inference_cost
0
provider_name
"xAI"
Evaluation details
Result:
100.00%
Evaluator:
Count dialogue tags
Details:
n/a
Meta Data:
neededClean
false
dialogueTags
(empty)
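The page does not show how the "Count dialogue tags" evaluator is implemented. As a rough illustration only, a check of this kind might look for speech verbs in the narration that sits between quoted passages. The verb list, the function names, and the quote-splitting heuristic below are all assumptions, not the evaluator's actual logic.

```python
import re

# Hypothetical verb list; a real evaluator may use a different or larger set.
SPEECH_VERBS = ("said", "asked", "replied", "answered", "whispered",
                "shouted", "muttered", "exclaimed")
_VERB_PATTERN = re.compile(r"\b(?:" + "|".join(SPEECH_VERBS) + r")\b",
                           re.IGNORECASE)


def narration_segments(text: str) -> list[str]:
    """Return the parts of the text that lie outside double quotes.

    Curly quotes are normalized to straight quotes first. Splitting on '"'
    puts narration at even indices, assuming the quotes are balanced.
    """
    normalized = text.replace("\u201c", '"').replace("\u201d", '"')
    return normalized.split('"')[0::2]


def count_dialogue_tags(text: str) -> list[str]:
    """List speech verbs found in narration (a heuristic for dialogue tags)."""
    tags = []
    for segment in narration_segments(text):
        tags.extend(m.group(0).lower() for m in _VERB_PATTERN.finditer(segment))
    return tags


# An excerpt in the style of the response above: no attribution in narration.
unattributed = ('Two women stood behind the counter. '
                '"Have you seen the sheriff this morning?" '
                '"He came in early, asking about yesterday\'s customers."')
attributed = '"Have you seen him?" asked Mary. "No," she said.'

print(count_dialogue_tags(unattributed))
print(count_dialogue_tags(attributed))
```

An empty list for a response would correspond to the empty `dialogueTags` field reported here. Because the heuristic only inspects text outside quotes, speech verbs spoken inside the dialogue are ignored, but unbalanced quotation marks would throw off the segment parity.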