NC Bench
Run: 2026-03-12-01KKHZAFB9M7Q2CRCET8786V0Y
Run Details
Model: Grok 4.20 (Beta)
Test: Dialogue tags
Scenario: Write unattributed dialogue
Duration: 1.8s
Cost: $0.001871
Provider: xAI
Evaluation summary
Score: 100.00%
Evaluator: Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
Messages
Meta
AI
The little bell above the bakery door gave a cheerful tinkle as Harriet stepped inside, the warm scent of cinnamon and vanilla wrapping around her like a hug. Sunlight streamed through the lace curtains, dancing across the glass display case filled with fresh scones and lemon bars. “Those look almost too pretty to eat this morning.” Harriet smiled, selecting two blueberry muffins. “Almost. But I’ve never let that stop me before.” A soft chuckle floated from behind the counter. “Good. Because I stayed up half the night perfecting that streusel. Tell me the truth now.” She broke off a piece and popped it into her mouth, eyes closing in bliss. “Like summer in a crumb topping. You’ve outdone yourself.” The baker leaned forward, lowering her voice. “Between us, I think Mrs. Pickering is going to accuse me of witchcraft again. She’s convinced nothing natural tastes that good.” Harriet’s eyes sparkled with mischief. “Let her. Last week she swore the mailman was a Russian spy. I’d rather be accused of delicious magic than paranoia.” “Speaking of which… you still think she’s the one who’s been leaving those strange notes in the library books?” Harriet slid her payment across the counter. “I’m beginning to wonder if we’re looking in the wrong place entirely.”
created_at: "2026-03-12T21:30:26.03Z" (3/12/2026, 9:30:26 PM)
model: "x-ai/grok-4.20-beta-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 372
moderation_latency: (null)
generation_time: 1716
tokens_prompt: 81
tokens_completion: 351
native_tokens_prompt: 183
native_tokens_completion: 270
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 64
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0018708
router: (null)
provider_responses:
  0:
    endpoint_id: "33f49f24-a0de-4dcf-872d-ca514e83fbae"
    id: "e6ef5541-eb07-bb15-6c14-f050cf0658ef"
    is_byok: false
    latency: 330
    model_permaslug: "x-ai/grok-4.20-beta-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773351026-kvHGLueHPLDySC8JgXt1"
upstream_id: "e6ef5541-eb07-bb15-6c14-f050cf0658ef"
total_cost: 0.0018708
cache_discount: 0.0001152
upstream_inference_cost: 0
provider_name: "xAI"
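The cost figures in the Meta block can be cross-checked with simple arithmetic. The sketch below is a hedged reconstruction, not the provider's billing logic. The three per-million-token rates are assumptions that do not appear on this page; they were chosen because they reproduce the reported total_cost and cache_discount from the native token counts. Reading generation_time as milliseconds is also an assumption, suggested by the 1.8s duration shown under Run Details.

```python
from decimal import Decimal

# Values copied from the Meta block.
NATIVE_PROMPT_TOKENS = 183
NATIVE_COMPLETION_TOKENS = 270
NATIVE_CACHED_TOKENS = 64
GENERATION_TIME = 1716  # ASSUMED to be milliseconds

# ASSUMED rates in USD per million tokens; not stated anywhere on this page.
ASSUMED_INPUT_RATE = Decimal("2.00")
ASSUMED_OUTPUT_RATE = Decimal("6.00")
ASSUMED_CACHED_INPUT_RATE = Decimal("0.20")

MILLION = Decimal(1_000_000)


def estimate_cost(prompt_tokens, completion_tokens, cached_tokens):
    """Return (cache_discount, total_cost) in USD under the assumed rates."""
    # Price every prompt token at the full input rate first.
    list_price = (
        prompt_tokens * ASSUMED_INPUT_RATE
        + completion_tokens * ASSUMED_OUTPUT_RATE
    ) / MILLION
    # Cached prompt tokens are then refunded the difference between the
    # full input rate and the cached input rate.
    cache_discount = (
        cached_tokens * (ASSUMED_INPUT_RATE - ASSUMED_CACHED_INPUT_RATE)
    ) / MILLION
    return cache_discount, list_price - cache_discount


cache_discount, total_cost = estimate_cost(
    NATIVE_PROMPT_TOKENS, NATIVE_COMPLETION_TOKENS, NATIVE_CACHED_TOKENS
)
# Output throughput, using the millisecond assumption above.
tokens_per_second = NATIVE_COMPLETION_TOKENS / (GENERATION_TIME / 1000)

print(cache_discount)  # compare with cache_discount in the Meta block
print(total_cost)      # compare with total_cost in the Meta block
print(round(tokens_per_second))
```

Under these assumptions the computed values equal the reported cache_discount and total_cost, and the Cost of $0.001871 shown under Run Details is that total rounded to six decimal places. Throughput works out to roughly 157 native completion tokens per second.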
Evaluation details
Result: 100.00%
Evaluator: Count dialogue tags
Details: n/a
Meta Data:
  neededClean: false
  dialogueTags: (empty)
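The page does not show how the "Count dialogue tags" evaluator works. As a rough illustration only, the sketch below shows one way such a check could be approximated: a regular expression that looks for a closing quotation mark followed by an optional name or pronoun and a speech verb. The verb list, the pattern, and the function name find_dialogue_tags are all hypothetical and are not taken from NC Bench.

```python
import re

# Hypothetical speech-verb list; the real evaluator's vocabulary is unknown.
SPEECH_VERBS = ("said", "asked", "replied", "whispered", "shouted",
                "murmured", "answered")

# Straight or curly closing double quote.
CLOSING_QUOTE = '["\u201d]'
_VERBS = "|".join(SPEECH_VERBS)

# Approximates a tag as: closing quote, optional name or pronoun, speech verb.
# Covers both "...," she said and "...," said Harriet.
TAG_PATTERN = re.compile(
    rf"{CLOSING_QUOTE}\s+(?:(?:[A-Z][\w.]*|he|she|they)\s+)?(?:{_VERBS})\b"
)


def find_dialogue_tags(text: str) -> list[str]:
    """Return every substring that looks like a dialogue tag."""
    return [match.group(0) for match in TAG_PATTERN.finditer(text)]


# An action beat from the model response above: no speech verb, so no tag.
sample = ("\u201cThose look almost too pretty to eat this morning.\u201d "
          "Harriet smiled, selecting two blueberry muffins.")
print(find_dialogue_tags(sample))

# An attributed line, for contrast.
print(find_dialogue_tags("\u201cTell me the truth now,\u201d she said."))
```

A heuristic like this treats action beats such as "Harriet smiled" as unattributed, which is consistent with the empty dialogueTags result reported here. It misses multi-word names and unusual speech verbs, so a production evaluator would likely need a broader approach.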