NC Bench
Run: 2026-03-12-01KKHYTM922P96DQQAHSNMCH2C
Run Details
Model: Grok 4.20 (Beta)
Test: Dialogue tags
Scenario: Write unattributed dialogue
Duration: 1.7s
Cost: $0.001661
Provider: xAI
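The Cost shown above appears to be the raw `total_cost` value from the response metadata on this page (0.0016608) rounded to six decimals. A minimal sketch of that rounding, plus a rough throughput figure, assuming `generation_time` is reported in milliseconds (the page does not state the unit). The function names are illustrative and not part of NC Bench:

```python
# Values copied from the response metadata on this page.
native_tokens_completion = 235
generation_time_ms = 1581  # assumption: the unit is milliseconds
total_cost_usd = 0.0016608


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Completion tokens generated per second of generation time."""
    return tokens / (elapsed_ms / 1000.0)


def format_cost(cost: float) -> str:
    """Format a dollar cost with six decimals, as in the Run Details."""
    return f"${cost:.6f}"


throughput = tokens_per_second(native_tokens_completion, generation_time_ms)
display_cost = format_cost(total_cost_usd)

print(f"{throughput:.1f} tokens/s, cost {display_cost}")
```

Under the millisecond assumption this gives a throughput of a little under 150 native completion tokens per second.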
Evaluation summary
Score: 100.00%
Evaluator: Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
AI
The bell above the door tinkled as Mrs. Hargrove stepped inside, bringing the scent of rain with her. Warm cinnamon and sugar wrapped around them like a hug. “These lemon bars are drier than my late husband’s sense of humor. I thought you were improving the recipe.” “Improving it would mean removing the secret ingredient your late husband used to slip in when he thought I wasn’t looking.” A soft chuckle. “He always did like his bourbon. Still, the texture...” “Complain again and I’ll tell the whole town what you really put in your famous cherry pie filling.” A gasp, then laughter. “You wouldn’t dare, Margaret.” “Try me, Eleanor. Another lemon bar?” “Please. And maybe one of those blueberry muffins. The ones that don’t taste like regret.” “Only if you promise not to feed the crumbs to that ridiculous poodle of yours under my table again.” “Mr. Fluffles has excellent taste. Unlike some people I could mention.” The baker slid two pastries across the counter with a knowing smile. Rain pattered against the windows while the scent of fresh bread filled the cozy space between old friends.
created_at: "2026-03-12T21:21:46.791Z"
model: "x-ai/grok-4.20-beta-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 640
moderation_latency: (null)
generation_time: 1581
tokens_prompt: 81
tokens_completion: 310
native_tokens_prompt: 183
native_tokens_completion: 235
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 64
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0016608
router: (null)
provider_responses:
  0:
    endpoint_id: "33f49f24-a0de-4dcf-872d-ca514e83fbae"
    id: "6cfbefbd-679a-eb19-7160-950327ff13db"
    is_byok: false
    latency: 600
    model_permaslug: "x-ai/grok-4.20-beta-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773350506-rpfqEWtIPpLhMdbcBdnB"
upstream_id: "6cfbefbd-679a-eb19-7160-950327ff13db"
total_cost: 0.0016608
cache_discount: 0.0001152
upstream_inference_cost: 0
provider_name: "xAI"
Evaluation details
Result: 100.00%
Evaluator: Count dialogue tags
Details: n/a
Meta Data:
  neededClean: false
  dialogueTags: (empty)
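The page does not show how the "Count dialogue tags" evaluator works. As a rough illustration only, a tag counter could look for a closing quotation mark followed within a couple of words by a speech verb. The verb list, the regular expression, and the function name below are all assumptions, not NC Bench's implementation:

```python
import re

# Hypothetical verb list; the real evaluator's rules are not shown on this page.
SPEECH_VERBS = (
    "said", "asked", "replied", "whispered",
    "muttered", "shouted", "answered",
)

# A closing quote (straight or curly), whitespace, up to two words
# (e.g. a name or pronoun), then a speech verb.
TAG_AFTER_QUOTE = re.compile(
    r"[\"”]\s+(?:\w+\s+){0,2}(?:" + "|".join(SPEECH_VERBS) + r")\b",
    re.IGNORECASE,
)


def count_dialogue_tags(text: str) -> int:
    """Count attribution phrases such as '"...," she said' in text."""
    return len(TAG_AFTER_QUOTE.findall(text))


tagged = '"Hello," she said. "Hi," replied Tom.'
untagged = "“Try me, Eleanor. Another lemon bar?” “Please.”"
print(count_dialogue_tags(tagged), count_dialogue_tags(untagged))
```

A heuristic like this would miss tags placed before the quotation and could miscount narration that happens to start with a speech verb, so a score of 100.00% here corresponds to zero matches under whatever rules the actual evaluator applies.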