NC Bench
Run: 2026-03-12-01KKHZB3CNHM0T0FSJ418DVCCE
Run Details
Model: Grok 4.20 (Beta, Reasoning)
Test: Dialogue tags
Scenario: Write unattributed dialogue
Duration: 10.8s
Cost: $0.014186
Provider: xAI
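The reported cost can be cross-checked against the native token counts and cache discount recorded in this run's metadata. The sketch below is only a consistency check: the three per-million-token rates are assumptions (they are not stated anywhere on this page) chosen because they happen to reproduce the reported total, and the function names are hypothetical.

```python
# Figures copied from this run's metadata fields.
NATIVE_PROMPT_TOKENS = 185        # native_tokens_prompt
NATIVE_COMPLETION_TOKENS = 2341   # native_tokens_completion (includes reasoning)
NATIVE_REASONING_TOKENS = 2074    # native_tokens_reasoning
NATIVE_CACHED_TOKENS = 128        # native_tokens_cached
REPORTED_TOTAL_COST = 0.0141856   # total_cost
REPORTED_CACHE_DISCOUNT = 0.0002304  # cache_discount

# ASSUMED rates in USD per million tokens; not taken from this page.
ASSUMED_INPUT_RATE = 2.00
ASSUMED_CACHED_INPUT_RATE = 0.20
ASSUMED_OUTPUT_RATE = 6.00


def estimate_cost():
    """Return (estimated_total, estimated_cache_discount) in USD."""
    undiscounted = (
        NATIVE_PROMPT_TOKENS * ASSUMED_INPUT_RATE
        + NATIVE_COMPLETION_TOKENS * ASSUMED_OUTPUT_RATE
    ) / 1_000_000
    # Cached prompt tokens are billed at the cheaper cached rate,
    # so the discount is the rate difference times the cached count.
    discount = (
        NATIVE_CACHED_TOKENS * (ASSUMED_INPUT_RATE - ASSUMED_CACHED_INPUT_RATE)
    ) / 1_000_000
    return undiscounted - discount, discount


def visible_completion_tokens():
    """Completion tokens left after subtracting reasoning tokens."""
    return NATIVE_COMPLETION_TOKENS - NATIVE_REASONING_TOKENS


if __name__ == "__main__":
    total, discount = estimate_cost()
    print(f"estimated total:    {total:.7f} (reported {REPORTED_TOTAL_COST})")
    print(f"estimated discount: {discount:.7f} (reported {REPORTED_CACHE_DISCOUNT})")
    print(f"non-reasoning completion tokens: {visible_completion_tokens()}")
```

Under these assumed rates, most of the cost comes from reasoning tokens, which make up the bulk of the native completion count.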
Evaluation summary
Score: 100.00%
Evaluator: Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
Messages
Meta
AI
The bakery's bell had long since fallen silent after the morning rush, leaving only the warm scent of cinnamon buns and fresh apple pies drifting through the sunlit space. Two women lingered by the wooden counter, one wiping flour from her hands, the other cradling a coffee mug. "Have you heard the latest about old Mr. Hargrove?" "No, what now?" "They found him in his garden, clutching one of your famous blueberry muffins." "That's terrible. Was it the same as the others?" "Yes, the symptoms match. Weakness, then collapse. The doctor thinks it might be deliberate." "But why use my muffins? Everyone loves them." "Someone's trying to ruin your reputation, or maybe frame you." "I can't believe this is happening in our peaceful town." "Remember when the library books went missing last month? This feels connected." "You think it's the same person?" "Possibly. That new fellow who moved in, the one with the fancy car. He's been buying lots of baked goods lately." "And asking questions about the town history." "We need to be careful. If we start poking around, we might be next." "Don't worry, we'll figure this out over another cup of coffee." The women smiled weakly, but their eyes showed the determination of those who wouldn't let mystery go unsolved in their beloved town.
created_at: "2026-03-12T21:30:46.554Z" (3/12/2026, 9:30:46 PM)
model: "x-ai/grok-4.20-beta-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 9905
moderation_latency: (null)
generation_time: 10428
tokens_prompt: 81
tokens_completion: 348
native_tokens_prompt: 185
native_tokens_completion: 2341
native_tokens_completion_images: 0
native_tokens_reasoning: 2074
native_tokens_cached: 128
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0141856
router: (null)
provider_responses:
  0:
    endpoint_id: "33f49f24-a0de-4dcf-872d-ca514e83fbae"
    id: "cc64c573-14c9-da64-0821-40650e474147"
    is_byok: false
    latency: 245
    model_permaslug: "x-ai/grok-4.20-beta-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773351046-zoXomfww3s6kpoem71Ul"
upstream_id: "cc64c573-14c9-da64-0821-40650e474147"
total_cost: 0.0141856
cache_discount: 0.0002304
upstream_inference_cost: 0
provider_name: "xAI"
Evaluation details
Result: 100.00%
Evaluator: Count dialogue tags
Details: n/a
Meta Data:
neededClean: false
dialogueTags: (empty)
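The page does not show how the "Count dialogue tags" evaluator works, so the following is only a rough sketch of one way such a check could be approximated. It assumes straight double quotes and a small, hypothetical list of speech verbs; the names `SPEECH_VERBS`, `TAG_AFTER_QUOTE`, and `find_dialogue_tags` are illustrative and are not taken from NC Bench.

```python
import re

# Hypothetical verb list; the real evaluator's vocabulary is not shown on this page.
SPEECH_VERBS = ("said", "asked", "replied", "answered", "whispered", "murmured")
_VERBS = "|".join(SPEECH_VERBS)

# Matches a closing quote followed by either
#   "<one to three words> <verb>"  (e.g. `," she said`)
# or
#   "<verb> <word>"                (e.g. `," said Martha`).
TAG_AFTER_QUOTE = re.compile(
    rf'[,?!.]"\s+(?:(?:\w+\s+){{1,3}}(?:{_VERBS})\b|(?:{_VERBS})\s+\w+)',
    re.IGNORECASE,
)


def find_dialogue_tags(text):
    """Return the matched attribution fragments found after closing quotes."""
    return [match.group(0) for match in TAG_AFTER_QUOTE.finditer(text)]


if __name__ == "__main__":
    unattributed = '"No, what now?" "They found him in his garden."'
    attributed = '"That is terrible," said Martha. "Is it?" she asked.'
    print(find_dialogue_tags(unattributed))
    print(find_dialogue_tags(attributed))
```

A heuristic like this would return an empty list for back-to-back quoted lines such as those in the response above, which is consistent with the empty dialogueTags field, though it would miss tags placed before a quote or verbs outside the assumed list.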