NC Bench
Run: 2026-03-12-01KKHZGHVX2DZAAMSFCCRH0PPJ
Run Details
Model:
Grok 4.20 (Beta, Reasoning)
Test:
Dialogue tags
Scenario:
Write unattributed dialogue
Duration:
14.1s
Cost:
$0.018500
Provider:
xAI
Evaluation summary
Score: 100.00%
Evaluator: Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
Messages
Meta
AI
Morning light poured into the quaint small-town bakery, where the aroma of vanilla and warm butter created a soothing haven. Glass cases glowed with flaky turnovers and sugar-dusted muffins while the wooden floor creaked softly underfoot. In the lull before the midday rush, two friends stood close by the counter.

"I never thought I'd see the day when a simple birthday cake would cause such a stir in our peaceful town."

"Neither did I. But when the candles revealed not just wax but a hidden message, everything changed."

"It was a warning, wasn't it? 'Watch your back' scrawled in red icing."

"Precisely. And now the baker who created it has vanished without a trace, like sugar dissolving in hot water."

"Do you suspect foul play from the rival shop across the street?"

"Could be. They've been coveting our secret cinnamon bun recipe for years now."

"We must search for clues in the old family recipe book. It might hold the answers we seek."

"Yes, but we have to be discreet about it. These bakery walls have ears, especially with all the nosy townsfolk coming in and out."

The oven timer buzzed suddenly, breaking their hushed conversation and leaving a faint chill amid the comforting scents of fresh bread.
created_at
"2026-03-12T21:33:45.22Z"
3/12/2026, 9:33:45 PM
model
"x-ai/grok-4.20-beta-20260309"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
13224
moderation_latency
(null)
generation_time
13945
tokens_prompt
81
tokens_completion
320
native_tokens_prompt
185
native_tokens_completion
3060
native_tokens_completion_images
0
native_tokens_reasoning
2808
native_tokens_cached
128
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0184996
router
(null)
provider_responses
0
endpoint_id
"33f49f24-a0de-4dcf-872d-ca514e83fbae"
id
"147801ba-b590-66de-4000-d0d3e21de9c6"
is_byok
false
latency
263
model_permaslug
"x-ai/grok-4.20-beta-20260309"
provider_name
"xAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773351225-XA2FKV0mab3M98deyVuL"
upstream_id
"147801ba-b590-66de-4000-d0d3e21de9c6"
total_cost
0.0184996
cache_discount
0.0002304
upstream_inference_cost
0
provider_name
"xAI"
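The native token counts above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the benchmark itself. It assumes that `native_tokens_reasoning` is counted inside `native_tokens_completion` and that `native_tokens_cached` is a subset of `native_tokens_prompt`; the run page does not state either. The helper name `token_breakdown` is hypothetical.

```python
# Values copied from the Meta fields of this run.
meta = {
    "native_tokens_prompt": 185,
    "native_tokens_completion": 3060,
    "native_tokens_reasoning": 2808,
    "native_tokens_cached": 128,
}

def token_breakdown(m):
    """Split the native token counts into visible, reasoning, and uncached parts.

    Assumption (not confirmed by the run page): reasoning tokens are included
    in native_tokens_completion, and cached tokens are included in
    native_tokens_prompt.
    """
    visible = m["native_tokens_completion"] - m["native_tokens_reasoning"]
    reasoning_share = m["native_tokens_reasoning"] / m["native_tokens_completion"]
    uncached_prompt = m["native_tokens_prompt"] - m["native_tokens_cached"]
    return {
        "visible_completion": visible,
        "reasoning_share": reasoning_share,
        "uncached_prompt": uncached_prompt,
    }

breakdown = token_breakdown(meta)
print(breakdown)
```

Under those assumptions, 252 of the 3060 native completion tokens are visible output, and reasoning accounts for roughly 92% of the completion.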
Evaluation details
Result: 100.00%
Evaluator: Count dialogue tags
Details: n/a
Meta Data:
neededClean: false
dialogueTags: (empty)
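The run page does not show how the "Count dialogue tags" evaluator works. As an illustration only, the sketch below counts attributions such as `" she said` or `" said Martha` with a regular expression. The verb list, the pattern, and the function name `count_dialogue_tags` are all assumptions, not the benchmark's actual implementation.

```python
import re

# Hypothetical list of speech verbs; the real evaluator's rules are not shown.
SPEECH_VERBS = r"(?:said|asked|replied|whispered|muttered|shouted|answered|added)"
SPEAKER = r"(?:[A-Z][a-z]+|he|she|they|I)"

# A closing quote followed by "<speaker> <verb>" or "<verb> <speaker>".
TAG_AFTER_QUOTE = re.compile(
    r'["\u201d]\s*,?\s*(?:'
    + SPEAKER + r"\s+" + SPEECH_VERBS
    + r"|"
    + SPEECH_VERBS + r"\s+" + SPEAKER
    + r")\b"
)

def count_dialogue_tags(text: str) -> int:
    """Return the number of quote-adjacent attributions found in text."""
    return len(TAG_AFTER_QUOTE.findall(text))

# One tagged line, and an untagged excerpt from the model response above.
tagged = '"Hello," she said. "Hi."'
untagged = (
    '"Do you suspect foul play from the rival shop across the street?" '
    '"Could be. They\'ve been coveting our secret cinnamon bun recipe for years now."'
)
print(count_dialogue_tags(tagged), count_dialogue_tags(untagged))
```

A heuristic like this only catches tags that follow a closing quote and use a listed verb, so a production evaluator would likely need broader rules.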