NC Bench
Run: 2026-05-01-01KQHSP2XZX7Q6WAEQ4XFYYHKG
Run Details
Model: Grok 4.20
Test: Tool usage within Novelcrafter
Scenario: Create alternate prose sections
Duration: 7.0s
Cost: $0.001078
Provider: xAI
Evaluation summary
Score | Evaluator | Details Link
Fail | Match blue prose section | (details)
Fail | Match green prose section | (details)
Fail | Match red prose section | (details)
no eval | Overall |
Model input
System
You are an expert fantasy novel author. You are working on a novel called "The Chronicles of the Last Dragon".

The following tools are available to you:
- <nc-section> Use this tool to create a new section of prose.
  Arguments:
  - "data-title": The title of the section.
  - "data-color": Any color of (red, green, blue, yellow, purple, orange, pink, brown).
  Example usage:
  <nc-section data-title="Chapter 1: The Arrival" data-color="blue">
  Frederick, a young squire, arrives at the castle to begin his training.
  </nc-section>
User/Human
The story so far: <storySoFar> Frederick, a young squire, arrives at the castle to begin his training. </storySoFar>
AI
My horse, a majestic stallion named Thunder, galloped across the open field. The wind whipped through my hair as I rode towards the castle gates. The sun was setting, casting a warm glow over the stone walls. I could see the banners of the kingdom fluttering in the breeze.
User/Human
Continue the story with the following instructions:
"""The local knight, Sir Roland, greets Frederick near the stables."""

Create 3 alternate prose sections that continue the story.
- Version 1 (red)
- Version 2 (green)
- Version 3 (blue)
Model response
Messages
AI
**Version 1 (red)**
As Frederick dismounted near the weathered stables, the clatter of his boots on the cobblestones drew the attention of a towering figure emerging from the shadows. Sir Roland, the local knight whose reputation for unyielding discipline preceded him, approached with measured strides, his armor catching the dying light of the sunset. “Ho there, young squire,” he called, his voice deep and resonant like rolling thunder. “You must be Frederick. The stable master spoke of your arrival. Come, let me see the measure of the lad who dares train beneath these walls.”

**Version 2 (green)**
Thunder snorted and stamped as Frederick guided him toward the stable yard, the scent of hay and leather thick in the evening air. Before he could loosen the saddle, a broad-shouldered knight stepped out from behind the timber doors, wiping grease from his hands with a rag. Sir Roland’s keen eyes appraised the newcomer from beneath a scarred brow. “Frederick, I presume,” he said, offering a callused hand in greeting. “I am Sir Roland. The castle doesn’t take kindly to latecomers or weak wills—best you understand that now if you hope to last beyond the first frost.”

**Version 3 (blue)**
The great wooden gates of the inner bailey creaked open as Frederick led his horse along the torch-lit path to the stables. A lone knight waited there, arms crossed over a chest plate etched with the kingdom’s sigil. Sir Roland regarded the young squire with a mixture of curiosity and stern appraisal, the fading sunlight glinting off the silver at his temples. “Well met, Frederick,” he said, his tone carrying the weight of years spent on the battlefield. “I’ve been told you seek to become more than you are. Leave your mount and walk with me—words spoken near horses are often truer than those spoken in halls.”
Meta
created_at: "2026-05-01T12:54:42.38Z" (5/1/2026, 12:54:42 PM)
model: "x-ai/grok-4.20-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 391
moderation_latency: (null)
generation_time: 6906
tokens_prompt: 318
tokens_completion: 476
native_tokens_prompt: 409
native_tokens_completion: 388
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 384
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.00107805
router: (null)
provider_responses:
  0:
    endpoint_id: "3ecee37f-b217-4093-87fb-aaf0afe307af"
    id: "66c9d737-d7a0-9528-8361-2a4a9f2acfe8"
    is_byok: false
    latency: 118
    model_permaslug: "x-ai/grok-4.20-20260309"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777640082-qD69k27jxXFn0ii65mPZ"
session_id: (null)
api_type: "completions"
id: "gen-1777640082-7LpwA4dR8IzR7fMU83Rw"
upstream_id: "66c9d737-d7a0-9528-8361-2a4a9f2acfe8"
total_cost: 0.00107805
cache_discount: 0.0004032
upstream_inference_cost: 0
provider_name: "xAI"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
Fail | Match blue prose section | /<nc-section data-title="[^"]*" data-color="blue">[^<]+<\/nc-section>/ | n/a
Fail | Match green prose section | /<nc-section data-title="[^"]*" data-color="green">[^<]+<\/nc-section>/ | n/a
Fail | Match red prose section | /<nc-section data-title="[^"]*" data-color="red">[^<]+<\/nc-section>/ | n/a
no eval
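
The three evaluators are regular expressions that look for literal <nc-section> tags with a matching data-color attribute. The following is a minimal Python sketch of how such a check could be applied, not the benchmark's actual implementation. The evaluator patterns are JavaScript-style regex literals; porting them to Python's re module, the evaluate helper, and the tagged_style sample text are all assumptions made for illustration. It shows why a response that labels its versions with markdown headings, as the recorded response does, fails every check, while a response wrapped in the tags described in the system prompt would pass.

```python
import re

# Patterns ported from the evaluator table; only the color differs.
# The JavaScript "<\/nc-section>" escape is unnecessary in Python.
PATTERNS = {
    color: re.compile(
        rf'<nc-section data-title="[^"]*" data-color="{color}">[^<]+</nc-section>'
    )
    for color in ("red", "green", "blue")
}


def evaluate(response: str) -> dict[str, bool]:
    """Hypothetical helper: per color, does the response contain a matching section?"""
    return {color: bool(p.search(response)) for color, p in PATTERNS.items()}


# Excerpt in the shape of the recorded response: markdown headings, no tags.
markdown_style = (
    "**Version 1 (red)** As Frederick dismounted near the weathered stables, "
    "**Version 2 (green)** Thunder snorted and stamped "
    "**Version 3 (blue)** The great wooden gates of the inner bailey creaked open"
)

# Hypothetical response using the tool syntax from the system prompt.
tagged_style = (
    '<nc-section data-title="Version 1" data-color="red">Sir Roland waits.</nc-section>\n'
    '<nc-section data-title="Version 2" data-color="green">Sir Roland waves.</nc-section>\n'
    '<nc-section data-title="Version 3" data-color="blue">Sir Roland nods.</nc-section>'
)

print(evaluate(markdown_style))
print(evaluate(tagged_style))
```

Note that the patterns require the attributes in the order data-title then data-color, and the [^<]+ body rejects any section containing a nested tag or an empty body, so a tagged response can still fail if it deviates from that exact shape.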