NC Bench
Run: 2026-05-01-01KQHSQZ1AFGGGX1VZW9Z9C3R2
Run Details
Model: Grok 4.3 (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 1m 56s
Cost: $0.027138
Provider: xAI
Evaluation summary
Matches word count: 100.00%
Dialogue to Total Word Ratio: 97.40%
Overall: 98.6986%
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly turned to Bill. "Bill, this children's party is our ticket to easy riches. We slip in dressed as the hired helpers, replace all those wrapped boxes with our glued-shut empty ones, and escape while the little brats are distracted by face painting. Nobody will suspect two clumsy clowns in big shoes of being professional thieves. The parents will be too busy taking photos to catch us in the act." Bill whispered back, "Molly, you know I'm with you but what if they ask us to sing happy birthday or something embarrassing like that. I can't carry a tune and the last time I tried it was a disaster. Plus, breaking into kids' events might bring bad luck or worse, helicopter parents." "Don't overthink it, Bill," Molly responded. "We'll claim the cake is on fire to create chaos, grab the loot, and hightail it out. The presents alone could fetch us hundreds on eBay, think of the new gadgets and games we get to keep instead of selling some." "Alright Molly, I'm in but next time we should target a bigger event with fancier presents like video game consoles and designer toys to make even more money," Bill agreed firmly.
created_at: "2026-05-01T12:55:43.919Z"
model: "x-ai/grok-4.3-20260430"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 113817
moderation_latency: (null)
generation_time: 115437
tokens_prompt: 77
tokens_completion: 2826
native_tokens_prompt: 184
native_tokens_completion: 10790
native_tokens_completion_images: (null)
native_tokens_reasoning: 10550
native_tokens_cached: 64
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0271378
router: (null)
provider_responses:
  0:
    endpoint_id: "45623cb8-18e5-4b32-8fc4-85d05439982d"
    id: "1b0cc9c4-1a27-9242-9a2c-b538a49f395a"
    is_byok: false
    latency: 162
    model_permaslug: "x-ai/grok-4.3-20260430"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777640143-OwH781rRErrZCJijXii8"
session_id: (null)
api_type: "completions"
id: "gen-1777640143-92wQHovjRHyKrKoK7rbl"
upstream_id: "1b0cc9c4-1a27-9242-9a2c-b538a49f395a"
total_cost: 0.0271378
cache_discount: 0.0000672
upstream_inference_cost: 0
provider_name: "xAI"
response_cache_source_id: (null)
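The token counts in the metadata above can be broken down with a few lines of arithmetic. The sketch below is illustrative only: the function name `token_breakdown` is hypothetical, and it assumes that `native_tokens_reasoning` is counted as part of `native_tokens_completion`, which the page does not state explicitly.

```python
def token_breakdown(completion_tokens: int, reasoning_tokens: int) -> dict:
    """Split completion tokens into reasoning and visible output.

    Assumption: reasoning tokens are a subset of completion tokens.
    """
    visible = completion_tokens - reasoning_tokens
    share = reasoning_tokens / completion_tokens if completion_tokens else 0.0
    return {"visible_tokens": visible, "reasoning_share": share}


# Values copied from native_tokens_completion and native_tokens_reasoning.
stats = token_breakdown(10790, 10550)
print(stats["visible_tokens"])             # 240
print(f"{stats['reasoning_share']:.2%}")   # 97.78%
```

Under that assumption, nearly all of the completion budget for this run went to reasoning rather than to the 200-word scene itself.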
Evaluation details
Matches word count: 100.00%
  Details: n/a
  neededClean: false
  words: 200
Dialogue to Total Word Ratio: 97.40%
  Details: Ratio: 94.03%, Deviation: 4.03%
  neededClean: false
  wordsTotal: 201
  wordsDialogue: 189
Overall: 98.6986%
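The reported ratio and deviation follow directly from the `wordsDialogue` and `wordsTotal` values and the 90% target in the scenario. The sketch below reproduces that arithmetic. The function names are hypothetical, and `dialogue_stats` is only one plausible way to count dialogue words (whitespace-split words inside straight double quotes); the benchmark's actual tokenisation and the formula that turns the deviation into the 97.40% score are not shown on this page and are not reproduced here.

```python
import re

TARGET_RATIO = 0.90  # from the scenario: "90% dialogue"


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_RATIO) -> tuple[float, float]:
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


def dialogue_stats(text: str) -> dict:
    """Count words overall and inside straight double quotes.

    Assumption: a "word" is a whitespace-separated token and dialogue is
    any text between a pair of straight double quotes.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return {"wordsTotal": total, "wordsDialogue": dialogue}


# Values copied from the evaluation metadata.
ratio, deviation = ratio_and_deviation(189, 201)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 94.03%, Deviation: 4.03%
```

For example, `dialogue_stats('Molly said, "Grab the cake now." Bill nodded.')` counts 8 words in total and 4 inside quotes.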