NC Bench
Run: 2026-05-10-01KR8W4YAKJ7M488S1KXE4XSD0
Run Details
Model: Grok 4.3
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 11.5s
Cost: $0.001703
Provider: xAI
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.00%
Overall: 0.0000%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“You think this will work, Bill?” Molly rose from the crate behind the bakery dumpster and checked the zipper on her hot-dog costume. “Course it will, they’ll never suspect us,” Bill said, tugging at the strings of his mustard-squirting backpack. “Kids love mascots, and mascots love neglected gift piles beside the bouncy castle.” Molly laughed, then quickly lowered her voice. “But what if some parent recognizes us from last time at the water park?” “They got the wrong description,” Bill answered, “tall, angry, and dressed like condiments? Nobody remembers us—the kids just screamed for ketchup pictures.” Molly snapped open the copy of the invitation she’d swiped from a mailbox. “Says the party starts at two, magician at two-thirty, cupcakes at three. That’s our window: while the magician is pulling rabbits from a hat, we pull boxes from under the table.” Bill nodded, scribbling a quick floor plan on the back of a ketchup packet. “Front gate’s the easiest, but we keep clear of the piñata line—no witnesses with bats. We head straight past the trampoline, circle the gift table once, grab anything light enough to fit in the costumes, then out through the side fence by the pool.” “What if the clown blocks the gate again?” Molly asked. “Last time that giant shoe tripped me and nearly sent me into the chocolate fountain.” Bill snorted. “We’ll tell him we’re the backup entertainers. Say the agency double-booked us. He gets confused easy—remembers names like ‘Mustard Man’ but forgets faces.” Molly stuffed two black garbage bags into the mustard nozzle of Bill’s pack. “These’ll hold the haul. I’ll aim for anything with big bows first; they’re usually the good stuff. You handle small boxes—jewelry, gift cards, cash envelopes, the boring junk adults get kids to pretend they like.” Bill tapped the packet with his pen. “And timing: when the magician asks for a volunteer, that’s our cue. Everyone’s eyes forward, hands busy applauding. 
In, out, gone.” They practiced their story once more while fastening Velcro name tags. “If the birthday kid points and says, ‘Aren’t you the hot dog from the park?’” Molly prompted. “We say, ‘Sure, kiddo, but today we’re here for your special day, sprinkles and all.’” “That’s believable?” Bill asked. “Believable enough before the magician’s rabbit takes the blame,” Molly said. “We’ll be in the parking lot stuffing bags by then.” Bill gave the costume zipper one last tug. “Right, we better move. Cupcakes start at three, and I don’t want frosting evidence in the getaway car.” Molly hoisted the empty bags higher on her shoulder. “Let’s go steal some loot nobody will miss until they’re unwrapping VHS tapes and toothbrushes tomorrow.”
created_at: "2026-05-10T12:00:18.266Z" (5/10/2026, 12:00:18 PM)
model: "x-ai/grok-4.3-20260430"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 765
moderation_latency: (null)
generation_time: 11371
tokens_prompt: 77
tokens_completion: 747
native_tokens_prompt: 182
native_tokens_completion: 617
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 64
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
service_tier: "default"
usage: 0.0017028
router: (null)
provider_responses:
  0:
    endpoint_id: "45623cb8-18e5-4b32-8fc4-85d05439982d"
    id: "1bcd1ee6-765c-96d2-9308-585cc65108ef"
    is_byok: false
    latency: 358
    model_permaslug: "x-ai/grok-4.3-20260430"
    provider_name: "xAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778414418-TcUGsTZPgGYju0ClOAe7"
session_id: (null)
api_type: "completions"
id: "gen-1778414418-ZdNUx93cKhbO5YdHceOI"
upstream_id: "1bcd1ee6-765c-96d2-9308-585cc65108ef"
total_cost: 0.0017028
cache_discount: 0.0000672
upstream_inference_cost: 0
provider_name: "xAI"
response_cache_source_id: (null)
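The metadata above reports a generation_time of 11371 and a native_tokens_completion of 617. As a rough sketch, and assuming generation_time is in milliseconds (which would be consistent with the 11.5s duration shown in the run details, but is not stated on the page), the output throughput can be estimated as follows. The function name is hypothetical and not part of NC Bench.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Estimate output throughput from a token count and a generation time.

    Assumption: generation_time_ms is in milliseconds.
    """
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000)


# Values copied from the run metadata.
rate = tokens_per_second(completion_tokens=617, generation_time_ms=11371)
print(f"{rate:.1f} tokens/s")
```

Under that assumption the run produced a little over 54 native completion tokens per second.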
Evaluation details
Matches word count: 0.00%
  Details: n/a
  neededClean: false
  words: 447
Dialogue to Total Word Ratio: 0.00%
  Details: Ratio: 72.73%, Deviation: 22.73%
  neededClean: false
  wordsTotal: 451
  wordsDialogue: 328
Overall: 0.0000%
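The reported ratio and deviation can be reproduced from the counts above. The following is a minimal sketch, not the evaluator's actual code: it assumes the ratio is wordsDialogue divided by wordsTotal and the deviation is the absolute distance from the 50% target in the scenario. The function names are hypothetical, and the page does not state how a deviation is mapped to the final 0.00% score, so that step is left out.

```python
def dialogue_ratio_report(words_total: int, words_dialogue: int,
                          target: float = 0.5) -> dict:
    """Return the dialogue share and its absolute deviation from the target."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    ratio = words_dialogue / words_total
    return {"ratio": ratio, "deviation": abs(ratio - target)}


def word_count_deviation(actual: int, target: int) -> float:
    """Return the relative distance between actual and target word counts."""
    if target <= 0:
        raise ValueError("target must be positive")
    return abs(actual - target) / target


# Counts copied from the evaluation details.
report = dialogue_ratio_report(words_total=451, words_dialogue=328)
print(f"Ratio: {report['ratio']:.2%}, Deviation: {report['deviation']:.2%}")
# prints "Ratio: 72.73%, Deviation: 22.73%"

# The scenario asked for 500 words; the evaluator counted 447.
print(f"Word count off by {word_count_deviation(447, 500):.1%}")
# prints "Word count off by 10.6%"
```

Note that the two evaluators report slightly different totals (447 and 451 words), which suggests they tokenize the response differently.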