NC Bench
Run: 2026-04-24-01KQ0G9Q8VW6XSTBFTG2PP8RD8
Run Details
Model: Claude Opus 4.7 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 16.0s
Cost: $0.028415
Provider: Amazon Bedrock
Evaluation summary
Score      Evaluator
86.38%     Matches word count
0.01%      Dialogue to Total Word Ratio
43.1938%   Overall
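The overall figure looks consistent with an unweighted mean of the two evaluator scores. That is an inference from the numbers shown on this page, not a formula the page documents, and the function name below is hypothetical. A minimal Python sketch of the check:

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores, in percent (assumed aggregation)."""
    if not scores:
        raise ValueError("need at least one score")
    return sum(scores) / len(scores)


# Evaluator scores as displayed in the summary, in percent.
displayed = [86.38, 0.01]
mean = overall_score(displayed)

# The page reports 43.1938%. The small gap is plausibly because the
# per-evaluator scores are rounded for display (an assumption).
print(f"mean of displayed scores: {mean:.4f}%")
print(f"gap to reported overall:  {abs(mean - 43.1938):.4f} points")
```

If the benchmark weights its evaluators differently, the mean would need per-evaluator weights; nothing on this page indicates that it does.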
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The van smelled like old pepperoni and regret. Molly unfolded a crinkled map of the Hendersons' backyard across the dashboard while Bill squeezed into a rented clown suit three sizes too small. "Bill, you look like a sausage with ambitions." "The suit was the only one left at the costume shop. It was this or a sexy pilgrim." "I would have paid cash to see the pilgrim." "Focus, Molly. What's the plan?" She tapped the map. "Party starts at two. Cake at two-thirty. Piñata at three. That's our window." "Why the piñata?" "Because when thirty sugar-crazed seven-year-olds are swinging a bat blindfolded, nobody's watching the gift table." "That's either genius or the saddest sentence I've ever heard." "Can be both. You distract the parents with balloon animals." Bill held up his hands. "I don't know how to make balloon animals." "Then make balloon disasters. Nobody expects artistry from a man stuffed into polyester like a Christmas ham." "And what are you doing while I'm out there suffocating?" "I'll be the face painter. Kids line up, parents drift off for wine, I slip through the sliding door, grab the haul, out the side gate. Ninety seconds, tops." "Ninety seconds for a whole gift table?" "I've been training, Bill. I can clear a Target checkout lane in forty-two." "That's not training, that's a felony résumé." "Semantics." She handed him a red foam nose. "Put this on." "It smells like someone else's nose." "That's because it was someone else's nose. Focus. What's your clown name?" "I didn't pick a clown name." "Bill. BILL. Every clown needs a name. It's how they track you if things go sideways." "Fine. Mr. Jingles." "That's a horror movie rat." "Sparkles McGee." "Better. Tragic, but better." Bill adjusted his enormous shoes. "What if a kid recognizes me? I did Timmy's birthday last April." "You robbed Timmy's birthday last April." "I also did the magic show. I'm multifaceted." "Nobody recognizes a man in a clown suit, Bill. It's the great equalizer. 
You could be a senator in there." "A senator with a suit this tight would have regrets." "Everyone has regrets. Me, I regret partnering with a man who brought a fanny pack to a crime." "It's tactical. It holds my snacks." "Criminals don't have snacks, Bill." "This criminal does. There's a KitKat in here with your name on it if we pull this off." She paused. "Okay, that's actually motivating." A minivan rolled past, packed with balloons and a cake box the size of a small moon. "That's our cue," Molly said. "Remember. Balloons. Distract. Ninety seconds." "And if it goes wrong?" "We meet at the Denny's on Route 9. Order pancakes. Act normal." "Nobody acts normal at a Denny's at three PM." "Exactly. We'll blend right in." Bill honked his red nose once, solemnly, like a man going to war. "For Sparkles McGee," he whispered. "For the KitKat," Molly said, and slid the van door open.
created_at
"2026-04-24T19:43:34.688Z"
4/24/2026, 7:43:34 PM
model
"anthropic/claude-4.7-opus-20260416"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
780
moderation_latency
(null)
generation_time
15917
tokens_prompt
77
tokens_completion
797
native_tokens_prompt
98
native_tokens_completion
1117
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
usage
0.028415
router
(null)
provider_responses
0
endpoint_id
"0fa65117-00d9-4835-808d-9b68041d9ce4"
id
"c94cc0e4-9913-4507-bd94-4a490d74f134"
is_byok
false
latency
780
model_permaslug
"anthropic/claude-4.7-opus-20260416"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777059814-4G83mbqxILYq2u6UPqtB"
session_id
(null)
api_type
"completions"
id
"gen-1777059814-Hr195MroLXnuvXyaP5M2"
upstream_id
"c94cc0e4-9913-4507-bd94-4a490d74f134"
total_cost
0.028415
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
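The reported total_cost can be cross-checked against native_tokens_prompt and native_tokens_completion. The per-million-token rates below are assumptions (they are not stated anywhere on this page), chosen because they reproduce the reported figure; the function name is hypothetical. A minimal sketch:

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  input_rate_per_m, output_rate_per_m):
    """Cost in USD from token counts and per-million-token rates."""
    return (prompt_tokens * input_rate_per_m
            + completion_tokens * output_rate_per_m) / 1_000_000


# ASSUMED rates in USD per million tokens; not taken from this page.
ASSUMED_INPUT_RATE = 5.0
ASSUMED_OUTPUT_RATE = 25.0

# native_tokens_prompt = 98, native_tokens_completion = 1117 (from the metadata).
cost = estimate_cost(98, 1117, ASSUMED_INPUT_RATE, ASSUMED_OUTPUT_RATE)
print(f"estimated cost: ${cost:.6f}")  # compare with total_cost 0.028415
```

Under these assumed rates the estimate matches total_cost, which suggests billing follows the native token counts rather than the normalized tokens_prompt and tokens_completion values.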
Evaluation details
Result     Evaluator                      Details                            Meta Data
86.38%     Matches word count             n/a                                neededClean: false, words: 489
0.01%      Dialogue to Total Word Ratio   Ratio: 83.81%, Deviation: 13.81%   neededClean: false, wordsTotal: 494, wordsDialogue: 414
43.1938%   Overall
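The ratio and deviation above follow directly from wordsDialogue and wordsTotal and the 70% target in the scenario. How the evaluator actually tokenizes words or detects dialogue is not documented on this page, so the quote-matching approach below is an assumption and the function name is hypothetical. A minimal sketch:

```python
import re


def dialogue_stats(text):
    """Return (dialogue_words, total_words, ratio) for straight-quoted dialogue.

    Assumes whitespace-separated words and dialogue wrapped in straight
    double quotes; the real evaluator may count differently.
    """
    total = len(text.split())
    spoken = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(chunk.split()) for chunk in spoken)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


sample = 'She tapped the map. "Party starts at two." Bill nodded.'
print(dialogue_stats(sample))  # 4 of the 10 words sit inside quotes

# Reproduce the reported figures from the evaluator's own counts.
reported_ratio = 414 / 494 * 100          # wordsDialogue / wordsTotal
deviation = abs(reported_ratio - 70.0)    # target is 70% dialogue
print(f"{reported_ratio:.2f}% ratio, {deviation:.2f} points from target")
```

Note that the two evaluators report different totals (489 and 494 words), so their word-splitting rules evidently differ; this sketch does not try to reproduce either.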