NC Bench
Run: 2026-04-24-01KPZV637EBS4CZT17709X0WE6
Run Details
Model: Claude Opus 4.7
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 16.6s
Cost: $0.029465
Provider: Amazon Bedrock
Evaluation summary
Score      Evaluator
60.28%     Matches word count
0.00%      Dialogue to Total Word Ratio
30.1376%   Overall
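The Overall figure is consistent with an unweighted mean of the two evaluator scores. The sketch below illustrates that reading only; the function name `overall_score` is hypothetical, and the page does not state how NC Bench actually aggregates scores.

```python
# Hypothetical helper: NC Bench's real aggregation is not shown on this page.
def overall_score(evaluator_scores):
    """Return the unweighted mean of per-evaluator scores, in percent."""
    return sum(evaluator_scores) / len(evaluator_scores)


# Scores as displayed (rounded) in the evaluation summary.
displayed_scores = [60.28, 0.00]
print(f"{overall_score(displayed_scores):.2f}%")
```

Averaging the displayed values gives about 30.14%. The reported 30.1376% would follow from an unrounded word-count score of 60.2752%, which suggests the summary rounds each row for display.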
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint across the sticky diner table, smoothing it past a syrup stain shaped vaguely like Australia. Bill leaned in, chewing a pen cap with the intensity of a man defusing a bomb. "Okay," Molly said. "The target is one Tyler Brennan. Turning seven on Saturday. His mother rented out the entire Jump-N-Bounce Fun Palace." "Inflatables," Bill muttered. "My old nemesis." "Focus. Guest list is thirty-two kids, twelve parents, two clowns, and a guy in a dinosaur costume who I think is just the dad's unemployed cousin." "Security?" "One teenage birthday hostess named Brittany. She's sixteen, makes nine bucks an hour, and according to her TikTok, she hates her job with the fire of a thousand suns." "She's flippable," Bill said, nodding. "We offer her twenty bucks and a ride home, she'll hand us the goodie bags herself." "We're not corrupting a minor, Bill." "Fine. Morally ambiguous plan B. What about the presents?" Molly tapped a rectangle on the blueprint. "Gift table, back corner, near the ball pit. Everything gets piled there when guests arrive. Unsupervised for roughly ninety minutes during pizza and cake." "Ninety minutes is a lifetime in thievery." "It's a lifetime in a ball pit too. I once lost a shoe in one for six years." Bill looked up. "Metaphorically?" "Literally. Anyway. We arrive at two fifteen, dressed as—" She slid a photograph across the table. Bill stared at it. "No." "Yes." "Molly, I am not dressing as a clown." "You're dressing as a clown." "I have dignity." "You stole a kayak last month using only a snorkel and a library card. You do not have dignity." Bill sighed the sigh of a man who had made peace with many things. "Fine. What are you going to be?" "Face painter. I've got the kit, I've got the smock, and I can do a passable butterfly." "And the getaway?" "Minivan. Rented it under the name Deborah Hinkle." "Who's Deborah Hinkle?" "Nobody. That's the beauty of it." Bill scratched his jaw. "What about the parents? 
Thirty-two kids means a lot of phones recording. One viral video and we're on a morning show in handcuffs." "Which is why," Molly said, producing a small plastic device from her purse, "we jam the Wi-Fi. No uploads, no livestreams, no evidence. By the time anyone realizes the presents are gone, we're halfway to Toledo." "Why Toledo?" "I have a guy in Toledo." "You have a guy everywhere, Molly." "That's called a network, Bill. That's called professionalism." He folded the blueprint carefully. "One question. Ethically speaking—" "Don't." "—are we really the kind of people who steal from a seven-year-old?" Molly sipped her coffee. "Tyler Brennan's dad is a hedge fund manager who foreclosed on my aunt's bakery. Tyler is getting an iPad, a drone, and a check for two grand from Grandpa." Bill considered this. "Robin Hood rules?" "Robin Hood rules." "I'll get the clown shoes."
created_at: "2026-04-24T13:34:35.77Z" (4/24/2026, 1:34:35 PM)
model: "anthropic/claude-4.7-opus-20260416"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 689
moderation_latency: (null)
generation_time: 16541
tokens_prompt: 77
tokens_completion: 785
native_tokens_prompt: 98
native_tokens_completion: 1159
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
usage: 0.029465
router: (null)
provider_responses:
  0:
    endpoint_id: "0fa65117-00d9-4835-808d-9b68041d9ce4"
    id: "27ef1374-664e-49b4-a0a7-8e41c7b92cbf"
    is_byok: false
    latency: 689
    model_permaslug: "anthropic/claude-4.7-opus-20260416"
    provider_name: "Amazon Bedrock"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777037675-v2lIg67FgNV4AT5JCjqA"
session_id: (null)
api_type: "completions"
id: "gen-1777037675-pbvwzHRVW98zrPxUDRoE"
upstream_id: "27ef1374-664e-49b4-a0a7-8e41c7b92cbf"
total_cost: 0.029465
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
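The token counts and cost in the metadata can be cross-checked with a little arithmetic. The sketch below is an illustration under two assumptions that the page does not state: that billing uses the native token counts at $5 per million prompt tokens and $25 per million completion tokens (rates inferred from the numbers, not quoted from any price list), and that `generation_time` is in milliseconds. The function names are hypothetical.

```python
# Assumed rates in USD per token; inferred from the figures, not stated on the page.
PROMPT_RATE = 5.00 / 1_000_000
COMPLETION_RATE = 25.00 / 1_000_000


def estimate_cost(prompt_tokens, completion_tokens):
    """Estimate request cost in USD from native token counts."""
    return prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE


def tokens_per_second(completion_tokens, generation_time_ms):
    """Completion throughput, assuming generation_time is in milliseconds."""
    return completion_tokens / (generation_time_ms / 1000)


# native_tokens_prompt, native_tokens_completion, generation_time from the metadata.
cost = estimate_cost(98, 1159)
rate = tokens_per_second(1159, 16541)
print(f"${cost:.6f}")
print(f"{rate:.1f} tokens/s")
```

Under these assumptions the estimate matches the reported total_cost of 0.029465, and throughput comes out at roughly 70 completion tokens per second.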
Evaluation details
Result     Evaluator                      Details                            Meta Data
60.28%     Matches word count             n/a                                neededClean: false, words: 485
0.00%      Dialogue to Total Word Ratio   Ratio: 76.22%, Deviation: 26.22%   neededClean: false, wordsTotal: 492, wordsDialogue: 375
30.1376%   Overall
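The Ratio and Deviation values for the dialogue evaluator can be reproduced from wordsDialogue and wordsTotal. The sketch below is a minimal illustration, not the evaluator's implementation: the function names are hypothetical, the quote-matching counter is only one plausible way to separate dialogue from narration (it handles straight double quotes only), and how the evaluator converts a deviation into a score is not shown on the page.

```python
import re

TARGET_RATIO = 0.50  # the scenario asks for 50% dialogue


def dialogue_ratio(words_dialogue, words_total):
    """Share of all words that are dialogue, as a fraction."""
    return words_dialogue / words_total


def deviation_points(ratio, target=TARGET_RATIO):
    """Absolute distance from the target ratio, in percentage points."""
    return abs(ratio - target) * 100


def count_dialogue_words(text):
    """Hypothetical counter: whitespace-separated words inside straight double quotes."""
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in quoted_spans)


# wordsDialogue and wordsTotal from the evaluation details.
ratio = dialogue_ratio(375, 492)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation_points(ratio):.2f} points")

# Small usage example on a fragment of the model response.
sample = 'Bill looked up. "Metaphorically?" "Literally. Anyway."'
print(count_dialogue_words(sample), "of", len(sample.split()), "words are dialogue")
```

With 375 dialogue words out of 492, the ratio is 76.22%, which is 26.22 points above the 50% target and matches the reported Details column.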