NC Bench
Run: 2026-04-24-01KQ0G9R0PBXGC73ACDK2AFC37
Run Details
Model: Claude Opus 4.7 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 15.1s
Cost: $0.029765
Provider: Amazon Bedrock
Evaluation summary
Matches word count: 35.00%
Dialogue to Total Word Ratio: 70.69%
Overall: 52.8473%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The back booth of Donut Emperor smelled like burnt coffee and broken dreams, which Molly felt was appropriate for the occasion. "Absolutely not," Bill said, folding his arms across his windbreaker. "I have standards, Molly." "You stole a kidney from a veterinarian last March." "That was a *cat* kidney, and it was for science." "It was for eighty dollars." "Science is expensive." Bill took a sip of coffee and grimaced. "I'm not robbing a seven-year-old." Molly slid a glossy photograph across the table. Bill glanced down, then looked again, slower. "Is that—" "A mountain of presents. Thirty-two RSVPs. Rich kid. Dad owns those car washes off the interstate." "The ones with the singing octopus?" "The very same." Bill whistled. "That octopus has seen things." "Focus, Bill." He tapped the photograph. "What's the theme?" "Pirates." "Oh, come *on*." "I know." "That's practically an invitation. That's the universe telling us—" "I *know*, Bill." He leaned back, fingers steepled. "Walk me through it." "Two o'clock, the magician arrives. Guy named The Amazing Reginald." "Is he amazing?" "He's mediocre at best, but the kids don't know that. At two-fifteen, he does the disappearing rabbit. That's our window." "How long's the window?" "Four minutes. Maybe five if he drops the rabbit again." "Does he usually drop the rabbit?" "Bill, I've watched nine of his birthday parties on YouTube. He drops the rabbit sixty percent of the time." "That's commitment." "That's *research*." Bill squinted. "And the presents are where?" "Sunroom. Stacked on a table shaped like a treasure chest, because again, pirates." "And we just—walk in?" "We walk in as the bouncy castle inflation crew." "Molly." "Yes." "There is no bouncy castle inflation crew." "There is now." She produced two polo shirts from her tote bag. Embroidered across the chest, in cheerful yellow thread: *BOUNCE DADDIES LLC.* Bill stared at the shirt for a long moment. "I will not wear this." "You will absolutely wear this." 
"What if a parent asks a question?" "You say PSI, nod gravely, and walk away." "PSI." "Pounds per square inch. Bouncy castle talk." "I don't know bouncy castle talk." "Neither do they, Bill. That's the point." He sighed the sigh of a man who has lost many arguments in this booth. "And the getaway?" "Minivan. I borrowed Ted's." "Does Ted know?" "Ted is on a spiritual retreat in Taos and will not know until Thursday." "Molly, I want to be clear. If a child cries, I leave." "No child will cry. We'll be in, out, and halfway to the storage unit before Reginald finds his rabbit." "And if the rabbit finds us first?" "Then we improvise." Bill picked up the polo shirt. He held it at arm's length, like it might bite. "Bounce Daddies." "Bounce Daddies." "I hate you." "You love me." "I love the car wash money." He stood, pulling the shirt over his head. "Let's go ruin a birthday."
created_at: "2026-04-24T19:43:35.45Z" (4/24/2026, 7:43:35 PM)
model: "anthropic/claude-4.7-opus-20260416"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 801
moderation_latency: (null)
generation_time: 15082
tokens_prompt: 77
tokens_completion: 820
native_tokens_prompt: 98
native_tokens_completion: 1171
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
usage: 0.029765
router: (null)
provider_responses: 0
endpoint_id: "0fa65117-00d9-4835-808d-9b68041d9ce4"
id: "7492afa1-2d1a-4ff9-99aa-bb7e833803ea"
is_byok: false
latency: 801
model_permaslug: "anthropic/claude-4.7-opus-20260416"
provider_name: "Amazon Bedrock"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777059815-rWfj0sw8r3C3Q792gmoh"
session_id: (null)
api_type: "completions"
id: "gen-1777059815-leXII5Y7xM3W5103kKMD"
upstream_id: "7492afa1-2d1a-4ff9-99aa-bb7e833803ea"
total_cost: 0.029765
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
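The usage and timing figures in the metadata can be cross-checked with a little arithmetic. The sketch below is not taken from the benchmark itself. It assumes rates of $5 per million prompt tokens and $25 per million completion tokens; these rates do not appear on this page and were chosen only because they reproduce the reported total_cost from the native token counts. It also assumes generation_time is in milliseconds, which is consistent with the 15.1s duration in the run details.

```python
# Assumed rates (not stated on this page), in USD per million tokens.
PROMPT_USD_PER_MTOK = 5.0
COMPLETION_USD_PER_MTOK = 25.0


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate request cost in USD from native token counts."""
    micro_usd = (prompt_tokens * PROMPT_USD_PER_MTOK
                 + completion_tokens * COMPLETION_USD_PER_MTOK)
    return micro_usd / 1_000_000


# Values copied from the metadata: native_tokens_prompt,
# native_tokens_completion, and generation_time.
native_tokens_prompt = 98
native_tokens_completion = 1171
generation_time_ms = 15082  # assumed to be milliseconds

cost = estimate_cost(native_tokens_prompt, native_tokens_completion)
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"Estimated cost: ${cost:.6f}")
print(f"Throughput: {tokens_per_second:.1f} completion tokens/s")
```

Under these assumptions the estimate matches the reported total_cost of 0.029765, which suggests billing was based on the native token counts (98 and 1171) rather than the normalized ones (77 and 820).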
Evaluation details
Matches word count
  Result: 35.00%
  Details: n/a
  neededClean: false
  words: 482
Dialogue to Total Word Ratio
  Result: 70.69%
  Details: Ratio: 72.63%, Deviation: 2.63%
  neededClean: false
  wordsTotal: 486
  wordsDialogue: 353
Overall: 52.8473%
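The ratio and deviation reported by the dialogue evaluator can be reproduced from its word counts. The following is a minimal sketch, assuming the ratio is dialogue words divided by total words and the deviation is the absolute distance from the 70% target in the scenario. The function names are hypothetical and are not the evaluator's own code. How the 70.69% score is derived from the deviation is not shown on this page, so the sketch does not attempt it.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float) -> float:
    """Return the absolute distance from the target, in percentage points."""
    return abs(ratio_pct - target_pct)


# wordsDialogue and wordsTotal as reported by the evaluator;
# the 70% target comes from the scenario description.
ratio = dialogue_ratio(353, 486)
deviation = deviation_from_target(ratio, 70.0)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# Assumption: the overall score is a plain mean of the evaluator scores.
# The displayed scores give 52.845, close to the reported 52.8473; the
# small gap would be consistent with the scores being rounded for display.
overall_estimate = (35.00 + 70.69) / 2
print(f"Overall (estimated): {overall_estimate:.3f}%")
```

The first print reproduces the evaluator's "Ratio: 72.63%, Deviation: 2.63%" line. Note that the two evaluators count words differently (482 versus 486), so figures from one should not be mixed with the other.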