NC Bench
Run: 2026-04-24-01KQ0GDXAHN6BZRMX61FG0MMCZ
Run Details
Model: Claude Opus 4.7 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 14.9s
Cost: $0.028290
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator
2.01% | Matches word count
0.00% | Dialogue to Total Word Ratio
1.0058% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The back booth of Denny's had seen better crimes, but none more ambitious. Molly unrolled a napkin across the table like it was the blueprint to Fort Knox. In reality, it was a crayon sketch of the Peterson residence, complete with a stick-figure clown she'd labeled "THREAT LEVEL: UNKNOWN." "Okay," she said, tapping the napkin. "Party starts at two. Cake at two-thirty. Presents at three." Bill squinted. "Why are we hitting a kid's birthday party, Molly?" "Because nobody expects it." "Because it's insane." "Because," she hissed, leaning forward, "little Timmy Peterson's dad is a hedge fund guy. You know what rich kids get? iPads. Drones. Those tiny electric Mercedes things. One of his cousins gifted him a Rolex last year. A ROLEX, Bill." Bill sipped his coffee thoughtfully. "Fine. What's the play?" "You go in as the magician." "No." "Bill." "Absolutely not. I don't do balloon animals." "You don't need to do balloon animals, you need to distract thirty sugar-drunk eight-year-olds while I load the loot into the van." "Thirty?" "Approximately." "Molly, the last time I was around thirty eight-year-olds was at my nephew's communion, and I came out with a black eye and no wallet." She waved him off. "This time you'll have a wand. It's a whole different dynamic." Bill set down his mug. "And what are you going as?" "The caterer." "You can't cook." "I don't need to cook. I just need to wheel a cart past the gift pile and make it look heavy going out. Pizza in, presents out." "That's actually not terrible." "Thank you." "It's still bad. But it's not terrible." A waitress drifted by and Molly covered the napkin with both hands like she was hiding state secrets. When the coast was clear, she produced a second napkin. "Exit strategy," she announced. Bill read it upside down. "Is that... a bouncy castle?" "We cut a hole in the back. Stuff the presents inside. Deflate it. Roll it up. Walk out with a bouncy castle." "Molly." "What." "That's beautiful." "I know."
"That's honestly poetry." "I KNOW." He leaned back, beaming, then slowly his face fell. "Wait. What about the clown?" Molly's eye twitched. "The clown is a variable." "A variable." "A contingency." "You have no plan for the clown." "Clowns respect other performers, Bill. It's a professional courtesy. You'll nod at him, he'll nod at you, and you'll both silently agree to stay out of each other's lanes." "That is the stupidest thing you have ever said, and you once tried to rob an aquarium." "The otters had it coming." Bill rubbed his face. "If I end up in a piñata-related incident, I'm blaming you." "Noted." Molly stuffed the napkins into her purse and slid out of the booth. "Saturday. Two o'clock. Don't forget the wand." "Molly." "Yeah?" "We're going to hell." "Probably. But we're going with a Rolex."
created_at
"2026-04-24T19:45:51.956Z"
4/24/2026, 7:45:51 PM
model
"anthropic/claude-4.7-opus-20260416"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
872
moderation_latency
(null)
generation_time
14857
tokens_prompt
77
tokens_completion
781
native_tokens_prompt
98
native_tokens_completion
1112
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
usage
0.02829
router
(null)
provider_responses
0
endpoint_id
"0fa65117-00d9-4835-808d-9b68041d9ce4"
id
"dd3dc987-7318-4b82-af4a-9c61fe3eb044"
is_byok
false
latency
872
model_permaslug
"anthropic/claude-4.7-opus-20260416"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777059951-0RybKi1WSrrf93H790bf"
session_id
(null)
api_type
"completions"
id
"gen-1777059951-esumX9OoLKgnMDxD91gj"
upstream_id
"dd3dc987-7318-4b82-af4a-9c61fe3eb044"
total_cost
0.02829
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
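The metadata above is enough to derive a rough throughput figure and to sanity-check the reported cost. The sketch below assumes generation_time is in milliseconds (14857 is consistent with the 14.9s duration shown in the run details). The per-million-token rates are assumptions chosen because they reproduce total_cost exactly; the page itself does not state the provider's pricing.

```python
# Values copied from the run metadata above.
generation_time_ms = 14857        # assumed to be milliseconds
native_tokens_prompt = 98
native_tokens_completion = 1112
total_cost = 0.02829

# Completion throughput in native tokens per second.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# ASSUMPTION: these rates are not given on the page. They are illustrative
# values that happen to reproduce the reported total_cost.
ASSUMED_INPUT_USD_PER_MTOK = 5.0
ASSUMED_OUTPUT_USD_PER_MTOK = 25.0

estimated_cost = (
    native_tokens_prompt * ASSUMED_INPUT_USD_PER_MTOK
    + native_tokens_completion * ASSUMED_OUTPUT_USD_PER_MTOK
) / 1_000_000

print(f"throughput: {tokens_per_second:.1f} native tokens/s")
print(f"estimated cost: ${estimated_cost:.5f} (reported: ${total_cost:.5f})")
```

If the estimate and the reported cost diverge on another run, the assumed rates are the first thing to revisit.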
Evaluation details
Result | Evaluator | Details | Meta Data
2.01% | Matches word count | n/a | neededClean: false, words: 475
0.00% | Dialogue to Total Word Ratio | Ratio: 71.84%, Deviation: 21.84% | neededClean: false, wordsTotal: 483, wordsDialogue: 347
1.0058% | Overall
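The ratio and deviation reported for the dialogue evaluator follow directly from the two word counts. The sketch below reproduces that arithmetic from wordsDialogue and wordsTotal against the 50% target in the scenario. The count_dialogue_words helper is hypothetical: it shows one plausible way to count quoted words (whitespace tokens inside straight double quotes) and is not the benchmark's actual tokenizer, which the page does not describe.

```python
import re


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of words that are dialogue, as a percentage."""
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float = 50.0) -> float:
    """Absolute distance, in percentage points, from the target ratio."""
    return abs(ratio_pct - target_pct)


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Hypothetical counter: returns (dialogue_words, total_words).

    Counts whitespace-separated tokens, treating anything between a pair
    of straight double quotes as dialogue. The real evaluator may differ.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return dialogue, total


# Counts taken from the evaluation metadata above.
ratio = dialogue_ratio(347, 483)
deviation = deviation_from_target(ratio)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
```

With the reported counts this prints the same ratio and deviation as the Details column. How the evaluator maps that deviation to a 0.00% score is not stated on the page.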