NC Bench
Run: 2026-04-24-01KPZV4TE7C9A5X4HC40YY6MEN
Run Details
Model: Claude Opus 4.7
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 16.5s
Cost: $0.028765
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator | Details Link
68.10% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
34.0511% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprints across the sticky diner table, narrowly avoiding a puddle of maple syrup. Bill leaned in, squinting, a piece of bacon dangling from his mouth like a very disappointing cigar. "Okay," she said, tapping the paper. "The target is 1428 Maplewood Drive. Birthday girl turns seven on Saturday. Theme is unicorns." "Unicorns," Bill repeated. "I hate unicorns." "Your feelings about mythological horses are not relevant to the heist, Bill." "They're relevant to me." Molly ignored him. "Party starts at two. Presents pile up on the dining room table through three. That's our window." "And how," Bill asked, chewing thoughtfully, "do we get in? Last I checked, two adults without children tend to stand out at a seven-year-old's birthday party." "We're the entertainment." Bill stopped chewing. "No." "Yes." "Absolutely not." "I already rented the costumes." "Molly, I swear to God—" "You're Sprinkles the Clown. I'm Madame Rainbow, the face-painting fairy." Bill set down his bacon with the solemnity of a man laying a loved one to rest. "I have a felony conviction. I have three teeth that aren't technically mine. I have a tattoo of a scorpion on my neck that says 'Mom's Regret.' And you want me to be Sprinkles." "The costume covers the tattoo." "Does it cover my soul?" "Bill." "Fine." He picked the bacon back up. "How do we get the presents out? I'm not shoving a pink bicycle down my clown pants." "That's what the van is for. I park it around back at two-thirty. You do balloon animals in the living room to keep the kids distracted. I sneak through the kitchen, out the side door, and start loading." "What about the parents?" "Parents at these things are either day-drinking in the kitchen or hiding in the bathroom pretending to take important calls. They won't notice." "And if they do?" "Then you start juggling." "I can't juggle." "Then learn by Saturday." Bill rubbed his temples. "I just want to go on record. 
When we started this life of crime, I pictured something more... dignified. Banks. Yachts. A Swiss vault, maybe." "We tried the yacht, Bill. You got seasick on a paddleboat." "That was a fluke." "You threw up on a duck." "The duck was aggressive." Molly rolled up the blueprints. "Look, we clear eight thousand dollars in toys, minimum. Resale on Facebook Marketplace. Nobody suspects a clown." "Everybody suspects a clown. That is the defining trait of clowns." "Children don't." "Children are the worst." "Which is exactly why we're doing this." She slid a photograph across the table. A small girl in a tiara, surrounded by wrapped boxes stacked nearly to the ceiling. "Her dad's a hedge fund guy. These kids get drones now, Bill. Drones." Bill studied the photo, then sighed the sigh of a man making peace with his fate. "Fine. But I'm keeping the drone." "Deal." "And Molly?" "Yeah?" "If anyone asks, Sprinkles is my cousin."
created_at: "2026-04-24T13:33:54.005Z" (4/24/2026, 1:33:54 PM)
model: "anthropic/claude-4.7-opus-20260416"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 910
moderation_latency: (null)
generation_time: 16373
tokens_prompt: 77
tokens_completion: 794
native_tokens_prompt: 98
native_tokens_completion: 1131
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
usage: 0.028765
router: (null)
provider_responses: 0
endpoint_id: "0fa65117-00d9-4835-808d-9b68041d9ce4"
id: "87b4ae0d-2af1-43ed-9431-f2c89d5648ca"
is_byok: false
latency: 909
model_permaslug: "anthropic/claude-4.7-opus-20260416"
provider_name: "Amazon Bedrock"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777037634-lC8NXsrB86kJkqRHERyd"
session_id: (null)
api_type: "completions"
id: "gen-1777037634-oGadIjLSilcMvg85mdUD"
upstream_id: "87b4ae0d-2af1-43ed-9431-f2c89d5648ca"
total_cost: 0.028765
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
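The timing and token fields in the metadata can be combined into a rough throughput figure. The sketch below assumes that generation_time is in milliseconds (the page gives no unit, but 16373 is consistent with the 16.5s duration reported under Run Details) and that native_tokens_completion is the provider's own output token count. The function name tokens_per_second is hypothetical and not part of NC Bench.

```python
# Values copied from the run metadata above.
GENERATION_TIME_MS = 16373        # assumption: the unit is milliseconds
NATIVE_TOKENS_COMPLETION = 1131   # assumption: provider-side output tokens


def tokens_per_second(tokens: int, elapsed_ms: float) -> float:
    """Return the average output rate in tokens per second."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return tokens / (elapsed_ms / 1000.0)


rate = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
print(f"Approximate throughput: {rate:.1f} tokens/s")
```

This is an average over the whole generation and does not separate out the initial latency recorded in the latency field.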
Evaluation details
Result | Evaluator | Details | Meta Data
68.10% | Matches word count | n/a | neededClean: false, words: 486
0.00% | Dialogue to Total Word Ratio | Ratio: 75.56%, Deviation: 25.56% | neededClean: false, wordsTotal: 491, wordsDialogue: 371
34.0511% | Overall | |
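The ratio and deviation figures in the evaluation details above can be reproduced from the reported word counts. The sketch below assumes that the ratio is wordsDialogue divided by wordsTotal, that the deviation is the absolute distance from the 50% target named in the scenario, and that the overall score is the plain mean of the two evaluator scores. The page does not state any of these formulas, so treat them as a plausible reading rather than NC Bench's actual scoring code. All function names are hypothetical.

```python
# Counts copied from the evaluation details above.
WORDS_TOTAL = 491
WORDS_DIALOGUE = 371
TARGET_RATIO = 0.50  # from the scenario: 50% dialogue


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that fall inside dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation(ratio: float, target: float) -> float:
    """Absolute distance between the measured ratio and the target."""
    return abs(ratio - target)


def overall(scores: list[float]) -> float:
    """Assumed aggregation: unweighted mean of evaluator scores."""
    return sum(scores) / len(scores)


ratio = dialogue_ratio(WORDS_DIALOGUE, WORDS_TOTAL)
dev = deviation(ratio, TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {dev:.2%}")
print(f"Overall (from rounded scores): {overall([68.10, 0.00]):.2f}%")
```

Under these assumptions the ratio and deviation match the reported 75.56% and 25.56%. The mean of the rounded scores is 34.05%, slightly below the reported 34.0511%, which would be consistent with the word-count score being kept at higher precision than the 68.10% shown.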