NC Bench
Run: 2026-04-24-01KQ0GCXT14ANAK8SVFD7YQCMD
Run Details
Model: Claude Opus 4.7 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 14.9s
Cost: $0.027465
Provider: Amazon Bedrock
Evaluation summary
Score       Evaluator
81.27%      Matches word count
0.00%       Dialogue to Total Word Ratio
40.6364%    Overall
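The Overall figure is consistent with an unweighted mean of the two evaluator scores. That aggregation rule is an assumption inferred from the numbers shown, not something the page states, and the function name `overall_score` is hypothetical. A minimal sketch:

```python
from statistics import mean

def overall_score(evaluator_scores: dict[str, float]) -> float:
    """Unweighted mean of evaluator scores, in percent.

    Assumption: the benchmark averages evaluators equally. The page
    only shows the inputs and the result, not the rule itself.
    """
    return mean(list(evaluator_scores.values()))

# Scores as displayed on the page (rounded to two decimals).
scores = {
    "Matches word count": 81.27,
    "Dialogue to Total Word Ratio": 0.00,
}

print(f"{overall_score(scores):.4f}%")
```

With the displayed inputs this gives 40.635%, slightly below the reported 40.6364%. The gap would be explained if the first score is stored with more precision than the two decimals displayed.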
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The back booth of Denny's smelled like burnt coffee and bad decisions, which Molly figured was appropriate given the circumstances. She'd spread a hand-drawn map across the table, weighing down the corners with a syrup dispenser, a ketchup bottle, and two half-eaten Grand Slams.
Bill squinted at the diagram. A crayon-rendered bounce house dominated the northwest quadrant. "Tell me again why we're robbing an eight-year-old."
"We're not robbing an eight-year-old," Molly said. "We're robbing approximately twenty-two eight-year-olds. Economy of scale, Bill. This is basic business."
"The guest list is on the fridge. Kid's name is Tyler. His mom posted the whole party agenda on Facebook, including the PIN for the garage keypad, which, honestly, Bill, people deserve what they get."
Bill stirred his coffee with a french fry. "What's our window?"
"Pin the tail on the donkey starts at two-fifteen. That gives us a solid twelve minutes while every adult in the house is pretending to care about a blindfolded child stabbing a wall."
"And the presents?"
"Stacked in the dining room. Already wrapped. Already sorted. It's like they're doing our job for us."
Bill frowned at a crayon squiggle that might have been a dog or possibly a threat. "What's this?"
"That's Nugget. Goldendoodle. Allergic to peanuts, which is relevant because I'm bringing a Snickers."
"Molly."
"Relax, it's a Benadryl thing. He gets sleepy, not dead. I'm a criminal, Bill, not a monster."
Bill sat back and rubbed his forehead with the heel of his hand. Three years ago he'd been a regional manager at an office supply chain. He still wasn't entirely sure how he'd gotten here, but he was pretty sure it started with a woman who ordered dry martinis at an Applebee's.
"What about the clown?"
Molly's eye twitched. "There is no clown."
"The Facebook post said—"
"I am choosing to believe there is no clown. Next question."
"Okay. Exit strategy."
She tapped the map with a butter knife. "Minivan parked here, behind the hedge. We load up, drive to the storage unit, spend the rest of the week fencing Nintendo Switches and those little dolls with the giant heads."
"LOL Surprise."
"I'm not surprised at all, actually. Kids love that garbage. Market's bulletproof."
Bill studied her for a long moment. Molly had a smear of ketchup on her chin and the gleam of a woman who had committed fully to a bad idea. It was, he had to admit, a look that worked on her.
"And if something goes sideways?"
"Then we improvise. I brought two clown wigs, a fake mustache, and a eulogy, in case we need to crash a funeral instead."
"You planned a backup funeral?"
"Bill." She folded the map with the solemnity of a general. "I always plan a backup funeral."
He dropped a twenty on the table. Outside, the sun was climbing, and somewhere across town, a goldendoodle was about to have the worst afternoon of his life.
created_at: "2026-04-24T19:45:19.686Z" (4/24/2026, 7:45:19 PM)
model: "anthropic/claude-4.7-opus-20260416"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1653
moderation_latency: (null)
generation_time: 14822
tokens_prompt: 77
tokens_completion: 766
native_tokens_prompt: 98
native_tokens_completion: 1079
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
usage: 0.027465
router: (null)
provider_responses
0
endpoint_id: "0fa65117-00d9-4835-808d-9b68041d9ce4"
id: "2a75f365-154d-4933-bb2b-ca30aa562fc3"
is_byok: false
latency: 1631
model_permaslug: "anthropic/claude-4.7-opus-20260416"
provider_name: "Amazon Bedrock"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777059919-oZaPEOBsAqmV5dHncddk"
session_id: (null)
api_type: "completions"
id: "gen-1777059919-NhGjgq5dueVoPiEQHQ63"
upstream_id: "2a75f365-154d-4933-bb2b-ca30aa562fc3"
total_cost: 0.027465
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
Evaluation details
Result: 81.27%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false; words: 488

Result: 0.00%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 54.11%, Deviation: 24.11%
Meta Data: neededClean: false; wordsTotal: 499; wordsDialogue: 270

Overall: 40.6364%
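The Ratio and Deviation reported for the Dialogue to Total Word Ratio evaluator can be reproduced from wordsDialogue and wordsTotal. The sketch below is illustrative only: the function names are hypothetical, the 30% target comes from the scenario, and the quote-matching regex is an assumed way of counting dialogue words, not the evaluator's documented method. How the deviation maps to the 0.00% score is not shown on the page and is not modeled here.

```python
import re

def count_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words).

    Assumption: dialogue is any text between straight double quotes,
    and words are whitespace-separated tokens.
    """
    quoted = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted)
    total_words = len(text.split())
    return dialogue_words, total_words

def ratio_report(words_dialogue: int, words_total: int,
                 target: float = 0.30) -> dict[str, float]:
    """Dialogue share and its absolute distance from the target, in percent."""
    ratio = words_dialogue / words_total
    deviation = abs(ratio - target)
    return {
        "ratio_pct": round(ratio * 100, 2),
        "deviation_pct": round(deviation * 100, 2),
    }

# Values taken from the evaluator's Meta Data.
print(ratio_report(270, 499))
```

For 270 dialogue words out of 499, this yields a ratio of 54.11% and a deviation of 24.11 percentage points from the 30% target, matching the figures in the Details column.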