NC Bench
Run: 2026-04-24-01KQ0GE9ZV54HZ5ZYQA863Z5CK
Run Details
Model: Claude Opus 4.7 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 15.2s
Cost: $0.027315
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator
0.21% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.1071% | Overall
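The page does not say how the Overall score is derived from the two evaluator scores. The figures are consistent with an unweighted mean, provided the word-count score is about 0.2142% before being rounded to 0.21% for display. The sketch below checks that reading; the averaging rule is an assumption, not something the page states.

```python
# Assumption: Overall is the unweighted mean of the two evaluator scores.
# Under that assumption, back out the unrounded word-count score.
overall_pct = 0.1071          # reported Overall score
dialogue_ratio_pct = 0.00     # reported Dialogue to Total Word Ratio score

implied_word_count_pct = 2 * overall_pct - dialogue_ratio_pct

print(f"implied word-count score: {implied_word_count_pct:.4f}%")
print(f"rounded for display:      {implied_word_count_pct:.2f}%")
```

If the implied value rounds to the displayed 0.21%, the mean hypothesis fits the page; it does not rule out other weighting schemes.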
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The back booth of Denny's had seen better plans. It had also seen better pancakes, but Molly wasn't here for the pancakes. She was here because Bill had called her at 6 a.m. with what he described as "the score of a lifetime," which, knowing Bill, meant either a bank vault or a vending machine. It was, she discovered, neither. "A birthday party," she said flatly. Bill slid a crayon drawing across the table. It appeared to be a blueprint, rendered in Burnt Sienna on the back of a kids' menu. A lopsided rectangle marked HOUSE. A smaller rectangle marked LOOT PILE. A stick figure in a cone hat marked, presumably, the enemy. "Not just any birthday party," Bill said. "Tanner Willoughby is turning seven." "And?" "And his dad owns three car dealerships. Kid's getting a Nintendo Switch, a drone, and, according to my source, an actual pony." Molly stirred her coffee. "Your source." "The clown." "You have a clown informant." "Marcelo owes me four hundred dollars and his dignity." She exhaled through her nose, which was the closest Molly came to laughing before noon. The drawing really was terrible. There was a dog labeled DOG, and next to it, an arrow pointing to a smaller dog labeled ALSO DOG. "Why are there two dogs?" "There's two dogs." "Bill." "We go in during the piñata," he said, tapping the menu with authority. "Twelve kids swinging a bat blindfolded, nobody's watching the gift table. Marcelo makes balloon animals, I come in the side gate dressed as Uncle Kevin—" "Who is Uncle Kevin?" "Every family has an Uncle Kevin. Nobody questions Uncle Kevin." He had a point, which was the most alarming thing about Bill. Half his ideas were arrests waiting to happen, and the other half were, irritatingly, correct. "What's my role?" she asked. "Minivan. You pull up, hazards on, say you're the Instacart lady. I toss the presents in the back, we go. Twenty-minute job. Two-hundred-grand retail, minimum." "Retail on a pony is not two hundred grand." "Fine. 
Ponyless, it's still six figures if we flip the electronics on Facebook Marketplace." Molly leaned back. Across the restaurant, a toddler was systematically destroying a basket of creamer cups. She watched him with the professional respect of one artist for another. "The parents will be there," she said. "Other parents. Grandparents. At least one uncle who was a cop." "That's why we go in smiling. Nobody suspects a guy holding a casserole." "You're bringing a casserole." "I'm bringing the casserole. The casserole is the alibi." She looked down at the crayon blueprint again. At the LOOT PILE. At the ALSO DOG. At her entire adult life, really, which had led, inexorably, to this sticky vinyl booth. "Fine," Molly said. "But if there's a bounce house, we're taking that too." Bill grinned. "That's my girl."
created_at: "2026-04-24T19:46:04.927Z" (4/24/2026, 7:46:04 PM)
model: "anthropic/claude-4.7-opus-20260416"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 670
moderation_latency: (null)
generation_time: 15136
tokens_prompt: 77
tokens_completion: 750
native_tokens_prompt: 98
native_tokens_completion: 1073
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
usage: 0.027315
router: (null)
provider_responses:
  0:
    endpoint_id: "0fa65117-00d9-4835-808d-9b68041d9ce4"
    id: "e0051b41-1c7a-4813-9895-0243b4e3ada0"
    is_byok: false
    latency: 669
    model_permaslug: "anthropic/claude-4.7-opus-20260416"
    provider_name: "Amazon Bedrock"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777059964-oi9JycypKWMA407ZolLI"
session_id: (null)
api_type: "completions"
id: "gen-1777059964-fhobgyYxJ6v4S5SG59Bb"
upstream_id: "e0051b41-1c7a-4813-9895-0243b4e3ada0"
total_cost: 0.027315
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
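The reported cost can be related to the native token counts with simple arithmetic. The sketch below is illustrative only: the per-million-token rates are assumptions picked because they reproduce the reported total_cost, since the page itself states no pricing. It also assumes generation_time is in milliseconds, which matches the 15.2s Duration shown in Run Details.

```python
# Hypothetical rates in USD per million tokens (assumed, not stated on the page).
INPUT_USD_PER_MTOK = 5.0
OUTPUT_USD_PER_MTOK = 25.0


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate request cost in USD from native token counts."""
    return (
        prompt_tokens * INPUT_USD_PER_MTOK
        + completion_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000


# Values copied from the metadata: native_tokens_prompt, native_tokens_completion.
cost = estimate_cost(98, 1073)

# generation_time is assumed to be milliseconds (15136 ms is about 15.1 s).
generation_seconds = 15136 / 1000
tokens_per_second = 1073 / generation_seconds

print(f"estimated cost: ${cost:.6f}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Note that the estimate uses the native token counts (98 and 1073) rather than tokens_prompt and tokens_completion (77 and 750), because only the native counts reproduce the reported figure under these assumed rates.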
Evaluation details
Result | Evaluator | Details | Meta Data
0.21% | Matches word count | n/a | neededClean: false, words: 472
0.00% | Dialogue to Total Word Ratio | Ratio: 47.37%, Deviation: 17.37% | neededClean: false, wordsTotal: 475, wordsDialogue: 225
0.1071% | Overall
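The Ratio and Deviation figures follow directly from the reported word counts and the 30% target in the scenario. The sketch below reproduces them and adds a simple counter for quoted words. The counter is a hypothetical approximation (whitespace-split words inside straight double quotes), not NC Bench's actual method; the two evaluators report 472 and 475 total words for the same text, so their counting rules evidently differ from each other as well.

```python
import re

TARGET_RATIO = 0.30  # from the scenario: 30% dialogue


def ratio_and_deviation(words_dialogue: int, words_total: int) -> tuple[float, float]:
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - TARGET_RATIO)


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Hypothetical counter: words inside straight double quotes vs. all words."""
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return dialogue, total


# Values copied from the Meta Data column: wordsDialogue and wordsTotal.
ratio, deviation = ratio_and_deviation(225, 475)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

sample = 'She said "hello there" and left.'
print(count_dialogue_words(sample))
```

With 225 of 475 words in dialogue, the response lands about 17 percentage points above the 30% target, which is in line with the 0.00% score for that evaluator.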