NC Bench
Run: 2026-04-24-01KQ0G01D5Y6A537GRT3EZYCXS
Run Details
Model: Claude Opus 4.7 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 16.9s
Cost: $0.029590
Provider: Amazon Bedrock
Evaluation summary
Score      Evaluator
51.93%     Matches word count
23.82%     Dialogue to Total Word Ratio
37.8738%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The back booth of Dorothy's Diner smelled like burnt coffee and regret, which Molly felt was appropriate for the occasion. She slid the blueprint across the formica table. Bill squinted at it, chewing a toothpick like it owed him money. "That's a floor plan of a Chuck E. Cheese," Bill said. "It's a Pizza Palace." "It's the same thing, Molly." "It is legally and spiritually not the same thing, Bill, and if you'd read the briefing materials—" "You stapled them to a coloring book." "For camouflage." Bill sighed and flipped the blueprint over. On the back, Molly had drawn a stick figure labeled BILL holding a sack labeled LOOT. "Walk me through it," he said. "Okay. Fourteen-hundred hours, Saturday. Subject is Tyler Mackenzie Dupree, age seven, recently bar mitzvahed—" "Tyler's seven." "He's advanced. Point is, rich parents, soft defenses, approximately thirty guests, each arriving with a wrapped present in the two-hundred-dollar range." "Two hundred dollars? For a seven-year-old?" "Welcome to the Upper West Side, William." Bill whistled low. "That's six grand in LEGOs." "That's six grand in resale LEGOs. We're looking at four, maybe five, after fencing." "Who fences LEGOs?" "You'd be surprised. There's a guy in Queens named Kenny. He goes by The Brickmaster." "I hate this city." Molly leaned in. "Entry point is the ball pit. We wade in around thirteen-thirty, submerge, wait for the chaos." "I'm not getting in a ball pit." "Bill." "Molly, do you know what's in a ball pit? I saw a documentary. There's hair. There's teeth. A kid once found a live possum." "That was a viral post, not a documentary." "The possum was real to me, Molly." She pinched the bridge of her nose. "Fine. New plan. You go in as the clown." "Absolutely not." "You have the face for it." "I'm choosing to ignore that." "Bill, somebody has to distract the kids while I stuff the presents into the bouncy castle." "Why is the bouncy castle involved?" "It deflates. We zip it up. 
Thirty presents, one sad pile of vinyl, we wheel it out the fire exit before anyone's done singing." Bill stared at her for a long moment. The toothpick drooped. "That's actually kind of beautiful." "Thank you." "Deeply illegal. But beautiful." "I've been working on it." "What about the parents?" "Mrs. Dupree will be drinking prosecco from a sippy cup by one-fifteen. Mr. Dupree is on a call with his mistress the entire party. Trust me, I've been casing this family for three weeks." "You've been what?" "Don't worry about it." Bill pocketed the blueprint, then the coloring book, then, after some hesitation, the crayon. "One question." "Shoot." "When we get out—are we giving any of the presents back?" Molly looked genuinely wounded. "Bill. We're thieves. Not monsters." "So that's a no." "That's a firm no." He nodded slowly, stood up, and dropped a five on the table. "I'll get the clown shoes."
created_at
"2026-04-24T19:38:17.58Z"
4/24/2026, 7:38:17 PM
model
"anthropic/claude-4.7-opus-20260416"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
902
moderation_latency
(null)
generation_time
16565
tokens_prompt
77
tokens_completion
804
native_tokens_prompt
98
native_tokens_completion
1164
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
usage
0.02959
router
(null)
provider_responses
0
endpoint_id
"0fa65117-00d9-4835-808d-9b68041d9ce4"
id
"ec5291a3-970d-4162-b707-90669964e59e"
is_byok
false
latency
902
model_permaslug
"anthropic/claude-4.7-opus-20260416"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777059497-OVlWNygcS0NrXzET10z1"
session_id
(null)
api_type
"completions"
id
"gen-1777059497-FBVDwSiJL5t0fI5T7QeP"
upstream_id
"ec5291a3-970d-4162-b707-90669964e59e"
total_cost
0.02959
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
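The metadata fields above can be turned into a couple of derived figures. The following is a minimal sketch, not part of the benchmark itself: it assumes `generation_time` is in milliseconds and `total_cost` is in US dollars (neither unit is stated in the listing), and the function names are hypothetical.

```python
# Values copied from the metadata listing above.
meta = {
    "generation_time": 16565,          # assumed to be milliseconds
    "native_tokens_prompt": 98,
    "native_tokens_completion": 1164,
    "total_cost": 0.02959,             # assumed to be US dollars
}


def tokens_per_second(m: dict) -> float:
    """Completion throughput: native completion tokens per second of generation."""
    return m["native_tokens_completion"] / (m["generation_time"] / 1000)


def cost_per_thousand_tokens(m: dict) -> float:
    """Blended cost per 1,000 native tokens (prompt and completion combined)."""
    total_tokens = m["native_tokens_prompt"] + m["native_tokens_completion"]
    return m["total_cost"] / total_tokens * 1000


if __name__ == "__main__":
    print(f"Throughput: {tokens_per_second(meta):.2f} tokens/s")
    print(f"Blended cost: ${cost_per_thousand_tokens(meta):.5f} per 1,000 tokens")
```

The blended figure mixes prompt and completion tokens, which are usually priced differently, so it is only a rough per-run indicator.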
Evaluation details
Result     Evaluator                      Details                           Meta Data
51.93%     Matches word count             n/a                               neededClean: false; words: 484
23.82%     Dialogue to Total Word Ratio   Ratio: 75.36%, Deviation: 5.36%   neededClean: false; wordsTotal: 491; wordsDialogue: 370
37.8738%   Overall
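The reported ratio and deviation can be reproduced from the word counts in the evaluation details. This is a minimal sketch under stated assumptions: the 70% target comes from the scenario, the deviation is taken to be the absolute difference from that target, and the overall score is assumed to be the plain mean of the two evaluator scores. How each evaluator maps its measurement to a percentage score is not shown on the page, so it is not modelled here. All function names are hypothetical.

```python
TARGET_RATIO = 0.70  # from the scenario: 70% of words should be dialogue


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that fall inside dialogue."""
    return words_dialogue / words_total


def deviation(ratio: float, target: float = TARGET_RATIO) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


def mean_score(scores: list[float]) -> float:
    """Plain average of evaluator scores (assumed aggregation rule)."""
    return sum(scores) / len(scores)


if __name__ == "__main__":
    ratio = dialogue_ratio(words_dialogue=370, words_total=491)
    print(f"Ratio: {ratio:.2%}, Deviation: {deviation(ratio):.2%}")

    # The displayed scores are rounded, so their mean only approximates
    # the reported overall value of 37.8738%.
    print(f"Mean of displayed scores: {mean_score([51.93, 23.82]):.3f}%")
```

The mean of the rounded scores is 37.875%, which is consistent with the reported 37.8738% if the overall value is averaged from unrounded scores. Note also that the two evaluators report different word totals (484 and 491), which suggests they count words differently.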