NC Bench
Run: 2026-04-24-01KPZTYDZZPDEY9VCWV4EEPYGE
Run Details
Model: Claude Opus 4.7
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 15.2s
Cost: $0.029315
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator
60.28% | Matches word count
0.00% | Dialogue to Total Word Ratio
30.1376% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly unfurled the blueprint across the kitchen table, weighing down the corners with two mugs, a stapler, and a half-eaten jar of peanut butter. Bill leaned in, squinting like a man who had forgotten his glasses but refused to admit it. "Okay," Molly said, tapping the paper. "The target is a bouncy castle, a piñata, and approximately thirty-seven wrapped gifts." "Thirty-seven?" Bill asked. "That's oddly specific." "I counted the RSVPs on the fridge." "You broke into their fridge already?" "I was thirsty," Molly said. "Focus. The birthday girl turns seven on Saturday. Her name is Penelope. She likes unicorns, glitter, and, according to the invitation, 'reasonable conversation.'" "She sounds terrifying." "She is the mark, Bill. Don't humanize her." Bill scratched his stubble. "What's the security situation?" "Two parents, one grandmother, and a clown named Bobo." "Bobo's a problem," Bill said gravely. "I worked a gig in Tulsa with a Bobo. Man could juggle knives with his feet." "This is a different Bobo." "How do you know?" "Because that Bobo is in prison, Bill. We put him there." "Right. Right." Bill nodded, rubbing his temple. "So what's the play?" "We go in disguised as entertainers. I'm the magician. You're the assistant." "Why am I the assistant?" "Because last time you did magic, you set a dove on fire." "The dove was fine." "The dove was *medium*, Bill." He muttered something about artistic differences and stabbed a finger at the blueprint. "And the presents?" "Stacked on the gift table in the living room. While I'm pulling scarves out of my sleeve, you load them into the bottomless trunk." "I love the bottomless trunk." "Everyone loves the bottomless trunk." "What about the kids?" Molly looked up, concerned. "What about them?" "Well, thirty kids are gonna notice when the birthday loot vanishes." "That's where the finale comes in." She produced a small remote with a single red button.
"I press this, confetti cannons detonate, smoke machine kicks in, and we exit through the patio. By the time the smoke clears, we're in the van eating sheet cake." "You're bringing the cake?" "I am absolutely bringing the cake." Bill whistled, impressed. "And the parents?" "Distracted. The dad's already three mimosas deep by noon, according to the neighbor I chatted up at the mailbox." "You chatted up a neighbor?" "Her name is Denise. She hates Penelope's mother and gave me the Wi-Fi password." "Molly, you are a magnificent human being." "I know." She rolled up the blueprint. "Saturday. Eleven sharp. Wear something sparkly but nonthreatening." "Define nonthreatening." "No sequined skulls this time, Bill." "That was one birthday party." "It was a baptism." Bill sighed, slipping the remote into his coat. Molly shouldered her bag, already humming a little tune that sounded suspiciously like *Happy Birthday* in a minor key. "One last thing," she said at the door. "If anyone offers you a juice box, take it. We're supposed to blend in."
created_at
"2026-04-24T13:30:24.654Z"
4/24/2026, 1:30:24 PM
model
"anthropic/claude-4.7-opus-20260416"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
753
moderation_latency
(null)
generation_time
15089
tokens_prompt
77
tokens_completion
815
native_tokens_prompt
98
native_tokens_completion
1153
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
usage
0.029315
router
(null)
provider_responses
0
endpoint_id
"0fa65117-00d9-4835-808d-9b68041d9ce4"
id
"51e7d9cc-f7e3-4839-8bf9-38c56fdc6502"
is_byok
false
latency
753
model_permaslug
"anthropic/claude-4.7-opus-20260416"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777037424-VsbnktDH18Lh9ClvlJZ5"
session_id
(null)
api_type
"completions"
id
"gen-1777037424-QfGLLnZdOgpaSXcBL95R"
upstream_id
"51e7d9cc-f7e3-4839-8bf9-38c56fdc6502"
total_cost
0.029315
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
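The token counts, generation time, and cost in the metadata above can be cross-checked with a little arithmetic. The sketch below is illustrative only: the per-million-token rates are an assumption (they are not stated on this page) chosen because they reproduce the reported total, and the function names are hypothetical.

```python
# Values copied from the metadata above.
NATIVE_TOKENS_PROMPT = 98
NATIVE_TOKENS_COMPLETION = 1153
GENERATION_TIME_MS = 15089
REPORTED_COST_USD = 0.029315

# Assumed rates in USD per million tokens (not stated on this page).
ASSUMED_INPUT_USD_PER_MTOK = 5.0
ASSUMED_OUTPUT_USD_PER_MTOK = 25.0


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate request cost from native token counts and the assumed rates."""
    input_cost = prompt_tokens * ASSUMED_INPUT_USD_PER_MTOK / 1_000_000
    output_cost = completion_tokens * ASSUMED_OUTPUT_USD_PER_MTOK / 1_000_000
    return input_cost + output_cost


def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Completion throughput in native tokens per second."""
    return completion_tokens / (generation_time_ms / 1000)


estimated = estimate_cost_usd(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION)
throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
print(f"estimated cost: ${estimated:.6f} (reported: ${REPORTED_COST_USD:.6f})")
print(f"throughput: {throughput:.1f} tokens/s")
```

Under these assumed rates the estimate agrees with the reported `total_cost`; with different rates it would not, so the match should be read as a consistency check rather than confirmation of the provider's pricing.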
Evaluation details
Result | Evaluator | Details | Meta Data
60.28% | Matches word count | n/a | neededClean: false, words: 485
0.00% | Dialogue to Total Word Ratio | Ratio: 73.42%, Deviation: 23.42% | neededClean: false, wordsTotal: 489, wordsDialogue: 359
30.1376% | Overall | |
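The reported ratio and deviation follow from the two word counts in the metadata: 359 of 489 words is about 73.42%, which is 23.42 points away from the 50% target. A minimal Python sketch of that arithmetic is shown here. The quote-matching counter is only an assumption about how dialogue words might be identified (the evaluator's actual tokenization is not shown on this page), and all function names are hypothetical. How the deviation maps to the 0.00% score is also not shown, so it is not modeled.

```python
import re

TARGET_RATIO = 0.50  # the scenario asks for 50% dialogue


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words).

    Assumption: dialogue is whatever sits between straight double quotes,
    and words are whitespace-separated tokens.
    """
    total = len(text.split())
    dialogue = sum(len(span.split()) for span in re.findall(r'"([^"]*)"', text))
    return dialogue, total


def ratio_and_deviation(
    dialogue: int, total: int, target: float = TARGET_RATIO
) -> tuple[float, float]:
    """Return (ratio, absolute deviation from target), both in percent."""
    ratio = 100.0 * dialogue / total
    return round(ratio, 2), round(abs(ratio - 100.0 * target), 2)


# Word counts taken from the evaluation metadata.
ratio, deviation = ratio_and_deviation(359, 489)
print(f"Ratio: {ratio}%, Deviation: {deviation}%")
```

The overall figure is also consistent with a plain average of the two evaluator scores, (60.28 + 0.00) / 2 = 30.14, which matches the displayed 30.1376% to two decimals. That averaging rule is inferred from the numbers rather than stated by the benchmark.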