NC Bench
Run: 2026-04-24-01KQ0GEPCH9640DH95EXCYY3EH
Run Details
Model: Claude Opus 4.7 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 15.4s
Cost: $0.028590
Provider: Amazon Bedrock
Evaluation summary
Matches word count: 100.00%
Dialogue to Total Word Ratio: 36.30%
Overall: 68.1487%
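The page does not state how the overall score is aggregated. The two evaluator scores shown in the summary appear consistent with a plain unweighted mean, and the sketch below checks that reading. The function name `unweighted_mean` and the averaging rule itself are assumptions, not something the benchmark documents here.

```python
# Evaluator scores as displayed in the summary (percent, rounded to 2 decimals).
SCORES = {
    "Matches word count": 100.00,
    "Dialogue to Total Word Ratio": 36.30,
}
REPORTED_OVERALL = 68.1487  # percent, as displayed

def unweighted_mean(scores: dict[str, float]) -> float:
    """ASSUMPTION: the overall score is the simple average of evaluator scores."""
    return sum(scores.values()) / len(scores)

overall_estimate = unweighted_mean(SCORES)
# The displayed 36.30% is rounded, so the estimate can differ from the
# reported overall by up to half of that rounding step (0.005 / 2 per score).
gap = abs(overall_estimate - REPORTED_OVERALL)
print(f"estimate={overall_estimate:.4f}%  reported={REPORTED_OVERALL}%  gap={gap:.4f}")
```

The small gap is what one would expect if the ratio score is stored with more decimals than the page displays.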
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprints across the sticky diner table and weighed down the corners with ketchup bottles. Bill slurped his milkshake with the focus of a man defusing a bomb. "Stop that," she said. "Stop what?" "The slurping. You sound like a bathtub draining." Bill set the milkshake down with wounded dignity. "I slurp when I strategize. It's my process." "Your process is disgusting. Now look. The Pemberton kid turns seven on Saturday. Bouncy castle here, piñata here, and a pile of presents the size of a Volkswagen right here by the gazebo." "How many kids we talking?" "Twenty-two confirmed RSVPs." Bill whistled. "Twenty-two little goblins. Molly, that's a lot of sticky hands." "Which is why we go in disguised." "As what, exorcists?" "As entertainers." She slid a flyer across the table. "Giggles and Wiggles, the Party Clown Duo. I already called and canceled them. Told the mom I had pinkeye." Bill stared at the flyer. "You want me to be a clown." "I want you to be Wiggles." "Why am I Wiggles? Why aren't you Wiggles?" "Because I'm Giggles. Giggles is the brains. Wiggles does balloon animals." "I don't know how to make balloon animals, Molly." "Everything's a snake, Bill. If they ask for a giraffe, it's a tall snake. If they ask for a dog, it's a short snake. Nobody's going to fact-check a clown." Bill rubbed his temples. "And how exactly are we getting the presents out? We can't just waltz off with a Volkswagen of loot while juggling." "That's where the magic trick comes in." Molly tapped a big red X on the blueprint. "At three o'clock sharp, I announce the grand finale. The Disappearing Treasure Chest." "We don't have a treasure chest." "We have a U-Haul." "Molly." "Painted like a treasure chest." "Molly, a U-Haul is not going to fit in a backyard." "It doesn't need to fit in the backyard, genius, it needs to fit in the driveway. I tell the kids to close their eyes and count to a hundred. You load the presents through the side gate. Boom. Magic." 
Bill chewed his straw. "Kids can't count to a hundred." "Then they'll count to ten eight times. I don't care. The point is their eyes are closed." "And the parents?" "Drunk. It's a Saturday. Have you ever been to a kid's birthday party? The adults are three mimosas deep before the candles are lit." Bill considered this. It had the terrible, shimmering logic of all Molly's plans—the kind that sounded airtight right up until someone was running from a Labrador with a balloon sword in their teeth. "What about the cake?" he asked. "What about it?" "Are we stealing the cake?" "Bill, we are professionals." "So that's a yes." Molly smiled and rolled up the blueprints. "That's a maybe. Depends on the flavor." She stood, dropped a twenty on the table, and clapped him on the shoulder. "Get some sleep, Wiggles. Tomorrow we buy a U-Haul and some very large shoes."
created_at: "2026-04-24T19:46:17.621Z" (4/24/2026, 7:46:17 PM)
model: "anthropic/claude-4.7-opus-20260416"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 685
moderation_latency: (null)
generation_time: 15338
tokens_prompt: 77
tokens_completion: 777
native_tokens_prompt: 98
native_tokens_completion: 1124
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
usage: 0.02859
router: (null)
provider_responses
0
endpoint_id: "0fa65117-00d9-4835-808d-9b68041d9ce4"
id: "37e20544-ec2b-475c-906e-b7a7da3dad02"
is_byok: false
latency: 685
model_permaslug: "anthropic/claude-4.7-opus-20260416"
provider_name: "Amazon Bedrock"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777059977-FfVVZS294rlVzgv4Xabr"
session_id: (null)
api_type: "completions"
id: "gen-1777059977-g37o67Lj2fkGVOr3RfbI"
upstream_id: "37e20544-ec2b-475c-906e-b7a7da3dad02"
total_cost: 0.02859
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
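As a rough cross-check of the metadata above, the recorded cost and generation time can be related to the native token counts. The sketch below is illustrative only: the per-million-token rates are hypothetical values chosen because they reproduce the recorded total, and the source states neither the rates nor that billing is based on the native counts. It also assumes generation_time is in milliseconds, which matches the 15.4s duration shown in the run details.

```python
# Values copied from the run metadata.
NATIVE_TOKENS_PROMPT = 98
NATIVE_TOKENS_COMPLETION = 1124
GENERATION_TIME_MS = 15338      # ASSUMPTION: unit is milliseconds
RECORDED_TOTAL_COST = 0.02859   # USD

# ASSUMPTION: hypothetical rates in USD per million tokens. They are not
# given anywhere in the source; they are picked to test one possible reading.
ASSUMED_USD_PER_M_INPUT = 5.0
ASSUMED_USD_PER_M_OUTPUT = 25.0

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Cost under the assumed linear per-token pricing."""
    return (prompt_tokens * ASSUMED_USD_PER_M_INPUT
            + completion_tokens * ASSUMED_USD_PER_M_OUTPUT) / 1_000_000

def tokens_per_second(completion_tokens: int, generation_ms: int) -> float:
    """Average output throughput over the whole generation."""
    return completion_tokens / (generation_ms / 1000)

estimated_cost = estimate_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION)
throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
print(f"estimated cost: ${estimated_cost:.5f} (recorded: ${RECORDED_TOTAL_COST})")
print(f"throughput: {throughput:.1f} native completion tokens/s")
```

If the estimate and the recorded total agree, that supports the reading that cost was computed from the native token counts rather than the normalized tokens_prompt and tokens_completion values, but it does not prove it.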
Evaluation details
Matches word count
Result: 100.00%
Details: n/a
Meta Data: neededClean: false, words: 499
Dialogue to Total Word Ratio
Result: 36.30%
Details: Ratio: 74.50%, Deviation: 4.50%
Meta Data: neededClean: false, wordsTotal: 502, wordsDialogue: 374
Overall: 68.1487%
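The ratio and deviation reported by the dialogue evaluator follow directly from its word counts and the 70% target in the prompt. The sketch below reproduces that arithmetic and adds a simple quote-based word counter. The counter (`count_dialogue_words`, whitespace splitting, straight double quotes only) is a hypothetical stand-in, not the benchmark's actual tokenizer; note that the two evaluators themselves report different totals (499 and 502 words), so an independent counter may not match either exactly. How the 4.50% deviation maps to the 36.30% score is not stated on the page and is not modeled here.

```python
import re

TARGET_RATIO = 0.70    # from the prompt: "70% words of dialogue"
WORDS_DIALOGUE = 374   # evaluator metadata
WORDS_TOTAL = 502      # evaluator metadata

def ratio_and_deviation(dialogue: int, total: int, target: float) -> tuple[float, float]:
    """Return (dialogue share, absolute deviation from target) as fractions."""
    ratio = dialogue / total
    return ratio, abs(ratio - target)

def count_dialogue_words(text: str) -> tuple[int, int]:
    """ASSUMPTION: dialogue is any text between straight double quotes,
    and words are whitespace-separated tokens."""
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return dialogue, total

ratio, deviation = ratio_and_deviation(WORDS_DIALOGUE, WORDS_TOTAL, TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Usage on a short excerpt from the model response.
sample = 'Bill chewed his straw. "Kids can\'t count to a hundred."'
sample_dialogue, sample_total = count_dialogue_words(sample)
print(sample_dialogue, sample_total)
```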