NC Bench
Run: 2026-04-24-01KPZTZ1Z5Y4RYDB0S2G0KAC7K
Run Details
Model: Claude Opus 4.7
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 17.3s
Cost: $0.029490
Provider: Amazon Bedrock
Evaluation summary
Score      Evaluator
75.16%     Matches word count
2.91%      Dialogue to Total Word Ratio
39.0320%   Overall
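The Overall figure appears consistent with an unweighted mean of the two evaluator scores. The sketch below checks that reading. The averaging rule is an inference from the displayed numbers, not something the page states, and `overall_score` is a hypothetical helper introduced for illustration.

```python
from statistics import mean

# Scores as displayed on the page, in percent. They are rounded to two
# decimals, so the recomputed mean can differ slightly from the reported one.
evaluator_scores = {
    "Matches word count": 75.16,
    "Dialogue to Total Word Ratio": 2.91,
}
reported_overall = 39.0320


def overall_score(scores):
    """Unweighted mean of evaluator scores (assumed aggregation rule)."""
    return mean(scores.values())


recomputed = overall_score(evaluator_scores)
difference = abs(recomputed - reported_overall)
print(f"recomputed={recomputed:.4f} reported={reported_overall:.4f} diff={difference:.4f}")
```

The small gap between the two values is within what rounding of the displayed component scores would allow, which is why the mean is a plausible but unconfirmed reading.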
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly unrolled the blueprint across the hood of the minivan, smoothing it with the reverence of a surgeon preparing for open-heart surgery. Bill squinted at it, chewing the end of a glow stick. "Is that a Chuck E. Cheese floor plan?" "It's a Pizza Palace floor plan, Bill. There's a difference." "What's the difference?" "About forty cents a slice and a rat with lower self-esteem. Focus." Bill straightened. "Okay. Walk me through it." "The party's in the Birthday Bonanza Room. Capacity: twenty screaming children, six exhausted parents, and one clown named Giggles who, according to my sources, has a warrant out in three states." "Sources?" "Yelp reviews. Anyway, presents are stacked here—" she tapped the map, "—on the gift table. We go in at fourteen-hundred hours, right between the piñata and the cake." "Why then?" "Because, Bill, that's the window of maximum chaos. Sugar's hit. Parents are distracted. One kid is always crying. Another is always throwing up. It's the perfect fog of war." "I love it when you talk tactical." "Don't." Bill scratched his chin. "What's my role?" "You're Uncle Ron." "Who's Uncle Ron?" "Nobody. That's the point. Every kid's party has an Uncle Ron nobody remembers inviting. He stands by the snacks, eats too much ranch, and nobody questions him." "I can do that. I eat a lot of ranch organically." "I'm aware." "And you?" Molly produced a sequined visor from her jacket. "I'm the entertainment. Sparkle the Balloon Lady." "You don't know how to make balloon animals." "I know how to make one balloon animal." "Which one?" "A snake." "Molly, a snake is just a balloon." "It's a long balloon, Bill. Children are stupid. They'll love it." "And while you're twisting your snake—" "Please don't say it like that." "—I'm loading presents into the duffel?" "Into the bouncy castle." Bill blinked. "Into the what now?" "The bouncy castle deflates at fifteen-hundred. We've paid off the rental guy. 
Fifty bucks and a promise we'd never tell his wife about the woman in Reno." "How do you know about the woman in Reno?" "I don't. But he doesn't know that." "You beautiful monster." "The castle collapses around the loot. Rental guy rolls it up, tosses it in his truck, drives to the rendezvous. We meet him behind the Arby's." "Why Arby's?" "Because nobody ever witnesses anything at an Arby's, Bill. It's a legal gray zone." "And the kid? The birthday kid?" Molly's face softened for half a second, then hardened like a Jolly Rancher left in a car. "Collateral sadness." "That's cold." "He's turning eight. He's getting a PlayStation, a drone, and what looked like actual gold from the grandparents. He'll survive. Builds character." "You had a rough childhood, didn't you?" "I had birthday parties at a Pizza Palace, Bill. What do you think?" Bill nodded solemnly and tucked the glow stick behind his ear. "For the children." "For the children's *presents*." "Right. That."
created_at: "2026-04-24T13:30:45.107Z" (4/24/2026, 1:30:45 PM)
model: "anthropic/claude-4.7-opus-20260416"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 897
moderation_latency: (null)
generation_time: 17230
tokens_prompt: 77
tokens_completion: 813
native_tokens_prompt: 98
native_tokens_completion: 1160
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
usage: 0.02949
router: (null)
provider_responses:
  0:
    endpoint_id: "0fa65117-00d9-4835-808d-9b68041d9ce4"
    id: "7b057563-42fb-4212-a465-5937e28082c5"
    is_byok: false
    latency: 896
    model_permaslug: "anthropic/claude-4.7-opus-20260416"
    provider_name: "Amazon Bedrock"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777037445-40guY49cjVrzsqkVHsxc"
session_id: (null)
api_type: "completions"
id: "gen-1777037445-0y3kZuZwaBzmFm7SAwyz"
upstream_id: "7b057563-42fb-4212-a465-5937e28082c5"
total_cost: 0.02949
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
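As a rough cross-check of the metadata, the recorded cost can be reproduced from the native token counts. This is a sketch under stated assumptions: the per-million-token rates below are not listed anywhere on the page and were chosen only because they reproduce the recorded total_cost, and generation_time is assumed to be in milliseconds because it lines up with the 17.3s duration. `estimate_cost` is a hypothetical helper.

```python
# Assumed USD rates per million tokens. These are NOT stated in the run
# record; they are illustrative values that happen to match total_cost.
ASSUMED_INPUT_RATE = 5.00
ASSUMED_OUTPUT_RATE = 25.00

# Values copied from the run metadata.
native_tokens_prompt = 98
native_tokens_completion = 1160
generation_time_ms = 17230  # assumed to be milliseconds
recorded_total_cost = 0.02949


def estimate_cost(prompt_tokens, completion_tokens,
                  input_rate=ASSUMED_INPUT_RATE,
                  output_rate=ASSUMED_OUTPUT_RATE):
    """Estimate cost in USD from token counts and per-million-token rates."""
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


estimated_cost = estimate_cost(native_tokens_prompt, native_tokens_completion)

# Completion tokens generated per second of generation time.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"estimated cost: ${estimated_cost:.5f} (recorded: ${recorded_total_cost:.5f})")
print(f"throughput: {tokens_per_second:.1f} native completion tokens/s")
```

The check uses the native token counts rather than tokens_prompt and tokens_completion, since only the native counts reproduce the recorded cost under these assumed rates.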
Evaluation details
Result     Evaluator                      Details                           Meta Data
75.16%     Matches word count             n/a                               neededClean: false, words: 487
2.91%      Dialogue to Total Word Ratio   Ratio: 78.41%, Deviation: 8.41%   neededClean: false, wordsTotal: 491, wordsDialogue: 385
39.0320%   Overall
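The ratio and deviation reported for the dialogue evaluator follow from the word counts in its metadata and the 70% target in the scenario. The sketch below reproduces that arithmetic. It also includes `count_words`, a hypothetical counter that treats text inside straight double quotes as dialogue; the evaluator's actual tokenization is not shown on the page and may differ (the two evaluators report 487 and 491 total words for the same text). How the 8.41% deviation maps to the 2.91% score is not documented here, so it is not modeled.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: 70% dialogue

# Values copied from the evaluator metadata.
words_dialogue = 385
words_total = 491


def dialogue_ratio(dialogue_words, total_words):
    """Share of all words that are dialogue."""
    return dialogue_words / total_words


def count_words(text):
    """Hypothetical counter: returns (dialogue_words, total_words).

    Words are whitespace-separated tokens; dialogue is any text between
    a pair of straight double quotes.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return dialogue, total


ratio = dialogue_ratio(words_dialogue, words_total)
deviation = ratio - TARGET_RATIO
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Usage example on a short excerpt of the response.
sample = 'Bill straightened. "Okay. Walk me through it."'
print(count_words(sample))
```

A positive deviation means the response contains more dialogue than the target, which is the case in this run.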