NC Bench
Run: 2026-04-24-01KQ03D1J2FJQEQ4NDBVW5JRAQ
Run Details
Model: GPT-OSS 120B
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 6.8s
Cost: $0.001346
Provider: Amazon Bedrock
Evaluation summary
Score     Evaluator
99.38%    Matches word count
97.09%    Dialogue to Total Word Ratio
98.2323%  Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill crouch behind the punch bowl, scheming. Molly: "Okay, Bill, the birthday cake is the perfect distraction." Bill: "Got it, I’ll juggle the clown balloons while you slip in." Molly: "Sneak past the piñata, grab the gift table, then vanish." Bill: "What about the kids? They’ll scream." Molly: "We’ll wear clown masks, they’ll think it’s part of the show." Bill: "Brilliant, and the sugar rush will keep them busy." Molly: "Timing is key—mid‑song, when the magician says ‘Abracadabra.’" Bill: "I’ll set off confetti cannons, then we’ll tip‑toe to the presents." Molly: "Don’t forget the Mom’s handbag—her credit cards are the real loot." Bill: "Got it, I’ll pocket those before the cake is cut." Molly: "Remember, no glitter left behind—clean up is optional." Bill: "If we’re caught, we’ll blame the clown." Molly: "Perfect, the kids love clowns, the parents fear them." Bill: "What’s our exit strategy?" Molly: "Slide down the slide, then dash through the backyard gate." Bill: "And the getaway car is parked at the neighbor’s birthday cake stand." Molly: "Great, I’ll bring the sack, you bring the grin." Bill: "Ready when you are, partner in crime." Molly: "On three—one, two, three—let's raid the party!" Bill: "Heist time!"
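The page does not say how the evaluator separates dialogue from narration. As a minimal sketch, assuming dialogue means text between straight double quotes and a word is any whitespace-separated token, the split could be counted as below. The function name `count_dialogue_words` and the short sample string are hypothetical, not taken from the benchmark. The assumption is at least consistent with the reported figures: the response has a nine-word opening sentence plus twenty speaker tags, which matches the 29 non-dialogue words implied by 205 total and 176 dialogue words.

```python
import re


def count_dialogue_words(text):
    """Return (dialogue_words, total_words) for a scene.

    Assumption: dialogue is whatever sits between straight double quotes,
    and words are whitespace-separated tokens.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    return dialogue_words, total_words


# Hypothetical two-line sample in the same style as the response above.
sample = 'Molly: "Ready when you are." Bill: "Heist time!"'
dialogue, total = count_dialogue_words(sample)
print(f"{dialogue} of {total} words are dialogue ({dialogue / total:.0%})")
```

In this sketch the speaker tags ("Molly:", "Bill:") count toward the total but not toward dialogue, which is why a script-style scene loses ground against a 90% target even when it contains almost no narration.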
created_at: "2026-04-24T15:58:12.059Z" (4/24/2026, 3:58:12 PM)
model: "openai/gpt-oss-120b"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 6080
moderation_latency: (null)
generation_time: 6667
tokens_prompt: 77
tokens_completion: 2139
native_tokens_prompt: 127
native_tokens_completion: 2212
native_tokens_completion_images: (null)
native_tokens_reasoning: 1785
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
usage: 0.00134625
router: (null)
provider_responses:
  0:
    endpoint_id: "b3f80c05-e31a-44f6-b63f-414467b87553"
    id: "447e71cb-65c1-4185-89d0-ffe79902039d"
    is_byok: false
    latency: 310
    model_permaslug: "openai/gpt-oss-120b"
    provider_name: "Amazon Bedrock"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777046292-vDXJbPJmhgukq7aYoxaf"
session_id: (null)
api_type: "completions"
id: "gen-1777046292-0qh7rJjFMEylSEXMJBSm"
upstream_id: "447e71cb-65c1-4185-89d0-ffe79902039d"
total_cost: 0.00134625
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
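The metadata reports token counts and a total cost but not the per-token rates behind it. The sketch below shows the arithmetic under an assumption: rates of $0.15 per million prompt tokens and $0.60 per million completion tokens. These rates are not stated on the page; they are used here only because they reproduce the reported total_cost from the native token counts. The helper name `estimate_cost` is hypothetical.

```python
# Values copied from the metadata above.
NATIVE_TOKENS_PROMPT = 127
NATIVE_TOKENS_COMPLETION = 2212
NATIVE_TOKENS_REASONING = 1785
REPORTED_TOTAL_COST = 0.00134625

# Assumed rates in USD per million tokens (not stated on the page).
ASSUMED_PROMPT_RATE = 0.15
ASSUMED_COMPLETION_RATE = 0.60


def estimate_cost(prompt_tokens, completion_tokens, prompt_rate, completion_rate):
    """Return the estimated request cost in USD for per-million-token rates."""
    return (prompt_tokens * prompt_rate + completion_tokens * completion_rate) / 1_000_000


estimated = estimate_cost(
    NATIVE_TOKENS_PROMPT,
    NATIVE_TOKENS_COMPLETION,
    ASSUMED_PROMPT_RATE,
    ASSUMED_COMPLETION_RATE,
)
reasoning_share = NATIVE_TOKENS_REASONING / NATIVE_TOKENS_COMPLETION

print(f"estimated cost: ${estimated:.8f} (reported: ${REPORTED_TOTAL_COST:.8f})")
print(f"reasoning share of completion tokens: {reasoning_share:.1%}")
```

Under these assumed rates almost all of the cost comes from completion tokens, and roughly four fifths of those are reasoning tokens rather than the visible 205-word scene.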
Evaluation details
Result    Evaluator                     Details                          Meta Data
99.38%    Matches word count            n/a                              neededClean: false; words: 205
97.09%    Dialogue to Total Word Ratio  Ratio: 85.85%, Deviation: 4.15%  neededClean: false; wordsTotal: 205; wordsDialogue: 176
98.2323%  Overall
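The ratio and deviation reported for the dialogue evaluator follow directly from wordsDialogue and wordsTotal. A minimal sketch of that arithmetic is below, assuming the deviation is the absolute difference between the 90% target from the scenario and the measured ratio. The function name `dialogue_ratio` is hypothetical, and the page does not state how the deviation is mapped to the 97.09% score, so that step is left out.

```python
TARGET_RATIO = 0.90  # from the scenario: "Write 200 words with 90% dialogue"


def dialogue_ratio(words_dialogue, words_total):
    """Return the fraction of words that are dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


# wordsDialogue and wordsTotal as reported in the evaluation details.
ratio = dialogue_ratio(176, 205)
deviation = abs(TARGET_RATIO - ratio)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 85.85%, Deviation: 4.15%"
```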