NC Bench
Run: 2026-04-24-01KQ075Q23ZJ0DBZZ8NN568W7D
Run Details
Model: GPT-OSS 120B
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 1m 24s
Cost: $0.000856
Provider: Phala
Evaluation summary
Score | Evaluator
99.74% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.8722% | Overall
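The overall figure is consistent with a plain average of the two evaluator scores. The sketch below checks that arithmetic; treating the overall score as an unweighted mean is an assumption inferred from the numbers, not something the report states, and the `scores` dictionary is a hypothetical structure used only for illustration.

```python
from statistics import mean

# Evaluator scores as displayed in the summary (percent).
# Assumption: the overall score is the unweighted mean of these values.
scores = {
    "Matches word count": 99.74,
    "Dialogue to Total Word Ratio": 0.00,
}

overall = mean(scores.values())
print(f"Overall (from rounded scores): {overall:.2f}%")
```

The rounded inputs give 49.87%. The reported 49.8722% would correspond to an unrounded word-count score near 99.7444%, which is an inference rather than a value shown in the report.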
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The dim attic smelled of stale pizza; a flickering bulb cast shadows as Molly and Bill whispered conspiratorially about gifts. Molly: Okay, Bill, the birthday bash starts at seven, kids will be distracted by cake and clowns, perfect for us. Bill: Got it, Molly. I’ll pose as the pastry chef, sprinkle sugar, then tiptoe to the gift table for fun. Molly: Brilliant, but remember the piñata—once it explodes, chaos will cover our escape; we must time it right like clockwork. Bill: I’ll bring a remote‑controlled drone to buzz the balloons; kids will think it’s a magical unicorn that floats upward. Molly: And the presents—stacked in a pile, wrapped in glitter; we’ll load them into the giant clown car for quick. Bill: Quick, yes, but we need a distraction—maybe release a herd of rubber chickens from the hallway to confuse parents. Molly: Parents will chase squeaking birds, giving us minutes to scoop up every toy, candy, even the birthday hat today. Bill: After we’re out, we’ll split the loot at the warehouse, pizza slices for each, of course with extra cheese. Molly: Deal! If anything goes wrong, we blame the clown—he’s already got a painted nose and a shady alibi today.
created_at: "2026-04-24T17:04:06.223Z" (4/24/2026, 5:04:06 PM)
model: "openai/gpt-oss-120b"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 7467
moderation_latency: (null)
generation_time: 83411
tokens_prompt: 77
tokens_completion: 1611
native_tokens_prompt: 129
native_tokens_completion: 1720
native_tokens_completion_images: (null)
native_tokens_reasoning: 1295
native_tokens_cached: 64
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0008557
router: (null)
provider_responses:
  0:
    endpoint_id: "909fdbc8-1b02-46ea-a9fc-cc3cfe65d992"
    id: "chatcmpl-9f61e53ff23ddd97"
    is_byok: false
    latency: 7404
    model_permaslug: "openai/gpt-oss-120b"
    provider_name: "Phala"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777050246-MyBHUQxNEWAZSeK6UxOk"
session_id: (null)
api_type: "completions"
id: "gen-1777050246-7bIiT5UoMzPSUebh4afz"
upstream_id: "chatcmpl-9f61e53ff23ddd97"
total_cost: 0.0008557
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Phala"
response_cache_source_id: (null)
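A few derived figures help in reading the token counts above. The sketch below rests on two assumptions that the report does not confirm: that generation_time is in milliseconds (83411 ms is roughly consistent with the 1m 24s duration), and that native_tokens_reasoning is included in native_tokens_completion. The variable names are illustrative only.

```python
# Values copied from the run metadata.
generation_time_ms = 83411        # assumption: milliseconds
native_tokens_completion = 1720
native_tokens_reasoning = 1295    # assumption: counted within the completion total

# Tokens left over for the visible answer once reasoning is subtracted.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Share of completion tokens spent on reasoning.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Rough end-to-end generation throughput.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumptions, about three quarters of the completion tokens went to reasoning rather than to the 204-word scene.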
Evaluation details
Result | Evaluator | Details | Meta Data
99.74% | Matches word count | n/a | neededClean: false; words: 204
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 90.00% | neededClean: false; wordsTotal: 204; wordsDialogue: 0
49.8722% | Overall | |
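One plausible reading of wordsDialogue: 0 is that the evaluator counts only words inside quotation marks, while the response uses script-style speaker labels ("Molly: ...") with no quotes at all. The sketch below illustrates that reading; the `dialogue_ratio` function and its quote-based rule are hypothetical and are not taken from the NC Bench evaluator.

```python
import re

TARGET_RATIO = 0.90  # the scenario asks for 90% dialogue


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumption: only words enclosed in double quotes count as dialogue.
    """
    # Treat curly double quotes the same as straight ones.
    normalized = text.replace("\u201c", '"').replace("\u201d", '"')
    quoted_spans = re.findall(r'"([^"]*)"', normalized)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    total_words = len(normalized.split())
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# Script-style line as in the model response: no quotation marks.
script_style = "Molly: Okay, Bill, the birthday bash starts at seven."
# The same kind of line written with quoted speech and a dialogue tag.
quoted_style = '"Okay, Bill," said Molly.'

for sample in (script_style, quoted_style):
    words_dialogue, words_total, ratio = dialogue_ratio(sample)
    deviation = abs(TARGET_RATIO - ratio)
    print(words_dialogue, words_total, f"{ratio:.2%}", f"{deviation:.2%}")
```

With a rule like this, a response written entirely in speaker-label form scores a 0.00% ratio and a 90.00% deviation from the target, which matches the figures in the table.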