NC Bench
Run: 2026-04-24-01KQ0AZMWJXCMFN6QJQXDG3ZYD
Run Details
Model: GPT-OSS 120B
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 6m 21s
Cost: $0.001405
Provider: Novita
Evaluation summary
Score | Evaluator | Details Link
100.00% | Matches word count | (details)
100.00% | Dialogue to Total Word Ratio | (details)
100.0000% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The sun had barely risen, but Molly and Bill were already crouched behind the inflatable castle, whispering like squirrels on a caffeine binge. "We need the big red balloon to distract the kids, while you sneak to the gift table." Molly giggled, eyeing the towering cake. "I'll use the clown nose as a distraction device, then grab the shiny toys." The plan sounded as coherent as a jellybean philosophy lecture. "Don't forget the pie —if they see it, they'll think it's free dessert and leave the room." Molly tapped the clipboard, pretending to be a party organizer. "I'll announce a balloon pop contest, draw the kids outside, you sneak in, grab everything, including the puppet show props." Bill smirked, polishing his goofball mask. "What about the kid nap corner? We could hide there, pretend to be lost clowns, then make a grand exit with the loot." They both laughed, the sound echoing like squeaky shoes on a marble floor. "Ready to waltz through balloons and steal the slime?" As sunrise painted the sky with pastel candy colors, Molly and Bill slipped through the party tent, pockets bulging, while confused children chased a runaway clown nose, oblivious to the perfect heist.
created_at: "2026-04-24T18:10:41.694Z" (4/24/2026, 6:10:41 PM)
model: "openai/gpt-oss-120b"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 450
moderation_latency: (null)
generation_time: 381354
tokens_prompt: 77
tokens_completion: 5913
native_tokens_prompt: 129
native_tokens_completion: 5595
native_tokens_completion_images: (null)
native_tokens_reasoning: 5615
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0014052
router: (null)
provider_responses:
  0:
    endpoint_id: "3dc95f88-3960-4c91-b2bb-bbd2f5069c48"
    id: "bfa90cc677e1d2d509749f4bc7092f86"
    is_byok: false
    latency: 389
    model_permaslug: "openai/gpt-oss-120b"
    provider_name: "Novita"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777054241-zl0r9SBpIf0ZLxt0WXlA"
session_id: (null)
api_type: "completions"
id: "gen-1777054241-ZnsYHLYLNvNjzYzc4Hc0"
upstream_id: "bfa90cc677e1d2d509749f4bc7092f86"
total_cost: 0.0014052
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Novita"
response_cache_source_id: (null)
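The Duration shown under Run Details (6m 21s) lines up with the generation_time value of 381354 if that field is measured in milliseconds. The page does not state the unit, so that is an assumption. A minimal sketch of the conversion follows; the helper name format_duration is hypothetical and is not part of NC Bench.

```python
def format_duration(ms: int) -> str:
    """Format a millisecond count as 'Xm Ys', dropping fractional seconds.

    Assumes the input is in milliseconds (not confirmed by the source page).
    """
    total_seconds = ms // 1000
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


# generation_time taken from the run metadata above
print(format_duration(381354))
```

Under the millisecond assumption, 381354 ms is 381 whole seconds, which is 6 minutes and 21 seconds.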
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false; words: 200
100.00% | Dialogue to Total Word Ratio | Ratio: 50.00%, Deviation: 0.00% | neededClean: false; wordsTotal: 200; wordsDialogue: 100
100.0000%
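The page reports wordsTotal, wordsDialogue, a ratio, and a deviation, but it does not show how the evaluator computes them. The sketch below is one plausible approach, not the benchmark's actual implementation. It assumes words are whitespace-separated tokens, dialogue is any text between straight double quotes, and deviation is the absolute distance from the 50% target. The function name dialogue_ratio and the sample text are hypothetical.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumes straight double quotes delimit dialogue and that a word is
    any whitespace-separated token. Both are illustrative assumptions.
    """
    total_words = len(text.split())
    # Capture the text between each pair of straight double quotes.
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


sample = 'Bill nodded. "Grab the gifts," he said. "Then run."'
words_dialogue, words_total, ratio = dialogue_ratio(sample)
deviation = abs(ratio - 0.5)  # distance from the 50% dialogue target
print(words_dialogue, words_total, f"{ratio:.2%}", f"{deviation:.2%}")
```

A real evaluator would likely also need to handle curly quotes, nested quotation, and punctuation attached to words, which this sketch leaves out.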