NC Bench
Run: 2026-03-25-01KMJ81PYF9FR1NPNMAF5QV6K4
Run Details
Model: Qwen 3 32B
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 30.2s
Cost: $0.000420
Provider: Novita
Evaluation summary
Score     Evaluator
0.00%     Matches word count
0.16%     Dialogue to Total Word Ratio
0.0790%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind a towering inflatable castle, adjusting her clown nose while eyeing the room. Balloons bobbed like drunken ghosts above a cluster of birthday cakes, and a disco ball cast suspicious glints over the horde of shrieking tots. Bill, dressed as a caricature of a party clown with a lopsided afro, fidgeted with a remote control for the "professional" balloon artist he’d rented from a shady online vendor. “You sure the parents won’t miss all this stuff?” he muttered, gesturing to the mound of unwrapped gifts beneath the snack table. “They’ll be too busy arguing over whose kid ate the last cupcake,” Molly said, peeling off her false eyelashes and replacing them with a more subtle set. She’d stolen them from a cosmetics counter last week—*and now they’re stealing back*, she thought grimly. Bill tripped over a deflating zebra piñata, spilling confetti like a scene from a warzone. “Why do we need *two* people to rob a party? What if someone notices us?” “Someone’s always noticing,” Molly grunted, hauling him upright. “Just don’t blow anything up. Kids don’t handle explosions well.” As if on cue, a child in a dinosaur costume slammed into Bill, sending him toppling into a glitter cannon. The device detonated, showering the room with iridescent specks. Molly facepalmed while Bill babbled, “I thought that was activated by *motion*! Not *madness*!” When the chaos died down, they debated whether to hide the loot in the bounce house or the piñata. “The bounce house is too obvious,” Molly said. “Piñata’s already been hit. What’s the harm in one more body?” Bill stared at her, covered in cake frosting and existential dread. “Why do I get the feeling we’re the *present* here?” Unbeknownst to them, a toddler had tied their getaway rope to a helium tank. As they inched toward the door, the tank tugged free, dragging Molly upward by her hair. 
Bill, overcome by panic and the scent of punch(flavor: mystery), yelled, “I’M SORRY FOR EVERYTHING!” and tackled her into the nearest cake, swallowing their plans in a merge of strawberry filling and humiliation.
created_at: "2026-03-25T10:18:37.908Z"
model: "qwen/qwen3-32b-04-28"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 648
moderation_latency: (null)
generation_time: 30128
tokens_prompt: 93
tokens_completion: 1058
native_tokens_prompt: 73
native_tokens_completion: 916
native_tokens_completion_images: (null)
native_tokens_reasoning: 494
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0004195
router: (null)
provider_responses:
  0:
    endpoint_id: "d342f1c5-34b5-4a5c-b2ea-a0cdcbf72e70"
    id: "028afdbf-a54c-40fd-bf21-b8406607688b"
    is_byok: false
    latency: 647
    model_permaslug: "qwen/qwen3-32b-04-28"
    provider_name: "Novita"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1774433917-IiHg13Kl85mMAB5sZhWC"
upstream_id: "028afdbf-a54c-40fd-bf21-b8406607688b"
total_cost: 0.0004195
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Novita"
Evaluation details
Result    Evaluator                      Details                             Meta Data
0.00%     Matches word count             n/a                                 neededClean: false; words: 347
0.16%     Dialogue to Total Word Ratio   Ratio: 25.94%, Deviation: 15.94%    neededClean: false; wordsTotal: 347; wordsDialogue: 90
0.0790%   Overall
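The reported ratio and deviation can be reproduced from the word counts in the evaluation metadata. The sketch below is illustrative only: the function names, the whitespace-based word counting, and the quote-matching pattern are assumptions, since this page does not show how the benchmark counts words or detects dialogue. It also does not attempt to reproduce how the 0.16% score is derived from the deviation, which the page does not state.

```python
import re

# Assumption: dialogue is any text between straight or curly double quotes.
# The benchmark's actual detection rules are not shown on this page.
DIALOGUE_SPAN = re.compile(r'"[^"]*"|“[^”]*”')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of a word)."""
    return len(text.split())


def ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Dialogue words as a percentage of all words."""
    return 100.0 * words_dialogue / words_total if words_total else 0.0


def dialogue_stats(text: str, target_pct: float = 10.0) -> dict:
    """Return word counts, the dialogue ratio, and its distance from the target."""
    words_total = count_words(text)
    words_dialogue = sum(
        count_words(match.group(0)) for match in DIALOGUE_SPAN.finditer(text)
    )
    ratio = ratio_pct(words_dialogue, words_total)
    return {
        "words_total": words_total,
        "words_dialogue": words_dialogue,
        "ratio_pct": ratio,
        "deviation_pct": abs(ratio - target_pct),
    }


# Check against the counts reported above: wordsDialogue = 90, wordsTotal = 347.
reported_ratio = ratio_pct(90, 347)
print(round(reported_ratio, 2))              # 25.94
print(round(abs(reported_ratio - 10.0), 2))  # 15.94
```

With the reported counts, 90 / 347 gives 25.94%, which is 15.94 percentage points above the 10% target in the scenario. Running `dialogue_stats` on the response text itself may not return exactly 90 dialogue words, because the benchmark's tokenization may differ from this sketch.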