NC Bench
Run: 2026-03-25-01KMJ8JSRJFCD45HZZWCVWC17P
Run Details
Model: Qwen3 235B A22B Instruct 2507
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 20.3s
Cost: $0.000468
Provider: Novita
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.00%
Overall: 0.0000%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The attic above Tony’s Tiny Tacos smelled of old menus, desperation, and the faint ghost of deep-fried sadness. Dust motes danced in the single shaft of moonlight that managed to slice through a crack in the roof. Molly crouched beside a stack of expired coupons, adjusting her black ski mask, which she had ironically bedazzled with tiny pink rhinestones. Bill tugged at his own mask, which read “#1 GRANDPA” in cheerful red font. “I still don’t understand why it has to be a *children's party*,” he whispered, peering through a hole in the wall. Below, a bounce house trembled with the joyful ferocity of a sugar-fueled earthquake. Streamers flapped like surrender flags. “Because,” Molly said, rolling out a blueprint on the floor — a crayon drawing she had stolen from a 7-year-old’s backpack, “every kid gets at least three presents. Minimum. We're talking 45 children, Bill. That’s 135 items. At an average retail value of $18.50, not including gift cards?” “You factored in gift cards?” “Naturally. And emotional devastation. But that’s not liquid.” “So we’re stealing toys... and candy... and possibly a glitter bomb that goes off when someone opens a box?” “That's the risk you take in high-stakes gift larceny,” Molly said, snapping on fingerless gloves. “Besides, the clown is our ally.” “The clown?!” “His name is Bubbles. We went to clown college together. Long story involving a misplaced whoopee cushion and a restraining order.” “You’re telling me the weeping, grease-painted man currently getting elbowed in the groin by a child in a dinosaur costume is on our side?” “He’s a method performer. Plays the tragedy. Keeps the adults distracted. Watch.” Molly raised a walkie-talkie the size of a toaster. “Bubbles, status.” A crackle. A sob. “*She kicked me... and called me a fake Santa.* Mission compromised. Requesting backup emotional support and possibly Advil.” Molly lowered the walkie. “See? Distracted.” Bill groaned. “We’re stealing presents from kids. 
What kind of monsters are we?” “The entrepreneurial kind,” Molly said, pulling out a grappling hook shaped like a candy cane. “Besides, these gifts weren’t even picked out by the kids. Their parents grabbed whatever was on sale. This isn’t theft. It’s redistribution with flair.” “The bounce house is right under the window,” Bill said, eyeing the flimsy structure. “We could slide down into it, grab the gift pile near the snack table—” “No,” Molly whispered sharply. “Too many witnesses. We go in through the cake.” “The cake?” “A three-tier monstrosity topped with plastic dinosaurs. I’ve already hollowed out the middle during reconnaissance, disguised as a concerned parent with a nut allergy.” “You hollowed out a birthday cake?" “It was mostly frosting and lies. Perfect for concealment. We enter through the cake, neutralize the gift mountain, exit via the piñata drop zone. Bubbles creates a distraction involving interpretive dance and possibly a kazoo solo.” “And if we get caught?” “Then we tell them the truth,” Molly said, pulling out a tiny Santa hat and placing it on her head. “We’re *real* Santa’s subcontractors. Overworked, underpaid, and frankly tired of the whole ‘good kid’ algorithm.” Bill stared. “You’re insane.” “And you’re wearing a 'World’s Okayest Grandpa' apron under your tactical vest.” “That’s for disguises!” “Exactly. Now put on your reindeer antlers and let’s jingle all the way to financial freedom.”
created_at
"2026-03-25T10:27:57.846Z"
3/25/2026, 10:27:57 AM
model
"qwen/qwen3-235b-a22b-07-25"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
969
moderation_latency
(null)
generation_time
20320
tokens_prompt
77
tokens_completion
969
native_tokens_prompt
72
native_tokens_completion
795
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00046758
router
(null)
provider_responses
0
endpoint_id
"cf186489-d252-4fb7-aca9-87ef7b557eff"
id
"f727bce20b15ae67c9d3077980c20a6d"
is_byok
false
latency
969
model_permaslug
"qwen/qwen3-235b-a22b-07-25"
provider_name
"Novita"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1774434477-GHr3FKIpOseyVKlJfJ02"
upstream_id
"f727bce20b15ae67c9d3077980c20a6d"
total_cost
0.00046758
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Novita"
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean = false, words = 546
Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 67.03%, Deviation: 17.03%
Meta Data: neededClean = false, wordsTotal = 552, wordsDialogue = 370
Overall: 0.0000%
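The reported ratio and deviation can be reproduced from the metadata above with a little arithmetic. The following is a minimal sketch, not NC Bench's actual evaluator: the function names, the whitespace-based word definition, and the quote-matching regex are all assumptions made for illustration. The benchmark's own counting rules are not shown on this page, and the two evaluators report different totals (546 versus 552 words), which suggests they tokenize differently.

```python
import re

# Assumed dialogue definition: text between an opening and a closing
# double quote, curly or straight. Not taken from NC Bench itself.
QUOTED = re.compile(r'[“"]([^“”"]*)[”"]')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed word definition)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count the words that fall inside double-quoted spans."""
    return sum(count_words(span) for span in QUOTED.findall(text))


def ratio_and_deviation(
    dialogue: int, total: int, target: float = 0.5
) -> tuple[float, float]:
    """Return (ratio %, absolute deviation from target %), rounded to 2 places."""
    ratio = dialogue / total
    return round(ratio * 100, 2), round(abs(ratio - target) * 100, 2)


# Values reported in the evaluation details: wordsDialogue = 370, wordsTotal = 552.
print(ratio_and_deviation(370, 552))  # (67.03, 17.03)
```

With the reported counts, 370 / 552 gives 67.03%, which is 17.03 percentage points above the 50% target in the scenario. How that deviation maps to the 0.00% score is not stated on the page, so the sketch stops at the ratio and deviation.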