NC Bench
Run: 2026-05-22-01KS87H4CQV1JA41E75XRGB683
Run Details
Model: Qwen3.7 Max
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 1m 46s
Cost: $0.057610
Provider: Alibaba
Evaluation summary
Score | Evaluator
99.98% | Matches word count
95.66% | Dialogue to Total Word Ratio
97.8226% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly carefully smoothed out the crayon drawing of the Henderson residence, tapping a red marker against the inflated bouncy castle in the backyard. "The target is the dining room," Molly said. "Specifically, the pile of gifts near the punch bowl. We go in at fifteen hundred hours, right after the piñata drops." "Right after the piñata," Bill repeated, squinting at the map. "Why not during the piñata? Everyone is distracted by the flying candy and the blindfolded kid with a baseball bat." "Because the parents record the piñata on their phones. When it breaks, there is a three minute window of pure chaos. Sugar crashes, crying, and someone inevitably gets hit with a Tootsie Roll. That is our window." Bill took a sip of his lukewarm coffee and traced the route from the driveway to the back door. He stared at the heavily fortified perimeter, which consisted mostly of plastic lawn flamingos and a coiled green garden hose. The afternoon sun beat down on their parked surveillance van, making the interior feel like a convection oven. "What if the piñata is reinforced? We could be waiting twenty minutes while a kid named Brayden bounces the bat off its papier-mâché skull." "I accounted for Brayden. His father bought the cheap donkey from the discount store. It will shatter on the third strike." "What about the clown?" Bill asked. "The clown is a mercenary. If he sees us, you distract him." "How do I distract a clown? Do I challenge him to a balloon animal duel?" "Ask him to make a poodle. Clowns hate poodles because the ears always flop. While he struggles with the latex, I slip through the patio door." Molly slid a pair of heavily tinted tactical goggles across the scratched kitchen table. The rubber straps were cracked, and the left lens had a noticeable scratch. Bill picked them up, inspecting the cheap plastic frames with profound professional disappointment. "Put these on. 
The bounce house reflects a lot of sunlight, and I do not want you getting snow blindness before we even reach the wrapping paper." "Molly, they are seven year olds. The security system is a baby gate and a golden retriever named Barnaby. I feel like we are overthinking this." "Barnaby is a threat. He licks intruders, which leaves DNA on our pant legs. We wear the dog treats in the tactical vest. You throw the treats, I grab the Lego Death Star." "What if the Lego Death Star is already opened?" "Then we steal the gift cards. Little Timmy gets a lot of Visa gift cards from his grandmother. They are untraceable and fit perfectly in my shoe." Bill sighed heavily, carefully folding the crayon map and slipping it into his jacket pocket. "Fine. But if I get tackled by a toddler covered in frosting, I am leaving you behind." "Deal. Now go put on your party hat. We need to blend in." "And make sure it is the one with the polka dots."
created_at: "2026-05-22T16:15:39.43Z" (5/22/2026, 4:15:39 PM)
model: "qwen/qwen3.7-max-20260520"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1026
moderation_latency: (null)
generation_time: 106069
tokens_prompt: 77
tokens_completion: 7457
native_tokens_prompt: 76
native_tokens_completion: 7656
native_tokens_completion_images: (null)
native_tokens_reasoning: 6999
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.05761
router: (null)
provider_responses
0
endpoint_id: "6d30fdc6-f2a7-4bdf-b6ba-32353496d723"
id: "chatcmpl-bd44c24d-f86c-9bbc-a091-2b89513c0991"
is_byok: false
latency: 1026
model_permaslug: "qwen/qwen3.7-max-20260520"
provider_name: "Alibaba"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.15.0; linux; x64))"
http_referer: (null)
request_id: "req-1779466539-RTmPM0gUDfSGWJUXhB7j"
session_id: (null)
api_type: "completions"
id: "gen-1779466539-yytxkiL8MPm5QhagapCl"
upstream_id: "chatcmpl-bd44c24d-f86c-9bbc-a091-2b89513c0991"
total_cost: 0.05761
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
99.98% | Matches word count | n/a | neededClean: false, words: 498
95.66% | Dialogue to Total Word Ratio | Ratio: 70.94%, Deviation: 0.94% | neededClean: false, wordsTotal: 499, wordsDialogue: 354
97.8226% | Overall
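
The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator can be checked by hand from the wordsDialogue and wordsTotal values above. The following is a minimal sketch of that arithmetic, not NC Bench's actual evaluator code: the function names are hypothetical, the 70% target is taken from the scenario description, and the mapping from deviation to the 95.66% score is not shown on this page, so it is not reproduced here. The final line only checks that the displayed overall figure appears consistent with an unweighted mean of the two displayed scores, which is an assumption.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of total words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float) -> float:
    """Return the absolute gap, in percentage points, from the target ratio."""
    return abs(ratio_pct - target_pct)


# Values copied from the evaluation details of this run.
ratio = dialogue_ratio(words_dialogue=354, words_total=499)
deviation = deviation_from_target(ratio, target_pct=70.0)  # target assumed from the scenario

# Assumption: the overall score is an unweighted mean of the two evaluator scores.
overall_estimate = (99.98 + 95.66) / 2

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
print(f"Mean of the two displayed scores: {overall_estimate:.2f}%")
```

With the figures from this run, the sketch gives a ratio of 70.94% and a deviation of 0.94%, matching the Details column. The mean of the two displayed scores comes to 97.82%, in line with the 97.8226% overall once rounding of the displayed scores is taken into account.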