NC Bench
Run: 2026-05-22-01KS8671N0MGK5Z7CKH4RKARTC
Run Details
Model: Qwen3.7 Max
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 6m 26s
Cost: $0.089245
Provider: Alibaba
Evaluation summary
Score | Evaluator
100.00% | Matches word count
100.00% | Dialogue to Total Word Ratio
100.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly unrolled the crinkled blueprint across the hood of their rusted sedan. The suburban cul de sac was quiet, completely unaware of the criminal masterminds plotting outside the neighborhood boundaries. She tapped a greasy finger against the handwritten tactical diagrams, tracing a route past the inflatable obstacles and the sprinkler system. Bill leaned against the passenger door, nervously chewing on a plastic straw while staring at the brightly colored house down the street. He adjusted his dark sunglasses, trying to look like a hardened criminal rather than a guy who still lived with his mother. The afternoon sun beat down on them, highlighting the sheer absurdity of their elaborate planning phase. "The bouncy castle is the main choke point," Molly said. "What about the clown making balloon animals?" Bill asked. "He takes a smoke break at three. We go in through the petting zoo," Molly explained. "A petting zoo? I am highly allergic to alpacas, Molly," Bill complained. "Wear a tactical mask. We grab the gifts from the patio table," Molly whispered. Molly checked her digital watch, syncing the time with the atomic clock application on her phone. She believed precision was the hallmark of every great thief, even when the target was a backyard celebration filled with screaming toddlers and sugary snacks. Bill shifted his weight, kicking a stray pebble into the gutter. He looked at the detailed map, noting the carefully marked escape routes and the designated safe zones near the garden gnomes. "Are we really stealing from a seven year old kid?" Bill wondered. "Timmy is a spoiled brat. He gets two identical drones. We take one," Molly stated. "Just one drone? That hardly covers our gas money," Bill argued. "We are professionals, Bill. Not monsters. Leave the Lego alone," Molly insisted. "Fine. But if a goat eats my blueprint, I am quitting," Bill sighed. 
The sheer volume of wrapping paper inside that house represented a small fortune in retail value, assuming they could fence the electronic toys on the black market. Molly folded her arms, projecting an aura of absolute authority. She had spent three weeks observing the family, learning their routines and mapping out the hidden blind spots in their security cameras. "Focus. The pinata drops at three fifteen. That is our distraction," Molly commanded. "I hate pinatas. They remind me of my childhood," Bill muttered. "Cry later. Grab the duffel bags and watch for the juice boxes," Molly ordered. "And do not forget the grappling hook for the treehouse entry point today, sir," Bill added. Bill wiped a bead of sweat from his forehead, suddenly realizing the immense pressure of their impending operation. The neighborhood watch was notoriously strict, and getting caught by a suburban mother would be a fate worse than prison. Molly handed him a pair of black gloves, ensuring they were ready for the physical demands of the heist. She carefully checked the perimeter one last time before giving the final signal to advance toward the tall brown wooden privacy fence.
created_at: "2026-05-22T15:56:09.538Z" (5/22/2026, 3:56:09 PM)
model: "qwen/qwen3.7-max-20260520"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 881
moderation_latency: (null)
generation_time: 177099
tokens_prompt: 77
tokens_completion: 9615
native_tokens_prompt: 76
native_tokens_completion: 11874
native_tokens_completion_images: (null)
native_tokens_reasoning: 11240
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
service_tier: (null)
usage: 0.089245
router: (null)
provider_responses:
  0:
    endpoint_id: "6d30fdc6-f2a7-4bdf-b6ba-32353496d723"
    id: "chatcmpl-fb45b62c-2029-98c3-b8e1-9438ef066c39"
    is_byok: false
    latency: 881
    model_permaslug: "qwen/qwen3.7-max-20260520"
    provider_name: "Alibaba"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.15.0; linux; x64))"
http_referer: (null)
request_id: "req-1779465369-SPJsv9F4OXX23mK7qe5n"
session_id: (null)
api_type: "completions"
id: "gen-1779465369-ujfhSDIjA5VN7YypYs3f"
upstream_id: "chatcmpl-fb45b62c-2029-98c3-b8e1-9438ef066c39"
total_cost: 0.089245
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false, words: 500
100.00% | Dialogue to Total Word Ratio | Ratio: 30.00%, Deviation: 0.00% | neededClean: false, wordsTotal: 500, wordsDialogue: 150
100.0000% | Overall
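The two evaluators report a total word count and a dialogue-to-total ratio. The page does not show how NC Bench computes them, so the following is only a rough sketch of one plausible approach. It assumes that words are whitespace-separated tokens and that dialogue is any text between straight double quotes. The function names and the 0.30 target default are illustrative, not taken from the benchmark's code.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed definition of a 'word')."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue marker)."""
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in quoted_spans)


def dialogue_ratio(text: str) -> float:
    """Dialogue words divided by total words; 0.0 for empty text."""
    total = count_words(text)
    return count_dialogue_words(text) / total if total else 0.0


def deviation_from_target(ratio: float, target: float = 0.30) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


if __name__ == "__main__":
    # Figures reported above: wordsDialogue = 150, wordsTotal = 500.
    reported_ratio = 150 / 500
    print(f"Ratio: {reported_ratio:.2%}, "
          f"Deviation: {deviation_from_target(reported_ratio):.2%}")
```

With the reported figures (150 dialogue words out of 500), this yields a ratio of 30.00% and a deviation of 0.00%, matching the Details column. A real evaluator may tokenize differently, for example in how it treats curly quotes or hyphenated words, so counts on other texts could differ.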