NC Bench
Run: 2026-03-25-01KMJ7XGJ6ARHCQ6VA9V6V0TWP
Run Details
Model: Qwen3 235B A22B Instruct 2507
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 8.8s
Cost: $0.000321
Provider: AtlasCloud
Evaluation summary
Score       Evaluator
0.00%       Matches word count
97.10%      Dialogue to Total Word Ratio
48.5517%    Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Moonlight glinted off the roof of the Pinwheel Party Palace, a garish dome of inflatable slides and rainbow banners flapping in the night breeze. Molly crouched behind a plastic castle turret, clutching a burlap sack large enough to contain several small ponies. Her thermal goggles revealed the enemy: two helium-filled unicorns bobbing vigilantly above the bounce house, and a sleeping clown slumped in a lawn chair near the gift table. Bill adjusted his noise-dampening oven mitts and unfurled a grappling hook made from repurposed Slinky parts and dental floss. They’d studied the layout for weeks. Kids' parties peaked at 3 p.m., but leftovers lingered until cleanup at dawn. The gifts—wrapped in dinosaur paper and glitter glue traps—were still unopened, forgotten beneath a toppled piñata shaped like a narwhal. Bill motioned with three fingers: *three minutes to rotate the night guard’s snack break*. Molly nodded, removing the batteries from a wind-up robot that had been eerily marching toward her. The air smelled of frosting and existential dread. Bill rolled a foam dodgeball toward the snack table to test security protocols. A surveillance hamster in a tiny helmet twitched its nose but remained unalarmed. They crept forward, their stealth enhanced by mismatched socks and a shared phobia of karaoke. With practiced precision, they began stuffing the sack: squirt guns, sing-a-long tablets, one very confused plush alpaca. The clown snorted in his sleep. Molly paused, holding a toy vault that played “Yakety Sax” when opened. She looked at Bill and almost smiled.
created_at
"2026-03-25T10:16:20.298Z"
3/25/2026, 10:16:20 AM
model
"qwen/qwen3-235b-a22b-07-25"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1199
moderation_latency
(null)
generation_time
8792
tokens_prompt
77
tokens_completion
407
native_tokens_prompt
72
native_tokens_completion
348
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00032064
router
(null)
provider_responses
0
endpoint_id
"aaae5602-0949-4bc9-bd77-588b4b285341"
id
"8acf8afa7fee4511ac22430353ccc221"
is_byok
false
latency
1199
model_permaslug
"qwen/qwen3-235b-a22b-07-25"
provider_name
"AtlasCloud"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1774433780-IV3fMS7KkNtq69CFZHMH"
upstream_id
"8acf8afa7fee4511ac22430353ccc221"
total_cost
0.00032064
cache_discount
(null)
upstream_inference_cost
0
provider_name
"AtlasCloud"
Evaluation details
Result: 0.00%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean = false, words = 251

Result: 97.10%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 5.86%, Deviation: 4.14%
Meta Data: neededClean = false, wordsTotal = 256, wordsDialogue = 15

Overall: 48.5517%
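The ratio and deviation reported above can be reproduced from the recorded word counts. The sketch below is illustrative only: the function names are hypothetical, the 10% target is taken from the scenario description, and it assumes the Overall figure is an unweighted mean of the two evaluator scores. That assumption is consistent with the reported 48.5517% but is not confirmed by the page. The page also does not state how the 97.10% score is derived from the deviation, so that step is not reproduced here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return the share of words inside dialogue, as a percentage."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float) -> float:
    """Return the absolute gap between the measured and target ratio."""
    return abs(ratio_pct - target_pct)


# Values recorded in the evaluation details of this run.
ratio = dialogue_ratio(words_dialogue=15, words_total=256)
deviation = deviation_from_target(ratio, target_pct=10.0)

# Assumption: Overall is the plain average of the two evaluator scores.
overall = (0.00 + 97.10) / 2

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
print(f"Overall (from rounded scores): {overall:.2f}%")
```

The two evaluators report different totals (251 and 256 words), which suggests they tokenize the response differently. The ratio here uses the 256-word total recorded by the dialogue evaluator.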