NC Bench
Run: 2026-05-19-01KS0V6KC7SX8YVQC3HJ4CWED2
Run Details
Model: Gemini 3.5 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 48.0s
Cost: $0.111944
Provider: Google AI Studio
Evaluation summary
Matches word count: 9.61%
Dialogue to Total Word Ratio: 40.56%
Overall: 25.0853%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly tapped the detailed blueprint of the suburban backyard with the tip of her pencil. Bill stared at the drawing helplessly, chewing thoughtfully on a piece of string cheese. "We strike at precisely two fifteen," Molly said, leaning over the map. "The cake cutting?" Bill asked, licking his sticky fingers. "Exactly," she confirmed, tapping the paper. "That is when defensive focus is at its lowest. No one watches the gift table when sugar is being distributed." "What about the target?" "Timmy is turning six. His grandparents are loaded. We are talking Lego death stars, remote-control drones, and maybe even a retro arcade cabinet." "And who is your inside source?" Bill asked, narrowing his eyes. "The juice box distributor. He sees all, Bill." "Brilliant," Bill whispered, shaking his head in awe. "But what about the perimeter security?" "The perimeter is guarded by Mrs. Higgins. She is a hover-parent of the highest order. If she spots adults without a child, we are neutralized." "I have a countermeasure for that," Bill said, pulling open a large cardboard box. "We pose as the entertainment." Molly stared at the box, her face freezing. "Please tell me you did not rent a clown suit." "I did not rent a clown suit." "Bill." "Okay, I rented a clown suit. But it is high-grade polyester, Molly!" "We are not doing the clown routine," Molly said, rubbing her aching temples. "The last time you dressed as a clown, you got stuck in a chimney." "That was a chimney-related design flaw, not a clown issue." Molly ignored his defense and sighed. "No, we stick to the bounce house strategy," she insisted, tapping the plastic castle icon. "Explain the bounce house maneuver again." "We deflate the bounce house at two ten. That sends the entire parent coalition to the rear sector to rescue the trapped children." "Is that safe?" Bill looked genuinely worried. Molly waved a hand dismissively. "They will merely slide down slowly. It is a soft-landing scenario." "And then?" 
"While they are rescuing young Timmy, I slip through the side gate, bag the loot, and toss it over the wooden fence to you." "And what am I doing at the fence?" "You are the wheelman. Did you acquire the getaway vehicle?" "Yes," Bill said, jingling a shiny set of keys. "A Honda Odyssey in champagne silver. We will blend right into the suburban ecosystem. I even put a stick-figure family sticker on the rear window." "Excellent. What about the guard dog?" "A golden retriever. Extremely friendly. I have packed high-grade peanut butter to compromise his loyalty." "Smooth or chunky?" Molly asked. "Smooth. We are professionals, Molly." "Good. We cannot afford any loose ends." Bill leaned closer, frowning at the tiny drawings. "One question, Molly. What do we do if the birthday boy catches us red-handed?" "We bribe him," she replied calmly. "With what?" "The Legos." "But the Legos are the prize!" Bill cried out, clutching his chest. "A necessary sacrifice to ensure the rest of the haul. Six-year-olds are highly susceptible to bribery. Now, put on your tactical cargo pants. It is showtime." He nodded slowly, reaching for his dark mask.
created_at: "2026-05-19T19:25:30.386Z" (5/19/2026, 7:25:30 PM)
model: "google/gemini-3.5-flash-20260519"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2517
moderation_latency: (null)
generation_time: 47834
tokens_prompt: 77
tokens_completion: 4678
native_tokens_prompt: 67
native_tokens_completion: 12427
native_tokens_completion_images: 0
native_tokens_reasoning: 11678
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
service_tier: "standard"
usage: 0.1119435
router: (null)
provider_responses
0
endpoint_id: "bd50061c-e1d8-469e-96b1-d322e77378f8"
id: "KrkMarTjILbL-8YPpa3N-AY"
is_byok: false
latency: 2441
model_permaslug: "google/gemini-3.5-flash-20260519"
provider_name: "Google AI Studio"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.15.0; linux; x64))"
http_referer: (null)
request_id: "req-1779218730-NkS4bsL018MJARqk9z7N"
session_id: (null)
api_type: "completions"
id: "gen-1779218730-o8kSSWD2zyo6HBgZFfBz"
upstream_id: "KrkMarTjILbL-8YPpa3N-AY"
total_cost: 0.1119435
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
response_cache_source_id: (null)
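The token fields in the metadata above can be combined into a few derived figures. The following is a minimal sketch, not part of the benchmark itself. It assumes that native_tokens_reasoning is counted inside native_tokens_completion and that generation_time is in milliseconds; the page states neither. The variable names are illustrative.

```python
# Values copied from the run metadata.
native_tokens_completion = 12427
native_tokens_reasoning = 11678
generation_time_ms = 47834

# Assumption: reasoning tokens are a subset of the completion tokens,
# so the remainder is the visible (non-reasoning) output.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Fraction of the completion budget spent on reasoning.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Assumption: generation_time is reported in milliseconds.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible tokens:    {visible_tokens}")
print(f"reasoning share:   {reasoning_share:.2%}")
print(f"tokens per second: {tokens_per_second:.1f}")
```

Under these assumptions, roughly 94% of the completion tokens went to reasoning, leaving 749 tokens for the scene itself.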
Evaluation details
Matches word count: 9.61%
Details: n/a
Meta Data: neededClean: false, words: 522
Dialogue to Total Word Ratio: 40.56%
Details: Ratio: 74.25%, Deviation: 4.25%
Meta Data: neededClean: false, wordsTotal: 532, wordsDialogue: 395
Overall: 25.0853%
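The ratio and deviation reported above can be reproduced from the word counts in the metadata. This is a minimal sketch under stated assumptions: the 70% target comes from the scenario text, and the deviation is taken as the absolute difference from that target. The page does not say how a deviation is converted into a score, so that step is left out. The final check only observes that the Overall figure is consistent with a plain mean of the two evaluator scores; it does not confirm that this is how the benchmark computes it.

```python
# Target from the scenario "Write 500 words with 70% dialogue".
TARGET_RATIO = 70.0  # percent

# Word counts reported by the dialogue-ratio evaluator.
words_total = 532
words_dialogue = 395

# Share of words that fall inside dialogue, in percent.
ratio = words_dialogue / words_total * 100

# Assumption: deviation is the absolute distance from the target.
deviation = abs(ratio - TARGET_RATIO)

# Observation, not a documented rule: the mean of the two rounded
# evaluator scores lands on the reported Overall value.
scores = [9.61, 40.56]
mean_score = sum(scores) / len(scores)

print(f"ratio:      {ratio:.2f}%")
print(f"deviation:  {deviation:.2f}%")
print(f"mean score: {mean_score:.3f}%")
```

Note that the two evaluators report different totals (522 and 532 words), which suggests they tokenize the response differently.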