NC Bench
Run: 2026-02-19-01KHVPC3SCP4HPZ47Z6EA53J0V
Run Details
Model: Gemini 3.1 Pro (Preview)
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 1m 22s
Evaluation summary
Score      Evaluator
100.00%    Matches word count
100.00%    Dialogue to Total Word Ratio
100.0000%  Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly unrolled the blue schematics across the hood of their recently stolen minivan. The diagrams displayed a fortified suburban backyard guarded by a distracted mother. Their target was a massive pile of brightly wrapped gifts sitting helplessly on a picnic table. Bill adjusted his black turtleneck and peered through his binoculars at the chaotic swarm of children currently terrorizing a bouncy castle. Taking toys from a baby was historically considered an amateur endeavor completely lacking prestige. However, stealing thirty unopened boxes from a heavily attended birthday bash required the utmost precision of master criminals. Molly tapped a circled zone on the physical map labeled as the primary extraction point near the petting zoo enclosure. She checked her synchronized stopwatch with professional urgency. "Did you swap the piñata candy?" Bill pulled a thick mask over his bald head. "Yes. The sugar crash happens at three." It was entirely unnecessary for a daytime raid in the wealthy neighborhood, but he always insisted on maintaining strict industry standards. A cheaply hired magician was busy distracting the oblivious parents with mediocre illusions. This provided their perfect window of opportunity before the giant chocolate cake was finally served. "Perfect. We grab the gifts and run."
Evaluation details
Result     Evaluator                     Details                          Meta Data
100.00%    Matches word count            n/a                              neededClean: false; words: 200
100.00%    Dialogue to Total Word Ratio  Ratio: 10.00%, Deviation: 0.00%  neededClean: false; wordsTotal: 200; wordsDialogue: 20
100.0000%  Overall
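The wordsTotal, wordsDialogue, Ratio, and Deviation values reported here can be reproduced approximately with a short script. The following is a minimal sketch, not the benchmark's actual evaluator: it assumes that words are whitespace-separated tokens, that dialogue is any text between straight double quotes, and that deviation is the absolute difference from the 10% target. The function name `dialogue_ratio` and its `target` parameter are hypothetical.

```python
import re


def dialogue_ratio(text: str, target: float = 0.10) -> dict:
    """Estimate the share of words that sit inside straight double quotes.

    Assumptions (not confirmed by the report): whitespace tokenization,
    straight-quote delimiters, absolute deviation from the target ratio.
    """
    words_total = len(text.split())
    # Collect every span enclosed in a pair of straight double quotes.
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target),
    }


if __name__ == "__main__":
    sample = 'Molly checked her watch. "We grab the gifts and run." Bill nodded.'
    print(dialogue_ratio(sample))
```

Applied to the three quoted lines in the model response, this approach counts 6 + 7 + 7 = 20 dialogue words, which matches the reported wordsDialogue value. How the evaluator converts the deviation into a percentage score is not stated in the report, so the sketch leaves that step out.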