
Contents of the package:
   Required files:
      pollustop/pollustop[.exe]      - binary file
      pollustop/pollustop.sig        - signature file for integrity
      pollustop/pollustop.settings   - settings of the filter and some documentation
      pollustop/InitialSetup.pl      - sample initial setup script; use with caution.
                                       You can ask Niversoft for help with the initial setup.

   Documentation (not needed for execution):
      pollustop/README               - basic documentation of the filter
      pollustop/HISTORY              - pollustop change log
      pollustop/TERMS                - usage terms of the product
      pollustop/INSTALL              - this file.
      pollustop/VERSION              - version number of the package and the binary.

   Database (in a separate package) - provided with a pre-populated spam corpus.
   You can use it or create a new one from spam mail you may have kept.

      pollustop/data/s.str
      pollustop/data/s.int
      pollustop/data/g.str
      pollustop/data/g.int


File installation

   Extract the contents of the package into a subdirectory of the Communigate
   Pro base folder. I recommend creating a Filter subdirectory to keep all
   filters together.

   When you get a license, put the license file in the same directory.
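
   After extraction, a quick check that the required files are in place can
   save a failed helper start. The sketch below tests for the files listed
   under "Required files" at the top of this document. FILTER_DIR is an assumed
   location, not something the package dictates, and the Unix binary name is
   used (on Win32 the binary is pollustop.exe).

```shell
#!/bin/sh
# Check that the required files from the package are present after extraction.
# FILTER_DIR is an assumed location; adjust it to where you extracted the package.
FILTER_DIR="${FILTER_DIR:-/var/Communigate/Filter/pollustop}"

check_pollustop_files() {
    dir=$1
    missing=0
    # Required files as listed in "Contents of the package" (Unix binary name).
    for f in pollustop pollustop.sig pollustop.settings InitialSetup.pl; do
        if [ ! -e "$dir/$f" ]; then
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    return "$missing"
}

check_pollustop_files "$FILTER_DIR" || echo "some required files are missing"
```

   The function returns 0 when all four files exist and 1 otherwise, so it can
   also be used as a guard in a larger setup script.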

   Non-Win32 versions of this filter require the gcc3 libraries. They are
   already installed and in the library search path on most systems. On Solaris
   (SPARC and x86), you may have to add an environment variable similar to one
   of these in the Communigate Pro startup script, or anywhere else suitable:
      LD_LIBRARY_PATH=/usr/local/lib
   or LD_LIBRARY_PATH=/opt/sfw/gcc-3/lib
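
   In a Bourne-style startup script the variable also has to be exported to
   reach the filter process. A minimal sketch, assuming the libraries are in
   /usr/local/lib (GCC3_LIB is an illustrative variable name; substitute the
   directory that exists on your system):

```shell
# Prepend the gcc3 library directory to LD_LIBRARY_PATH, keeping any value
# that is already set. GCC3_LIB is an assumption; adjust it to your system.
GCC3_LIB="/usr/local/lib"
LD_LIBRARY_PATH="${GCC3_LIB}${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH
```

   The assignment and the export are kept on separate lines so that the
   fragment does not depend on the combined "export NAME=value" form.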


Helper setup

   In Communigate Pro WebAdmin, Settings/General/Helpers, create a helper entry
   with a name for the filter and the path to the binary, either relative to
   the Communigate Pro base folder or as a full path. Set the Log Level to
   All Info. Set the timeout to one or two minutes and auto-restart to
   10 seconds.

Account setup

   Pollustop has two runtime learning methods.

   "Forward" - The user forwards the message to one of the learning addresses,
               pollustop-spam@yourdomain or pollustop-good@yourdomain. This
               method is intended for POP users only. IMAP and Webmail users
               should use the "Dropbox" method, which is more accurate.

   "Dropbox" - The user drops messages that they consider incorrectly filtered
               into shared mailboxes. You can also put a managed layer between
               the user and the real drop mailbox: the user puts incorrectly
               filtered mail in a shared mailbox owned by the postmaster, and
               the postmaster moves the mail from there to the real drop
               mailbox of PolluStop.

   Create an account for PolluStop. This account needs POP access for the
   "Dropbox" method. If you don't want to use the "Dropbox" method, you can set
   some router entries to avoid creating the account.

   - Account: "pollustop@servername", with POP access
   - Router entries, All-Domain Aliases:
      pollustop-spam = pollustop@servername
      pollustop-good = pollustop@servername

   Messages sent directly to PolluStop won't reach its INBOX, and messages
   dropped in the Drop mailbox will disappear within seconds. The drop boxes
   are black holes: messages moved into them cannot be restored. Teach your
   users to copy messages instead of moving them if they want to keep them.

Server-Wide Rule setup

   In the server-wide Rules section, add a rule whose Action is ExternalFilter
   and whose Action Parameter is the name you gave the helper. You can add any
   conditions you want for the execution of this filter; however, it is better
   to use the filter's settings to control which messages are scanned
   (inbound only, domains, etc.)

Domain and Account-level rules

   Another rule is needed at the domain level or account level to direct the
   filtered mail to the correct mailboxes. Domain-level rules are easier,
   since the setup only has to be done once per domain.

   After filtering, PolluStop adds a header to the message,
   X-PolluStop-Diagnostic: ####...

   The header value contains from one to 10 "#" characters. The more #s, the
   more likely the message is spam. I recommend treating messages with 6 or
   more #s as spam.

   The rule would be:

   Header Field  is      X-PolluStop-Diagnostic: ######*
   Header Field  is not  List-Id: *
   Mark Read
   Store in    Junk Mail
   Discard

   The first condition selects all messages that have 6 or more #s.

   The second condition is optional and should be adapted to your needs. It
   excludes some types of messages from the filtering, for instance discussion
   lists. These types of messages should be excluded from filtering and from
   learning.

   The "Mark Read" action is optional. It only prevents some email clients
   from showing a "new mail" notification for each spam message received.

   The message is then stored in the Junk Mail folder (you can set another
   folder name) and discarded from the Inbox. Note that the Junk Mail folder
   must exist, or the rule will fail and the filtered mail will end up in the
   INBOX.

   You can use the AddMailbox.pl script (included in the package) to create the
   "Junk Mail" mailbox on all accounts of one domain.
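
   To see how a given message would be classified without going through the
   server rules, the header test can be reproduced from the command line. The
   following is only a sketch: the function names and the THRESHOLD variable
   are illustrative, and it assumes the header is spelled exactly as shown
   above. It counts the "#" characters on the first X-PolluStop-Diagnostic
   line found in the message headers read from standard input.

```shell
#!/bin/sh
# Print the number of "#" characters on the X-PolluStop-Diagnostic header line
# of a message read from standard input. Prints 0 if the header is absent.
pollustop_score() {
    awk '
        $0 == "" || $0 == "\r" { exit }      # blank line: end of the headers
        /^X-PolluStop-Diagnostic:/ {
            n = gsub(/#/, "#")               # gsub returns the number of matches
            print n
            found = 1
            exit
        }
        END { if (!found) print 0 }
    '
}

# Cutoff recommended in the text above; adjust to taste.
THRESHOLD=6

# Succeeds (exit status 0) when the message on standard input scores at or
# above the threshold.
is_spam() {
    [ "$(pollustop_score)" -ge "$THRESHOLD" ]
}

# Example: a message with seven #s in the header.
printf 'Subject: test\nX-PolluStop-Diagnostic: #######\n\nbody\n' | pollustop_score
```

   The example line at the end prints 7. A saved message can be checked the
   same way with "is_spam < message-file".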

PolluStop settings

   You must also configure the PolluStop settings. Open the file
   "pollustop.settings", which is in the same directory as the binary file.
   All instructions and explanations are included in the comments of the file.
   It is recommended to leave the "MATHS" section alone: it contains advanced
   tuning parameters for people who really want full control. Modification of
   these parameters is not supported by Niversoft, and they have no
   documentation beyond what is already in the file.


Initial Learning

   Before using PolluStop, you must train its database with the "good" emails
   your users normally get. Training is done from the command line. The "spam"
   database is already trained with more than 4,500 spam e-mails gathered from
   different sources. If you already have a collection of spam, you can delete
   the existing database and create yours from scratch, or add your messages
   to the existing database. The relative counts of spam and good messages are
   not important: you can have, for instance, 4,500 spam e-mails and only
   1,000 good messages. Training with 1,000 to 5,000 messages should give good
   results. Training with more brings no benefit and may make runtime training
   largely ineffective. Training with fewer may also work, but it has not been
   tested. In all cases, results may vary depending on the contents of your
   training set.

   Type pollustop -h to get all command-line switches and their documentation.

   Here's a summary of the ones you'll use for training:

   -l: enable training mode.

   -g path: Specifies a base directory in which to look for good mail, for
            example "-g /var/Communigate/Accounts"
   -s path: Specifies a base directory in which to look for spam mail, for
            example "-s /var/Communigate/Accounts/MyAccount.mdir/SPAM.mdir".
            Note that if this path is a subdirectory of a -g path, it is
            skipped during the -g scan.
   -i path: Specifies a directory tree to ignore.

   You can specify multiple -g, -s and -i flags for the same command.
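
   Put together, a training run might look like the sketch below. It only
   builds and prints the command so that the paths can be reviewed first. The
   -g and -s paths are the examples given above; IGNORE_DIR is a hypothetical
   directory used to illustrate -i, and the location of the binary is assumed
   to be the current directory.

```shell
#!/bin/sh
# Build a training command from the switches summarized above and print it.
# All paths are examples; IGNORE_DIR in particular is hypothetical.
POLLUSTOP="./pollustop"
GOOD_DIR="/var/Communigate/Accounts"
SPAM_DIR="/var/Communigate/Accounts/MyAccount.mdir/SPAM.mdir"
IGNORE_DIR="/var/Communigate/Accounts/Archive.mdir"

# -l enables training mode; -g, -s and -i may each be repeated if needed.
TRAIN_CMD="$POLLUSTOP -l -g $GOOD_DIR -s $SPAM_DIR -i $IGNORE_DIR"
echo "$TRAIN_CMD"

# When the printed command looks right, run it from the directory that holds
# the binary, for example by pasting it at the prompt.
```

   Because SPAM_DIR lies under GOOD_DIR here, it is left out of the good-mail
   scan, as described for the -s switch.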


If you have questions or inquiries regarding this filter, its installation,
or anything else, please contact info@niversoft.com.
