Tagging Spam E-Mail
with SpamAssassin™

Overview

Spam, unsolicited electronic mail typically of a commercial and/or pornographic nature, has become a major issue over the last few years. To help combat the spread of spam, Library and Technology Services has enabled spam tagging using SpamAssassin. By default, SpamAssassin uses a pre-configured set of rules to determine whether your incoming mail is spam. These default preferences can be overridden to suit your own email filtering requirements.

Below is an example of how spam normally appears in your mailbox. Note the addition of “[SPAM]” before the original subject description.

A preview of the e-mail displays text indicating that the received message was tagged as spam, and gives an explanation of why. The original message is saved as an attachment.

[Image: spam_msg, an example of a message tagged as spam]

SpamAssassin, by itself, will not delete spam mail. It simply determines the likelihood that incoming mail is spam and marks it prior to arriving in your mail client. Most mail programs do not have a very sophisticated way to make this determination, thus the need for SpamAssassin to scan your mail. You need to configure your mail program to use a set of “filters” or “rules” to determine what action to take if spam is detected. Many people choose to re-direct spam to go to a special folder for later review. By combining SpamAssassin and your mail client’s rules, you will have very thorough control of spam.

NOTE: LTS does not recommend sending spam mail directly to the trash folder, because legitimate mail is sometimes mistakenly identified as spam. Instead, it is recommended that you set up a folder specifically for spam and periodically check its contents. Consult your mail program documentation for information about how to create folders and use filters.
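
The kind of rule a mail program applies can be sketched as follows. This is only an illustration in Python, not how any particular mail client works internally: the folder names "Spam" and "Inbox" and the function choose_folder are hypothetical, and the rule assumes the “[SPAM]” subject prefix described above.

```python
from email.message import EmailMessage

SPAM_TAG = "[SPAM]"    # subject prefix added to tagged mail (see above)
SPAM_FOLDER = "Spam"   # hypothetical review folder
INBOX = "Inbox"        # hypothetical default folder


def choose_folder(message: EmailMessage) -> str:
    """Return the folder a client-side rule would file this message in."""
    subject = str(message.get("Subject", ""))
    if subject.startswith(SPAM_TAG):
        # File tagged mail for later review instead of deleting it.
        return SPAM_FOLDER
    return INBOX


tagged = EmailMessage()
tagged["Subject"] = "[SPAM] Free offer inside"

normal = EmailMessage()
normal["Subject"] = "Meeting notes"

print(choose_folder(tagged))
print(choose_folder(normal))
```

The rule moves tagged mail to a review folder rather than the trash, in line with the recommendation above.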

Generally speaking, SpamAssassin uses two methods to determine if mail is spam:

Whitelist / Blacklist – Specifies lists of addresses for which mail should always be considered acceptable mail (whitelist) or always be considered spam mail (blacklist).

Content Test Scoring – Analyzes the mail header and body contents and assigns a weighted score on its findings. If the cumulative test score exceeds a threshold level, the mail is flagged as spam.

This document covers only the basic configuration and use of SpamAssassin. For complete configuration details see the document found at: http://ns2.cc.lehigh.edu/spamassassin-conf.html, and the SpamAssassin built-in test list found at: http://spamassassin.org/tests.html.


Configuring Mail Filters

Instructions for configuring spam mail filters for supported email programs can be found at the following locations:

Filtering Spam with Netscape 4.7x (for PC's or Mac; graphics may vary)
Filtering Spam with Mozilla 1.1x / Netscape 7 (for PC's or Mac; graphics may vary)
Filtering Spam with IMP


Whitelist and Blacklist

A whitelist is used to specify email addresses which send you mail that is often tagged incorrectly as spam. By including an address or domain in the whitelist, you indicate that all mail from that sender is acceptable, and should never be labeled as spam.

EXAMPLE 1:
Mail from a colleague keeps getting flagged as spam. To prevent any mail from him being marked as spam, add the following line to your SpamAssassin configuration file:

whitelist_from myfriend@university.edu

EXAMPLE 2:
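Because of its content, a weekly newsletter you subscribe to from newsletter.com is always flagged as spam, even though you want to receive it as legitimate email. To prevent any mail from that domain being marked as spam, add the following line to your SpamAssassin configuration file: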

whitelist_from *@newsletter.com

A blacklist is exactly the opposite of a whitelist. It specifies email addresses that should, under all circumstances, be flagged as spam.

EXAMPLE 1:
A former co-worker, myoldfriend@hotmail.com, keeps sending you unwanted messages, but you have not been able to get him to stop sending them to you. To mark all mail from this user as spam, add the following line to your SpamAssassin configuration file:

blacklist_from myoldfriend@hotmail.com

EXAMPLE 2:
You visited a web site www.sendmorespam.org, and now you keep getting unsolicited e-mail from numerous people at that site. To mark mail from any user at this site as spam, add the following line to your SpamAssassin configuration file:

blacklist_from *@sendmorespam.org


Specifying List Addresses

You can specify addresses in a number of ways, including using wildcards:

An individual: friend@somewhere.com - Mail from friend@somewhere.com

An entire domain: *@isp.com - Mail sent from any address at isp.com

NOTE: Only the * (many characters) and ? (single character) wildcard characters are supported in SpamAssassin.
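
To show how such patterns behave, here is a minimal Python sketch of file-glob-style address matching. It is not SpamAssassin's own code. The helper names glob_to_regex and address_matches are hypothetical, and the sketch assumes that * stands for any run of characters, that ? stands for exactly one character (the usual file-glob convention), and that matching ignores case.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a file-glob-style address pattern into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # any run of characters
        elif ch == "?":
            parts.append(".")            # exactly one character (assumed)
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.IGNORECASE)


def address_matches(pattern: str, address: str) -> bool:
    """Return True if the whole address matches the pattern."""
    return glob_to_regex(pattern).fullmatch(address) is not None


print(address_matches("friend@somewhere.com", "friend@somewhere.com"))
print(address_matches("*@isp.com", "anyone@isp.com"))
print(address_matches("*@isp.com", "anyone@other.com"))
```

Because the whole address must match, a pattern such as *@isp.com does not accept an address that merely contains "isp.com" somewhere in the middle.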

Weighted Test Scoring

Incoming mail is processed by SpamAssassin using a set of built-in tests. Each test adds its score to a running total for individual mail messages. If the overall score exceeds the default value of 5 points, SpamAssassin considers it spam. Generally speaking, the default settings for SpamAssassin do a pretty good job of knowing what is and isn’t spam.

The SpamAssassin configuration file uses the following format to configure or modify weighted scoring:

score TEST_NAME n.nn

The value of TEST_NAME is one of close to 1,000 unique tests configurable in the SpamAssassin software. A description of the tests and their default point value are listed on the SpamAssassin web site at: http://spamassassin.org/tests.html. Although many of these tests are cryptic, there are some basic ones that you may want to configure.

EXAMPLE:
You receive a great many unsolicited offers of FREE items from many different vendors, and you don’t want any of them. SpamAssassin can look for the word “Free” in the subject line and assign a high score to this type of mail.

The test SUB_FREE_OFFER looks at the subject line, and if it begins with the word “Free,” it adds a default point value of 0.339 to the message’s score. If you wanted ALL email whose subject line starts with “Free” to be flagged as spam, you could raise the test’s score above the 5 point limit using the following line:

score SUB_FREE_OFFER 10.0
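
The arithmetic behind this can be sketched in a few lines of Python. This is only an illustration of cumulative scoring as described above, not SpamAssassin's implementation: the test EXAMPLE_OTHER_TEST and its value are invented for the example, and only the SUB_FREE_OFFER default of 0.339 and the 5 point threshold come from this document.

```python
DEFAULT_SCORES = {
    "SUB_FREE_OFFER": 0.339,    # default value quoted in this document
    "EXAMPLE_OTHER_TEST": 1.2,  # hypothetical test, for illustration only
}
REQUIRED_SCORE = 5.0            # default threshold described above


def total_score(hits, overrides=None):
    """Sum the scores of every test that matched, applying user overrides."""
    scores = {**DEFAULT_SCORES, **(overrides or {})}
    return sum(scores[name] for name in hits)


def is_spam(hits, overrides=None):
    """A message is flagged when its total exceeds the threshold."""
    return total_score(hits, overrides) > REQUIRED_SCORE


hits = ["SUB_FREE_OFFER", "EXAMPLE_OTHER_TEST"]

print(is_spam(hits))                              # default scores
print(is_spam(hits, {"SUB_FREE_OFFER": 10.0}))    # with the override
```

With the default scores, the two matching tests together stay well under 5 points, so the message is not flagged. Raising SUB_FREE_OFFER to 10.0 puts any message that triggers it over the threshold on that test alone.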


Configuring SpamAssassin at Lehigh

By default, spam flagging is enabled on all Lehigh mail accounts. To modify your SpamAssassin configuration, go to the Lehigh Account Page and choose “Configure SPAM Tagging” under the “Electronic Mail Functions.” The text below shows the default configuration profile for Lehigh Accounts.

# SpamAssassin user preferences file. See 'perldoc Mail::SpamAssassin::Conf'
# for details of what can be tweaked.
###########################################################################

# How many hits before a mail is considered spam.
# required_hits 5

# Whitelist and blacklist addresses are now file-glob-style patterns, so
# "friend@somewhere.com", "*@isp.com", or "*.domain.net" will all work.
#
# whitelist_from someone@somewhere.com
# blacklist_from someone@somewhere.com


# Uncomment "report_safe 0" below and incoming spam will only be modified
# by adding some headers, and no changes will be made to the body.
# report_safe 0


# Speakers of Asian languages, like Chinese, Japanese and Korean, will almost
# definitely want to uncomment the following lines. They will switch off some
# rules that detect 8-bit characters, which commonly trigger on mails using CJK
# character sets, or that assume a western-style charset is in use.
#
# score HEADER_8BITS 0
# score HTML_COMMENT_8BITS 0
# score SUBJ_FULL_OF_8BITS 0
# score UPPERCASE_25_50 0
# score UPPERCASE_50_75 0
# score UPPERCASE_75_100 0

You will notice that this default configuration file has all lines disabled with the comment mark “#”. The lines in the default profile are simply to give you a guide to what the profile should look like as well as some typical user preferences. By un-commenting lines or adding your own settings, you can change the way SpamAssassin operates.
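
The effect of the comment mark can be illustrated with a small Python sketch that lists the settings actually in force. It is a simplified reading of the file, not the parser SpamAssassin uses: the function active_settings is hypothetical, and it only handles lines that are blank or that begin with “#”.

```python
def active_settings(prefs_text):
    """Return (directive, value) pairs for every line that is not commented out."""
    settings = []
    for raw in prefs_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and commented lines have no effect
        directive, _, value = line.partition(" ")
        settings.append((directive, value.strip()))
    return settings


default_profile = """\
# How many hits before a mail is considered spam.
# required_hits 5

# report_safe 0
"""

# Un-commenting a line means removing the leading comment mark.
edited_profile = default_profile.replace("# report_safe 0", "report_safe 0")

print(active_settings(default_profile))
print(active_settings(edited_profile))
```

The default profile produces an empty list because every line is commented out, while the edited profile reports report_safe 0 as its single active setting.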


Report Safe

By default, SpamAssassin moves the original message into an attachment. This keeps potentially dangerous scripts or viruses in the message from running when it is previewed, and makes it easy to see that a message has been tagged as spam before opening it. This behavior is controlled with the report_safe preference, which has three options described here:

report_safe 0 – Don't Use Attachments: the message body is unchanged; only some headers are added
report_safe 1 – Use Attachments: the original message is encapsulated in its original format and added as an attachment (default)
report_safe 2 – Use Text-only Attachments: the original message is converted to text and added as an attachment
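
The difference between the three options can be sketched with Python's standard email package. This is a rough illustration under stated assumptions, not what SpamAssassin actually emits: the function apply_report_safe, the report text, and the use of an X-Spam-Flag header to stand in for "some headers" are all choices made for this example.

```python
import copy
from email.message import EmailMessage


def apply_report_safe(original, mode, report="This message was tagged as spam."):
    """Return a tagged copy of `original` according to the report_safe mode."""
    if mode == 0:
        # Body untouched; only a header is added (header name is illustrative).
        tagged = copy.deepcopy(original)
        tagged["X-Spam-Flag"] = "YES"
        return tagged

    wrapper = EmailMessage()
    wrapper["Subject"] = str(original.get("Subject", ""))
    wrapper.set_content(report)  # explanation shown in the preview
    if mode == 1:
        # Attach the original in its original format (message/rfc822).
        wrapper.add_attachment(original)
    elif mode == 2:
        # Attach the original as plain text only.
        wrapper.add_attachment(original.as_string())
    else:
        raise ValueError("report_safe must be 0, 1, or 2")
    return wrapper


original = EmailMessage()
original["Subject"] = "Free offer"
original.set_content("Buy now")

for mode in (1, 2):
    attachment = apply_report_safe(original, mode).get_payload()[1]
    print(mode, attachment.get_content_type())
```

In modes 1 and 2 the reader first sees the short report, and the original message sits in the attachment, either as a complete message or as plain text.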

Other Settings

Other settings can be enabled, disabled, or modified in the user preferences to suit your particular mail filtering requirements. There is a link on the SpamAssassin Preferences page to take you to a web site which can help you create a custom SpamAssassin profile. You will need to cut and paste the profile it creates into the Lehigh configuration box manually. When you have made your changes, click on the [Update user_prefs] button.

EXAMPLE:
required_hits 5
whitelist_from george.w.bush@whitehouse.gov
whitelist_from dailynews@cnn.com
whitelist_from *@lehigh.edu
blacklist_from *@aol.com
blacklist_from wgates@microsoft.com
report_safe 2
score SUB_FREE_OFFER 10
rewrite_subject 1
subject_tag *You've Got Spam*

This example sets the required point value to 5 points to flag incoming mail as spam. All mail from george.w.bush@whitehouse.gov, dailynews@cnn.com, and any address at Lehigh will always be treated as valid mail (not spam). Mail from anyone at AOL and from wgates@microsoft.com will always be considered spam. Mail whose subject line begins with “Free” is scored high enough to be flagged as spam. Mail considered spam will have “*You’ve Got Spam*” added to its subject line, and the original content will be converted to text and added as an attachment.

Resetting to Default Values

If you make a mistake and have to reset to the original settings, return to the Lehigh Account Page. Go into “Configure SPAM Tagging” and click the [Reset user_prefs] button.


Configuration Tools

If you find yourself making many configuration changes to your SpamAssassin user preferences, there are some tools which can assist you.

A link on the Lehigh SpamAssassin Preferences page can help you build new user preferences using a web-based form.

Another useful tool is the free software available for download from Permission Technologies, located on the Internet at: http://www.cleanmymailbox.com/sauptool/. This tool allows Windows users to create a SpamAssassin preference file using a form. The tool supports many preferences for SpamAssassin, including the simple creation of whitelists and blacklists. The text file output can be manually cut and pasted into your SpamAssassin preferences.


If you have any questions about using SpamAssassin, please contact the LTS Helpdesk at x8HELP (610-758-4357).


Last updated: Tuesday, 23-Dec-2008 02:33:50 EST