So, you like the little slider for Spam in Server Admin? It’s pretty cool, but there’s a bunch of stuff you can do to ‘enhance its performance’…
So, where to start?
SpamAssassin (at the risk of serious confusion with Server Admin, hereafter known as SA), along with clamAV, is tied into the Mail service in Tiger through AMaViS (‘A’ ‘Ma’il ‘Vi’rus ‘S’canner). Let’s look at the life of a typical email…
Briefly, mail comes in and is handed off by Postfix to AMaViS, which scans the content before handing it back to Postfix, which then passes it along to Cyrus for delivery. During this scanning, a whole bunch of rules are parsed, and if the contents of the email match a particular rule, the score for that rule is added to the message’s total score. When all the rules have been parsed, the total score is compared to 3 levels defined in amavisd.conf in order to decide what happens to the email. By assigning small scores to lots of tests, we can be much more confident in the final conclusion about the message’s nature. This loosely echoes what Bayesian means: probability treated as a degree of belief.
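To make the additive scoring a bit more concrete, here is a minimal Python sketch. It is not how SA is actually implemented; the rule names, patterns and scores are invented purely for illustration.

```python
import re

# Hypothetical rules for illustration only: (name, compiled pattern, score).
RULES = [
    ("L_SUBJECT_FREE", re.compile(r"^Subject:.*Free", re.MULTILINE), 2.5),
    ("L_MANY_EXCLAIMS", re.compile(r"!{3,}"), 1.0),
    ("L_MENTIONS_INVOICE", re.compile(r"invoice", re.IGNORECASE), 0.5),
]


def total_score(message: str) -> tuple[float, list[str]]:
    """Add up the score of every rule whose pattern matches the message."""
    hits = [(name, score) for name, pattern, score in RULES if pattern.search(message)]
    return sum(score for _, score in hits), [name for name, _ in hits]


message = "Subject: Free watches!!!\n\nBuy now."
score, matched = total_score(message)
print(score, matched)
```

No single rule decides the outcome here; it is the sum across all matching rules that gets compared against the levels.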
Levels of Spam
Yes, SA actually has 3 levels for tagging messages, but only one is made accessible through Server Admin. All of this is configured in /etc/amavisd.conf:
<code> $sa_tag_level_deflt </code>
the score at which spam info (X-Virus-Scanned:, X-Spam-Status:, and X-Spam-Level:) headers are added, defaults to -999
<code> $sa_tag2_level_deflt </code>
the score at which Spam Detected headers are added to the message – this is the level that is configurable from Server Admin
<code> $sa_kill_level_deflt </code>
the level at which the value of $final_spam_destiny determines what happens to the message. By default this is set to be the same as the tag2 value.
So, with Server Admin we can set the tag2 level and decide if messages at this level are simply tagged, redirected, bounced or deleted. These four options are linked to the options for
<code> $final_spam_destiny D_PASS, D_REJECT, D_BOUNCE, D_DISCARD </code>
If this is good enough for your setup, you can stop reading here! However, having only one setting can be a problem and lots of messages with scores just above or below this value may be getting incorrectly labeled, or worse, deleted.
By making some changes to amavisd.conf we can use a multi-value model. Beware that changing the slider or the drop-down for what should happen to messages in Server Admin will overwrite any changes you have made. The following will set up SA to add header info and append a Subject warning at a score of 3 and will reject all messages with a score above 8 – I’ve never had a genuine message scoring over 8. Feel free to change the values to something more appropriate to what you’re seeing.
As always, take a backup of amavisd.conf before you start and use a test box if you can. You’ll need to use sudo to edit /etc/amavisd.conf…
<code> $final_spam_destiny = D_REJECT; </code>
<code> $sa_tag2_level_deflt = 3.0; $sa_kill_level_deflt = 8; </code>
That’s it! With D_REJECT the sender should get a delivery failure notification… I say should as most spam, of course, comes from non-existent addresses. If you want to be totally ruthless you can use D_DISCARD instead, which drops the message silently.
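If it helps to picture how the levels interact, here is a rough Python sketch of the decision using a tag2 level of 3 and a kill level of 8. It is only a model of the behaviour described here, not amavisd code, and treating each level as "at or above" is an assumption on my part.

```python
# Levels as discussed in the text; -999 is the default tag level.
SA_TAG_LEVEL = -999.0
SA_TAG2_LEVEL = 3.0
SA_KILL_LEVEL = 8.0


def action_for(score: float) -> str:
    """Return a rough description of what happens to a message at this score."""
    if score >= SA_KILL_LEVEL:  # assumption: levels apply at or above the value
        return "final_spam_destiny applies (D_REJECT here)"
    if score >= SA_TAG2_LEVEL:
        return "delivered with spam-detected headers and Subject warning"
    if score >= SA_TAG_LEVEL:
        return "delivered with spam info headers only"
    return "delivered untouched"


for s in (0.4, 3.0, 7.9, 8.0):
    print(s, "->", action_for(s))
```

The gap between 3 and 8 is the useful bit: those messages still arrive, but flagged, so users can judge them for themselves.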
Setting up SA this way gives you some freedom, and lets your users decide what is and isn’t spam. By setting up the 2 accounts – junkmail and notjunkmail – you can have your users redirect incorrectly tagged messages appropriately (spam that wasn’t tagged to junkmail, and genuine messages tagged as spam to notjunkmail).
Take a peek inside /usr/share/spamassassin and you’ll see a fair cornucopia of pre-defined rules looking at everything from malformed headers to drug references. Now, we can’t go editing any of these, as they come from the SpamAssassin project, not Apple, and will be blindly overwritten by new releases. The correct file, if you want to create your own rules, is /etc/mail/spamassassin/local.cf.
The Anatomy of a rule
A rule has 3 sections:
(i) the rule – its type, an identifier, and the regular expression
(ii) a description
(iii) a score
Let’s take a look at a simple example. It’s a good idea to prefix your own rules with L_ so you can easily see them, and to keep the name uppercase and less than 22 characters. If you’re not sure about the effect your rule might have, you can prefix the name with T_ and, as long as you don’t give it an explicit score, SA will assign a default score of only 0.01 to it, but you’ll still be able to see in the headers that the rule was used. Once you’re happy, remove the T_ and give the rule its real score.
<code> header L_SUBJECT_FREE Subject =~ /Free/
describe L_SUBJECT_FREE Block messages about Free stuff
score L_SUBJECT_FREE 10.0 </code>
This rule checks the headers for a Subject line containing the word Free (with a capital F). If it finds it, a score of 10 is added to the total score for that email.
Other options for header include: From, To, CC, Received or All.
We can also look at the body, rawbody (html tags and line breaks left intact), or URI (looking at html links in messages)
Let’s look at a slightly more useful example:
<code> body L_SUBMIT /(submit.*(search|engines?|opt-in))/i
describe L_SUBMIT Submit site to engines
score L_SUBMIT 3.0 </code>
What we’re doing here is scanning the body of the message for the occurrence of the word ‘submit’ followed by any number of characters or whitespace followed by either the word ‘search’ or ‘engine’ or ‘engines’ or ‘opt-in’. We don’t care about case either (/i). We give any such message a score of 3.
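If you want to try a pattern out before it goes anywhere near your mail server, a quick Python sketch like the one below will do. Python’s re module stands in for Perl’s regex engine here (the two behave the same for the constructs in this rule), and the sample sentences are made up.

```python
import re

# Same pattern as the L_SUBMIT rule; re.IGNORECASE plays the role of /i.
L_SUBMIT = re.compile(r"(submit.*(search|engines?|opt-in))", re.IGNORECASE)

samples = [
    "Submit your site to 500 search directories",  # submit ... search
    "We SUBMIT pages to every engine",             # submit ... engine, any case
    "Please submit the report by Friday",          # no second keyword
    "Search engines love it when you submit",      # keywords in the wrong order
]

for text in samples:
    print(bool(L_SUBMIT.search(text)), text)
```

The last sample is worth noting: the order matters, so ‘submit’ has to come before the second keyword for the rule to fire.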
This example introduces *some* of the many character searching mechanisms within Regular Expressions (regex). In this case we’re using the full stop/dot/period, the asterisk/star and the question mark. But what do they all mean??? Here’s a brief run-down…
* – preceding character can appear any number of times (inc. zero)
+ – preceding character can appear any number of times (but NOT zero times)
? – preceding character can appear zero or once only (perhaps the story title makes more sense now?)
. – matches exactly one character
And if you need to search for any of these characters literally, or for others like the @ symbol, you’ll need to prefix them with a backslash.
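Here is a small Python sketch of those four symbols, plus the backslash escape, in action. The test strings are invented, and fullmatch is used so that the whole string has to fit the pattern.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


# * : zero or more of the preceding character
print(matches(r"spa*m", "spm"), matches(r"spa*m", "spaaam"))              # True True
# + : one or more of the preceding character
print(matches(r"spa+m", "spm"), matches(r"spa+m", "spaaam"))              # False True
# ? : zero or one of the preceding character
print(matches(r"engines?", "engine"), matches(r"engines?", "enginess"))   # True False
# . : exactly one character of any kind
print(matches(r"sp.m", "sp4m"), matches(r"sp.m", "spm"))                  # True False
# \. : a literal full stop only
print(matches(r"example\.com", "example.com"), matches(r"example\.com", "exampleXcom"))  # True False
```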
Make sense? Good – that’s just the tip of the iceberg. Regex has ways of searching for:
a digit – \d
a non-digit – \D
a word character (a letter, number or the _ character) – \w
a non-word character – \W
a whitespace ‘character’ – \s
any non-whitespace character – \S
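A quick way to get a feel for these classes is to pull the matching characters out of a test string. The Python sketch below does that; the sample text is made up, and these shorthand classes behave the same way in Perl for plain ASCII text like this.

```python
import re

text = "Order 66 shipped_to\tBob!"

digits = re.findall(r"\d", text)       # every single digit
words = re.findall(r"\w+", text)       # runs of word characters (note the _)
spaces = re.findall(r"\s", text)       # spaces and the tab
non_words = re.findall(r"\W", text)    # whitespace plus the !

print(digits)
print(words)
print(spaces)
print(non_words)
```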
There’s a whole bunch more to get carried away with. If you’re feeling brave, check out this Perl Regular Expressions site or trawl through the default rules and try to work out what they’re doing – it’s a pretty good way to learn!
Anyway, once you’ve put your rule into local.cf it’s a really good idea to run spamassassin --lint to check the syntax of your file. It’s really easy to miss out a period or a ) and mess things up. The --lint argument will parse the file and report any problems. You can safely ignore the 5 lines that give error messages – they’re old SA commands that have been deprecated so the syntax is not understood.
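To show the kind of slip the lint pass is there to catch, here is a rough Python sketch that tries to compile the regex of each body rule. It is no replacement for the real lint check: it only looks at lines of the form body NAME /regex/flags, the function name is made up, and Python’s regex dialect differs from Perl’s in places.

```python
import re


def check_body_rules(config_text: str) -> list[str]:
    """Return a problem report for each body rule whose regex fails to compile."""
    problems = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        parts = line.strip().split(None, 2)
        if len(parts) != 3 or parts[0] != "body":
            continue  # only 'body NAME /regex/flags' lines are checked here
        name, expr = parts[1], parts[2]
        # Locate the /.../ delimiters; anything after the last / is flags.
        start, end = expr.find("/"), expr.rfind("/")
        if start != 0 or end <= start:
            problems.append(f"line {lineno}: {name}: missing / delimiters")
            continue
        try:
            re.compile(expr[start + 1:end])
        except re.error as exc:
            problems.append(f"line {lineno}: {name}: {exc}")
    return problems


sample = """\
body L_SUBMIT /(submit.*(search|engines?|opt-in))/i
body L_BROKEN /(submit.*(search|engines?|opt-in)/i
"""
problems = check_body_rules(sample)
for problem in problems:
    print(problem)
```

The second rule is missing a closing ), which is exactly the sort of thing that is easy to overlook by eye.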
A note on scoring: try not to think of rules only for adding scores to spam – you can just as easily write a rule with a negative score to allow messages through.
We know about blacklists – there’s a whole bunch of ’em out there (Spamhaus is a decent one) – but SA maintains its own lists, some automatically generated and others that are fixed. Let’s say you want to whitelist a particular account and never have mail from that account scanned.
One way of doing this is using the whitelist feature:
Open up amavisd.conf as root and search for whitelist_sender_re
Scroll down to the first free line and type
<code> $whitelist_sender_re = new_RE( qr'email@example\.com$'i ); </code>
This will ensure that when Steve writes to you, it won’t get blocked. You should also be able to store your whitelisted accounts in a file by uncommenting:
<code> read_hash(\%whitelist_sender, '/var/amavis/whitelist_sender'); </code>
and sticking the accounts, one-per-line, into a file called whitelist_sender in /var/amavis but I’ve not tested this yet.
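Going back to the regex form of the whitelist for a moment, it is worth checking what a pattern like that will and won’t let through. The Python sketch below uses the address from the example with the dot escaped, the end anchored and case ignored; the other sender addresses are invented for the test.

```python
import re

# Python rendering of the whitelist pattern: escaped dot, $ anchor, case-insensitive.
whitelist_sender = re.compile(r"email@example\.com$", re.IGNORECASE)

senders = [
    "email@example.com",           # the exact address
    "Email@Example.COM",           # case is ignored
    "email@example.com.evil.net",  # $ anchors the end, so this fails
    "email@exampleXcom",           # the escaped dot must be a literal '.'
    "other-email@example.com",     # no ^ anchor, so a longer local part still matches
]

results = [bool(whitelist_sender.search(s)) for s in senders]
for sender, ok in zip(senders, results):
    print(ok, sender)
```

The last case is the one to watch: without a ^ at the start, anything that merely ends with the address gets whitelisted too.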
Finally, there is a section of entries starting with $spam_lovers – here you can enter local user accounts (there are some great examples to follow in this file) that don’t need or don’t want mail scanned for spam. There’s a similar section for accounts that don’t need virus scanning. While we’re on the subject of viruses…
By default double extensions are picked up and blocked (checkthisout.pdf.exe for example). But one other thing we, as responsible sysadmins, can do is block emails containing executable attachments. In the Windows world there are some pretty long lists of what to block.
So we should count ourselves lucky that amavisd.conf already has a pretty extensive list of extensions that only require uncommenting! Take a look for $banned_filename_re and uncomment at will. You can also easily add other types into the existing lists.
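For a feel of how a double-extension check works, here is a small Python sketch. The pattern and the list of extensions are my own illustration, not the actual entries shipped in amavisd.conf, so treat it as a way to test ideas before editing the real list.

```python
import re

# Illustrative pattern only: some extension, then an executable extension at the end.
DOUBLE_EXT = re.compile(r"\.[^./]+\.(exe|vbs|pif|scr|bat|cmd|com)$", re.IGNORECASE)

filenames = ["checkthisout.pdf.exe", "holiday.JPG.scr", "report.pdf", "setup.exe"]

flagged = [name for name in filenames if DOUBLE_EXT.search(name)]
print(flagged)
```

Note that a plain setup.exe is not caught by this pattern, which is why blocking executable attachments outright is a separate step.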
Happy blocking. If you want to explore custom rules some more, the wiki here is a great place to start…