In 2015 and 2016, I awarded "Best Couple" to two open source commands or program types that, combined, make my world a better place. This year, the "Best Couple" award has become the "Best Trio," because solving the problem I set out to fix (effective server-side email sorting) took three pieces of software working together. Here's how I got everything to work using SpamAssassin, MIMEDefang, and Procmail, three common and freely available open source software packages.
The problem
To make managing my email easier, I like to sort incoming messages into a few folders (in addition to the inbox). Spam is always filed into the spam folder, and I look at it every couple of days in case something I want was marked as spam. I also sort email from a couple of other sources into specific folders. Everything else is filed into the inbox by default.
A quick word about terminology to begin: Sorting is the process of classifying email and storing it in an appropriate folder. Filters like SpamAssassin classify the email. MIMEDefang uses that classification to mark a message as spam by adding a text string to the subject line. That classification allows other software to file the email into the designated folders. I had been using these two applications, and I needed software to do that last bit, the one that does the filing.
I have several email filters set up in Thunderbird, the best client I've found for my personal needs. Both my wife and I use email filters on our computers. When we travel or use our handheld devices, those filters don't always work because Thunderbird (or any other email client with filters) must be running on my computer at home in order to perform the filtering tasks. I can set up filters on my laptop to sort email when I'm traveling, but that means I have to maintain multiple sets of filters.
There was also a technical problem I wanted to fix. Client-side email filtering relies on scanning messages after they're deposited in the inbox. For some unknown reason, sometimes the client doesn't delete (expunge) the moved messages from the inbox. This may be an issue with Thunderbird (or it may be a problem with my configuration of Thunderbird). I've worked on this problem for years with no success, even through multiple complete re-installations of Fedora and Thunderbird.
Additionally, spam is a major problem for me. I have my own email server, and I use several email addresses. I've had some of these email accounts for a couple of decades, and they have become major spam magnets. In fact, I regularly get between 1,200 and 1,500 spam emails each day (my record is just over 2,500 spam emails in a single day), and the numbers keep increasing.
To solve my problems, I needed a method for filing emails (i.e., sorting them into appropriate folders) that was server-based rather than client-based. This would solve several issues: I wouldn't need to leave an email client running on my home workstation just to perform filtering. I wouldn't have to delete or expunge messages, especially spam, from our inboxes. And I wouldn't need to configure filters in multiple locations; I'd need them in just one location, the server.
My email server
I chose Sendmail as my email server in about 1997, when I switched from OS/2 to Red Hat Linux 5, as I'd already been using it for several years at work. It's been my mail transfer agent (MTA) ever since, for both business and personal use. (I don't know why Wikipedia refers to MTA as a "message" transfer agent, when all my other references say "mail" transfer agent. The Talk tab of the Wikipedia page has a bit of discussion about this, which generated even more confusion for me.)
I've been using SpamAssassin and MIMEDefang together to score and mark incoming emails as spam, placing a known string in the subject, ####SPAM####, so that I can identify and sort junk email both as a human and with software. I use UW IMAP for client access to emails, but that isn't a factor in server-side filtering and sorting.
Yes, I use a lot of old-school software for the server side of email, but it's well-known, it works well, and I understand how to make it do the things I need it to do.
Project requirements
I believe having a well-defined set of requirements is essential before starting a project. Based on my description of the problem, I created five simple requirements for this project:
- Sort incoming spam emails into the spam folder on the server side using the identifying text that's already being added to the subject line.
- Sort other incoming emails into designated folders.
- Circumvent problems with moved messages not being deleted or expunged from the inbox.
- Keep the existing SpamAssassin and MIMEDefang software.
- Make sure any new software is easy to install and configure.
This set of goals meant that I would need a sorting program that integrates well with the components I already have.
Procmail
After extensive research, I settled on the venerable Procmail. I know: more old stuff, and pretty much unsupported these days, too. But it does what I need it to do and is known to work well with the software I'm already using. It is stable and has no known serious bugs. It can be configured for use at the system level as well as at the individual user level.
Red Hat and Red Hat-based distributions, such as CentOS and Fedora, use Procmail as the default mail delivery agent (MDA) for Sendmail, so it doesn't even need to be installed; it's already there. My server runs CentOS, so using Procmail is a real no-brainer.
In addition to delivering email, Procmail can be used to filter and sort it. Procmail rules (known as recipes) can be used to identify spam and delete it or sort it into a designated mail folder. Other recipes can identify and sort other mail as well. Procmail can be used for many other things besides sorting email into designated folders, such as automated forwarding, duplication, and much more. Those other tasks are beyond the scope of this article, but understanding sorting should give you a better understanding of how to accomplish those other tasks.
How it works
There are so many ways of using SpamAssassin, MIMEDefang, and Procmail together for anti-spam solutions that I won't go deeply into how to configure them. Instead, I'll focus on how I integrated these three packages to implement my own solution.
Incoming email processing begins with Sendmail. I added this line to my sendmail.mc configuration file:
INPUT_MAIL_FILTER(`mimedefang', `S=unix:/var/spool/MIMEDefang/mimedefang.sock, T=S:5m;R:5m')dnl
This line calls MIMEDefang as part of email processing. Be sure to run the make command after making any configuration changes to Sendmail, then restart Sendmail. (For more information, see Chapter 8 of SpamAssassin: A Practical Guide to Integration and Configuration.)
SpamAssassin can run as standalone software in some applications; however, in this environment, it is not run as a daemon. Instead, it is called by MIMEDefang, and each email is first processed by SpamAssassin to generate a spam score for it.
SpamAssassin provides a default set of rules, but you can modify the scores for existing rules, add your own rules, and create whitelists and blacklists by modifying the /etc/mail/spamassassin/local.cf file. This file can grow quite large; mine is just over 70KB and still growing.
SpamAssassin uses the set of default and custom rules and scores to generate a total score for each email. MIMEDefang uses SpamAssassin as a subroutine and receives the spam score as a return code.
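The idea of a total score built from many small rules can be illustrated with a toy Python sketch. This is not SpamAssassin code: the rule names, patterns, and scores below are all hypothetical, and real SpamAssassin rules (and its Bayesian scoring) are far richer. The sketch only shows how individual rule scores, including negative ones, add up to a single number plus a list of the rules that matched.

```python
import re

# Hypothetical toy rules: (name, compiled pattern, score).
# None of these names or scores come from SpamAssassin itself.
RULES = [
    ("TOY_FREE_MONEY", re.compile(r"free money", re.I), 3.5),
    ("TOY_ALL_CAPS_SUBJECT", re.compile(r"^Subject: [A-Z !]+$", re.M), 2.0),
    ("TOY_KNOWN_SENDER", re.compile(r"^From: .*@example\.org$", re.M), -4.0),
]


def score_message(raw):
    """Return (total score, names of matching rules) for a raw message string."""
    hits = 0.0
    names = []
    for name, pattern, score in RULES:
        if pattern.search(raw):
            hits += score
            names.append(name)
    return hits, names
```

A message that trips two positive rules accumulates both scores, while a match on the negative-scoring rule pulls the total back down, which is why no single rule decides the outcome.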
MIMEDefang is programmed in Perl, so it is easy to hack. I've hacked the last major portion of the code in /etc/mail/mimedefang-filter to provide a filtering breakdown with a little more granularity than the default. Here's how this section of the code looks on my installation (I've made significant changes to this portion of the code, so yours probably won't look much like this):
#####################################################################
# Determine how to handle the email based on its spam score and     #
# add an appropriate X-Spam-Status header and alter the subject.    #
#####################################################################
# Set required_hits in sa-mimedefang.cf to get value for $req       #
#####################################################################
if ($hits >= $req) {
    action_add_header("X-Spam-Status", "Spam, score=$hits required=$req tests=$names");
    action_change_header("Subject", "####SPAM#### ($hits) $Subject");
    action_add_part($entity, "text/plain", "-suggest", "$report\n", "SpamAssassinReport.txt", "inline");
    # action_discard();
} elsif ($hits >= 8) {
    # (branch body not shown)
} elsif ($hits >= 5) {
    # (branch body not shown)
} elsif ($hits >= 0.00) {
    # (branch body not shown)
} else {
    # If score (hits) is less than zero
    action_add_header("X-Spam-Status", "No, score=$hits required=$req tests=$names");
    # action_add_part($entity, "text/plain", "-suggest", "$report\n", "SpamAssassinReport.txt", "inline");
}
Here's the line in that code that changes the subject line of the email:
action_change_header("Subject", "####SPAM#### ($hits) $Subject");
Actually, it calls another Perl subroutine to change the subject line, using the string I want to add as an argument, but the effect is the same. The subject line now contains the string ####SPAM#### and the spam score (i.e., the variable $hits). Having this known string in the subject line makes further filtering easy.
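To make the effect on the subject line concrete, here is a minimal Python sketch of the same tagging step. It is an illustration only, not MIMEDefang code: the function name tag_subject is hypothetical, and the threshold passed as req stands in for the required_hits value.

```python
def tag_subject(subject: str, hits: float, req: float) -> str:
    """Prefix the subject with the spam marker and score when hits >= req.

    Mirrors only the first branch of the Perl filter: messages scoring
    below the threshold keep their original subject.
    """
    if hits >= req:
        return f"####SPAM#### ({hits}) {subject}"
    return subject
```

With an assumed threshold of 10.0, tag_subject("Cheap watches", 12.5, 10.0) produces a subject that starts with ####SPAM####, which is exactly the string a later sorting stage can look for.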
The modified email is returned to Sendmail for further processing, and Sendmail calls Procmail to act as the MDA.
Procmail uses global and user-level configuration files, but the global /etc/procmailrc file and individual user ~/.procmailrc files must be created. The structure of the files is the same, but the global file operates on all incoming email, while local files can be configured for each individual user. Since I don't use a global file, all of the sorting is done at the user level. My .procmailrc file is simple:
# .procmailrc file for david@both.org
# Rules are run sequentially - first match wins

PATH=/usr/sbin:/usr/bin
MAILDIR=$HOME/mail    #location of your mailboxes
DEFAULT=/var/spool/mail/david

# Send Spam to the spam mailbox
# This is my new style SPAM subject
:0
* ^Subject:.*####SPAM####
$HOME/spam

# Political stuff goes here. Must be using my political email address
:0
* ^To:.*political
$HOME/Political

# SysAdmin stuff goes here. Usually system log messages
:0
* ^Subject:.*(Logwatch|rkhunter|Anacron|Cron|Fail2Ban)
$HOME/AdminStuff

# drops messages into the default box
:0
* .*
$DEFAULT
Note that the .procmailrc file must be located in my email account's home directory on the email server, not in the home directory on my workstation. Because most email accounts are not login accounts, they use the nologin program as the default shell, so an admin must create and maintain these files. The other option is to change to a login shell, such as Bash, and set passwords so that knowledgeable users can log in to their email accounts on the server and maintain their .procmailrc files.
Each Procmail recipe begins with :0 (yes, that is a zero) on the first line and contains a total of three lines. The second line begins with * and contains a conditional statement consisting of a regular expression (regex) that Procmail compares to each header line of the incoming email (by default, conditions are matched against the headers, not the body). If there is a match, Procmail sorts the email into the folder specified by the third line. The ^ symbol denotes the beginning of the line when making the comparison.
The first recipe in my .procmailrc file sorts the spam identified in the subject line by MIMEDefang into my spam folder. The second recipe sorts political email (identified by a special email address I use for my volunteer work for various political organizations) into its own folder. The third recipe sorts the large amount of system emails I receive from the many computers I deal with into a mailbox for my system administrator duties. This setup makes those emails very easy to find.
Note the use of parentheses to enclose a list of strings to match. Each string is separated by a vertical bar, aka the pipe ( | ), which is used as a logical "or." So the conditional line
* ^Subject:.*(Logwatch|rkhunter|Anacron|Cron|Fail2Ban)
reads, "if the Subject line contains Logwatch or rkhunter or … or Fail2Ban." Since Procmail ignores case, there is no need to create recipes that look for various combinations of upper and lower case.
The last recipe drops all email that doesn't match another recipe into the default folder, usually the inbox.
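The first-match-wins behavior of the recipes above can be approximated in a few lines of Python. This is a rough sketch, not a reimplementation of Procmail: the function name choose_folder and the RECIPES table are hypothetical, Python's re module is not identical to Procmail's regex dialect, and folded (multi-line) headers are ignored. It assumes matching against the header block only and case-insensitive comparison, as described in the text.

```python
import re

# (header regex, destination folder) pairs mirroring the three sorting recipes.
RECIPES = [
    (r"^Subject:.*####SPAM####", "spam"),
    (r"^To:.*political", "Political"),
    (r"^Subject:.*(Logwatch|rkhunter|Anacron|Cron|Fail2Ban)", "AdminStuff"),
]
DEFAULT_FOLDER = "inbox"  # stands in for the catch-all final recipe


def choose_folder(raw_message: str) -> str:
    """Return the folder of the first recipe whose regex matches a header line."""
    # Everything before the first blank line is the header block.
    headers = raw_message.split("\n\n", 1)[0]
    for pattern, folder in RECIPES:
        # MULTILINE lets ^ anchor at the start of each header line.
        if re.search(pattern, headers, re.IGNORECASE | re.MULTILINE):
            return folder
    return DEFAULT_FOLDER
```

Because the spam recipe comes first, a message tagged by MIMEDefang lands in the spam folder even if it would also match a later recipe, and a lowercase subject such as "logwatch for host1" still reaches AdminStuff.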
Having the .procmailrc file in my home directory does not, by itself, cause Procmail to filter my mail. I have to add one more file, the following ~/.forward file, which sends all of my incoming email through Procmail:
# .forward file
# process all incoming mail through procmail - see .procmailrc for
# the filter rules.
|/usr/bin/procmail
It is not necessary to restart either Sendmail or MIMEDefang when creating or modifying the Procmail configuration files.
For more detail about the configuration of Procmail and creation of recipes, see the SpamAssassin book and the Procmail information in the RHEL Deployment Guide.
A few additional notes
Note that MIMEDefang must be started first, before Sendmail, so it can create the socket where Sendmail sends emails for processing. I have a short script (automate everything!) that I use to stop and restart Sendmail and MIMEDefang in the correct order so that new or modified rules in the local.cf file take effect.
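The ordering constraint can be sketched as follows. This is not my actual script; it is a hypothetical Python illustration that assumes a systemd-based system and assumes the service units are named sendmail and mimedefang, which may differ on your distribution. By default it only prints the commands (a dry run) rather than executing them.

```python
import subprocess

# Assumed systemd unit names; adjust for your distribution.
MTA_UNIT = "sendmail"
MILTER_UNIT = "mimedefang"


def restart_plan():
    """Return the commands in the order they should run."""
    return [
        ["systemctl", "stop", MTA_UNIT],        # stop the MTA first
        ["systemctl", "restart", MILTER_UNIT],  # MIMEDefang recreates its socket
        ["systemctl", "start", MTA_UNIT],       # the MTA can now find the socket
    ]


def run(plan, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for cmd in plan:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(restart_plan())
```

Keeping the plan as plain data makes the order easy to check before anything is run with dry_run=False, which requires root privileges.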
I already have a large body of rules and score modifiers in my SpamAssassin local.cf file, so although I could have used Procmail by itself for spam filtering and sorting, it would have taken a lot of work to convert all of those rules. I also think SpamAssassin does a better job of scoring because it doesn't rely on a single rule to match, but rather on the aggregate score from all the rules, as well as scores from Bayesian filtering.
Procmail works very well when matches can be made very explicit with known strings, such as the ones I've configured MIMEDefang to place in the subject line. I think Procmail works better as a final sorting stage in the spam-filtering process than as a complete solution by itself. That said, I know that many admins have made complete spam filtering solutions using nothing more than Procmail.
Now that I have server-side filtering in place, I'm significantly less limited in my choice of email clients, because I no longer need a client that performs filtering and sorting. Nor do I need to leave an email client running all the time to perform that filtering and sorting.
Reports of Procmail's death are greatly exaggerated
In my research for this article, I found a number of Google results (dating from 2001 to 2013) that declared Procmail to be dead. Evidence includes broken web pages, missing source code, and a sentence on Wikipedia that declares Procmail to be dead and links to newer replacements. However, all Red Hat, Fedora, and CentOS distributions install Procmail as the MDA for Sendmail. The Red Hat, Fedora, and CentOS repositories all have the source RPMs for Procmail, and the source code is also on GitHub.
Considering Red Hat's continued use of Procmail, I have no problem using this mature software that does its job silently and without fanfare.