Courier::Filter::Overview - Architectural and administrative overview of
Courier::Filter
Courier::Filter is a purely Perl-based mail filter framework for the Courier
MTA.
Courier offers an interface for daemon-style processes to act as mail filters,
called courierfilters. For every incoming mail message, right after the DATA
command in the SMTP transaction phase has completed, Courier calls every
registered mail filter through a UNIX domain socket the filter is listening
on, and feeds it the file names of the incoming message and one or more
control files. The mail filter processes the message and its control file(s),
and returns an SMTP-style status response. If the status response is a
positive ("2xx") one, Courier accepts the message. Otherwise,
Courier rejects the message using the returned response code and text. For
details about the courierfilter interface, see courierfilter.
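As a minimal illustration of the accept/reject rule just described, the following self-contained sketch classifies an SMTP-style status response. The function name "courier_verdict" is hypothetical and is not part of Courier or Courier::Filter; distinguishing temporary ("4xx") rejections follows general SMTP convention rather than anything stated above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Courier or Courier::Filter): classify an
# SMTP-style status response such as "200 Ok" or "550 No fuzzybuzzy, please!".
sub courier_verdict {
    my ($response) = @_;

    # Extract the three-digit status code at the start of the response.
    my ($code) = $response =~ /^(\d{3})(?:[ -]|$)/
        or return 'malformed';

    return 'accept'           if $code =~ /^2/;   # positive ("2xx") response
    return 'reject-temporary' if $code =~ /^4/;   # general SMTP convention
    return 'reject';                              # anything else is a rejection
}

print courier_verdict('200 Ok'), "\n";
print courier_verdict('550 No fuzzybuzzy, please!'), "\n";
```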
Courier::Filter implements the courierfilter interface as a framework for mail
filter modules that frees modules from the duties of creating and handling the
UNIX domain sockets, waiting for connections from Courier, and reading and
parsing message and control files. Thus, authors of filter modules can
concentrate on writing the actual filter logic without having to care about
things that can easily be abstracted and can be performed by the framework.
Courier::Filter allows multiple filter modules to be installed. Filter modules
can be stacked and grouped hierarchically, and a module's polarity can even be
reversed, so that some modules can be used for explicitly accepting messages
while others are used in the traditional way for rejecting messages.
There are some alternative implementations of the courierfilter interface:
- Writing your own standalone courierfilter
- If you need an ultra-high performance mail filter, writing
a standalone courierfilter in C/C++ is a good choice. You will have
maximum freedom for optimizing your filter for performance and resource
consumption. But regardless of which language you use, you will have to
implement all the UNIX domain socket and connection handling and message
and control file processing yourself. And you don't get the modularity and
module grouping capabilities for free either.
- courierperlfilter
- Courier ships with a sample Perl-based courierfilter called
courierperlfilter. It is a C executable that employs Perl embedding
(see perlembed) to execute a Perl script for every incoming message, which
is about as performant as the purely Perl-based Courier::Filter. But for
every Perl-based courierfilter you want to run, you have to use a separate
instance of courierperlfilter, or implement your own modularity and module
grouping. Also, the included template Perl script is not very modular in
itself, and Courier::Filter's message and control file parsing features
are missing.
- pythonfilter
- pythonfilter by Gordon Messmer is a purely
Python-based, modular, threaded courierfilter framework, similar to
Courier::Filter. If you primarily speak Python, this is clearly your
choice. pythonfilter also provides infrastructure to filter modules for
modifying messages, even with versions of Courier prior to 0.57.1, which
did not directly allow global mail filters to modify messages. As of
version 1.1, pythonfilter supports only a linear topology in the
configuration of its filter modules.
First, Courier::Filter (of course) and the filter modules that you plan to use
for filtering your incoming mail need to be installed somewhere in your Perl
include path (see the last lines of `perl -V`). You may also need to adjust
the Courier::Config class (in "Courier/Config.pm") to reflect
your system's paths. As for the filter modules, you can either use prepared
ones (see "Bundled Courier::Filter modules" for a list of modules
that come with Courier::Filter), or you can write your own (see "How
filter modules work").
Second, you need to create a configuration file for Courier::Filter.
Courier::Filter usually looks for it at
"/etc/courier/filters/courier-filter-perl.conf" (see Courier::Config
on how to configure that). This file is a Perl snippet that must
"use" the filter modules you want to use, and then fill in the
$options global variable with the desired configuration options, instantiating
filter modules and loggers as required.
If you plan to use non-ASCII string literals in your configuration file, it
should be encoded in UTF-8 (which is the native internal character encoding of
Perl 5.8+ and Courier::Filter), and if it is, it must declare "use utf8". (It
is possible for the configuration file to be encoded differently, but you
still must explicitly specify the encoding used; see encoding for how to do
that.)
For example, this is what a simple configuration file could look like:
use utf8;
use Courier::Filter::Logger::File;
use Courier::Filter::Module::Header;
$options = {
    logger => Courier::Filter::Logger::File->new(
        file_name => '/var/log/courier-filter-perl.log',
        timestamp => 1
    ),
    modules => [
        Courier::Filter::Module::Header->new(
            fields => {
                subject => qr/fuzzybuzzy/
            },
            response => 'No fuzzybuzzy, please!'
        )
    ]
};
These options will be used when creating the "Courier::Filter" object.
For a detailed explanation of supported configuration options and how filter
modules can be grouped, even hierarchically, see "new()" in Courier::Filter.
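As a rough sketch of what a grouped configuration might look like, the fragment below nests two "Header" modules in one group. It rests on two assumptions that should be checked against "new()" in Courier::Filter and "new()" in Courier::Filter::Module: that a nested array reference inside "modules" forms a module group, and that a module's polarity is reversed with an "inverse" option. The "[trusted-list]" subject tag is purely illustrative.

```perl
use utf8;

use Courier::Filter::Logger::File;
use Courier::Filter::Module::Header;

$options = {
    logger  => Courier::Filter::Logger::File->new(
        file_name => '/var/log/courier-filter-perl.log',
        timestamp => 1
    ),
    modules => [
        # Assumed syntax: a nested array reference forms a module group.
        [
            # Assumed "inverse" option: with inverse polarity, a match is
            # meant to let the message skip the rest of this group.
            Courier::Filter::Module::Header->new(
                fields  => { subject => qr/\[trusted-list\]/ },
                inverse => 1
            ),
            # Normal polarity: a match rejects the message.
            Courier::Filter::Module::Header->new(
                fields   => { subject => qr/fuzzybuzzy/ },
                response => 'No fuzzybuzzy, please!'
            )
        ]
    ]
};
```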
Third, you need to make Courier aware of Courier::Filter by installing a symlink
in "/etc/courier/filters/active/" pointing to the
"courier-filter-perl" executable (which is used for bootstrapping
Courier::Filter):
$ ln -s $PATH_TO/courier-filter-perl /etc/courier/filters/active/
Finally, you may start (or restart) Courier::Filter (including any other
installed courierfilters; Courier must of course already be running):
$ sudo courierfilter restart
In syslog, you should see the following message and no further error messages:
Jan 24 01:42:15 yourhost courierfilter: Starting courier-filter-perl
Any errors occurring while Courier::Filter is running will appear in syslog as
well. A broken filter module will not crash Courier::Filter, but will record
any Perl error messages in syslog, and reject incoming mail messages with a
temporary status code, so as not to enable attackers to circumvent the
configured Courier::Filter mail filtering.
Filter modules are Perl classes that are derived from the class
Courier::Filter::Module. See perlobj for an explanation of Perl's
object orientation features.
Filter modules are to be instantiated in the "courier-filter-perl.conf"
configuration file, with either normal polarity (the default) or inverse
polarity. Then, for every incoming mail message, Courier::Filter asks each
configured filter module in turn for consideration of the message's
acceptability.
Every module tries to match its filter criteria against the current message,
yielding a so-called match result, which can be either an explicit match, an
implicit mismatch, or an explicit mismatch. (Filter modules usually never
return an explicit mismatch, but only an implicit one; see "Writing filter
modules" if you want to know why.)
According to the filter module's polarity, the match result is then translated
into a so-called acceptability result, which can be either an explicit reject,
an implicit accept, or an explicit accept.
This is how match results are translated into acceptability results under
normal and inverse polarity:
polarity | match result | acceptability result
----------+-------------------+----------------------
| explicit match | explicit reject
normal | implicit mismatch | implicit accept
| explicit mismatch | explicit accept
----------+-------------------+----------------------
| explicit match | explicit accept
inverse | implicit mismatch | implicit accept
| explicit mismatch | explicit reject
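The table can also be written down as a plain lookup structure. The following is a minimal, self-contained sketch; the names "%ACCEPTABILITY" and "acceptability" are hypothetical and do not reflect how Courier::Filter implements the translation internally.

```perl
use strict;
use warnings;

# Lookup table mirroring the polarity table above.
my %ACCEPTABILITY = (
    normal => {
        'explicit match'    => 'explicit reject',
        'implicit mismatch' => 'implicit accept',
        'explicit mismatch' => 'explicit accept',
    },
    inverse => {
        'explicit match'    => 'explicit accept',
        'implicit mismatch' => 'implicit accept',
        'explicit mismatch' => 'explicit reject',
    },
);

# Hypothetical helper: translate a match result into an acceptability result.
sub acceptability {
    my ($polarity, $match_result) = @_;
    my $table = $ACCEPTABILITY{$polarity}
        or die "Unknown polarity: $polarity\n";
    my $result = $table->{$match_result}
        or die "Unknown match result: $match_result\n";
    return $result;
}

print acceptability('normal',  'explicit match'), "\n";
print acceptability('inverse', 'explicit match'), "\n";
```

Note that an implicit mismatch maps to an implicit accept under both polarities, which is why it expresses indifference.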
Generally, Courier::Filter interprets the acceptability result as follows:
- If a module states an explicit reject for the current message,
Courier::Filter aborts the consideration process and rejects the message.
- If a module states an implicit accept, Courier::Filter continues the
consideration process with the next module in turn.
- If a module states an explicit accept, Courier::Filter skips the rest of the
group of modules and assumes the whole group to be an implicit accept.
If no explicit reject has occurred when Courier::Filter finishes asking all
filter modules, the message is accepted.
(For details on how to use advanced filter module grouping, see the
description of the "modules" option in "new()" in Courier::Filter.)
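The three rules above can be modeled in a few lines. This is a simplified, self-contained sketch, not Courier::Filter's actual code: a "module" is represented by a code reference that returns an acceptability result, a "group" by an array reference, and the message by a plain string. The function names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical model: a "module" is a code ref returning an acceptability
# result; a "group" is an array ref of modules and/or nested groups.
sub consider_group {
    my ($group, $message) = @_;
    for my $member (@$group) {
        my $result = ref($member) eq 'ARRAY'
            ? consider_group($member, $message)   # nested group
            : $member->($message);                # single module
        return 'explicit reject' if $result eq 'explicit reject';  # abort
        last                     if $result eq 'explicit accept';  # skip rest of group
        # 'implicit accept': continue with the next member.
    }
    return 'implicit accept';   # the group as a whole is an implicit accept
}

# The message is accepted unless an explicit reject occurred.
sub message_accepted {
    my ($modules, $message) = @_;
    return consider_group($modules, $message) ne 'explicit reject';
}

my $whitelist = sub { $_[0] =~ /friend/ ? 'explicit accept' : 'implicit accept' };
my $blacklist = sub { $_[0] =~ /spam/   ? 'explicit reject' : 'implicit accept' };

# The whitelist only shields the modules in its own group.
my $modules = [ [ $whitelist, $blacklist ], $blacklist ];
for my $message ('hello', 'spam', 'friend spam') {
    printf "%-12s %s\n", $message,
        message_accepted($modules, $message) ? 'accepted' : 'rejected';
}
```

In this model, an explicit accept inside the inner group does not protect the message from the second blacklist module, because that module sits outside the group.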
Abstracting the concept of a "match" from the concept of
"acceptance" makes it possible to use filter modules with normal
polarity for "black-listing" certain message characteristics, and
filter modules with inverse polarity for "white-listing", while
still allowing all modules to be written in a uniform sense of logic. That is,
there are no dedicated "accepting" and "rejecting"
modules, but only "matching" modules. (E.g. there are no
"HeaderAccept" and "HeaderReject" modules, but only a
"Header" module.)
The main objective of Courier::Filter is to make it very easy to write new
filter modules, so while the previous section described how filter modules
work in general, we will now look at the details of writing your own filter
modules. From here on you really should know what you are doing, so if you are
not familiar with Perl's object orientation features, now is the time to read
perlobj plus any documents referenced from there.
As already mentioned, filter modules are Perl classes derived from the class
Courier::Filter::Module, which is an abstract base class and thus
cannot be instantiated itself.
To ask a filter module for consideration of the message, Courier::Filter calls
"$module->consider()", passing a Courier::Message object.
"$module->consider()" (if not overridden from Courier::Filter::Module) then
calls "$module->match()", passing through the message object.
The "match()" method really is where a filter module decides whether a
message matches the filter criteria, and this is usually the only method of
Courier::Filter::Module that needs to be overridden. That method may use
any configuration information from the filter module object (see
Courier::Filter::Module, and of course your own class), and any information
from the message object (see Courier::Message).
If a filter module wants to call external commands using "system()",
or functions from Perl modules that directly operate on files, it can
efficiently bypass the message and control files processing features of
Courier::Message by using the message object's "control_file_names"
and "file_name" properties only.
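The following self-contained sketch shows one way such an external check could be wired up. "FakeMessage" is a stand-in for Courier::Message that models only the "file_name" accessor; "external_check_matches" is a hypothetical helper; and the convention that the external command signals a match through a non-zero exit status is an assumption made for this example. The running perl interpreter stands in for a real scanner.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for a Courier::Message object; only the "file_name" accessor is
# modeled here.
package FakeMessage {
    sub new       { my ($class, %args) = @_; return bless {%args}, $class }
    sub file_name { return $_[0]{file_name} }
}

# Hypothetical check: run an external command on the raw message file and
# treat a non-zero exit status as a match (an assumption of this sketch).
sub external_check_matches {
    my ($message, @command) = @_;
    # List-form system() bypasses the shell, so the file name needs no quoting.
    my $status = system(@command, $message->file_name);
    die "Cannot run $command[0]: $!\n" if $status == -1;
    return ($status >> 8) != 0;
}

# Create a throwaway message file for the demonstration.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "Subject: test\n\nbody\n";
close $fh;

my $message = FakeMessage->new(file_name => $path);

# $^X (the running perl) stands in for a real scanner: it exits with status 1
# as soon as it sees a "Subject:" line.
my @scanner = ($^X, '-ne', 'exit 1 if /^Subject:/');
print external_check_matches($message, @scanner) ? "match\n" : "no match\n";
```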
Finally, after the message has been examined, "match()" must return a
match result of...
- true
- if the module wants to state an explicit match (the
first return value being the SMTP status response text, an optional
second one being the SMTP status response code),
- undef
- if the module wants to state an implicit mismatch,
that is, indifference of whether the message should be accepted or
rejected,
- false
- if the module wants to state an explicit mismatch.
"consider()" then translates the match result into an acceptability result as
described in "How filter modules work".
"match()" should usually not return an explicit mismatch (false), but an
implicit one (undef) instead for the message to pass this filter module, while
still allowing any further modules to explicitly reject (under normal
polarity) or accept (under inverse polarity) the message. See the description
of the "modules" option in "new()" in Courier::Filter for
specifics on how Courier::Filter uses acceptability results.
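To make the three kinds of return value concrete, here is a small self-contained sketch of how a caller could tell them apart. "classify_match" is a hypothetical helper, not Courier::Filter's own "consider()" implementation, and the status code 550 is only an example value.

```perl
use strict;
use warnings;

# Hypothetical classifier: map the first return value of a match() method to
# one of the three match results.
sub classify_match {
    my ($result) = @_;
    return 'implicit mismatch' if !defined $result;   # undef
    return 'explicit mismatch' if !$result;           # defined but false
    return 'explicit match';                          # true: the response text
}

# Three toy match() bodies, one per kind of return value.
sub matching    { return ('Go away, spammer!', 550) }  # text plus optional code
sub indifferent { return undef }
sub mismatching { return 0 }

print classify_match(matching()),    "\n";
print classify_match(indifferent()), "\n";
print classify_match(mismatching()), "\n";
```

The distinction between undef and a defined false value is what separates indifference from an explicit mismatch, so the definedness test has to come first.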
Now let's see in practice how to write a simple filter module. For instance, we
will create a simple variant of the "Header" module that matches a
specified message header field against a specified string. Let's call it
"HeaderSimple".
First, we create a Perl module for the class
Courier::Filter::Module::HeaderSimple, with the file name
"Courier/Filter/Module/HeaderSimple.pm". (That is, you need to
install the file "HeaderSimple.pm" in the proper place in your Perl
include path.)
Second, in that Perl module, we state the package/class name, and the name of
the base class, which is usually Courier::Filter::Module:
package Courier::Filter::Module::HeaderSimple;
use base qw(Courier::Filter::Module);
Third, we override the "match()" method by defining a rudimentary
"match" sub:
sub match {
    my ($self, $message) = @_;
    # ...
}
The first argument of the "match()" method is (as usual in Perl's
object orientation model) the module object itself, which provides access to
its configuration options. The second argument is the message object that is
to be examined. The ellipsis ("...") is where we will place our own
filter logic.
Now, we expect our module to be instantiated like this:
Courier::Filter::Module::HeaderSimple->new(
    field    => 'subject',
    value    => 'viagra',
    response => 'Go away, spammer!'
)
which makes the configuration options available from the hash keys
"$self->{field}", "$self->{value}", and
"$self->{response}".
We want to test whether the configured header field of the message matches the
configured value, and if so, return the configured response, so we write:
return $self->{response}
    if $message->header($self->{field}) =~ m/\Q$self->{value}\E/;
return undef;  # otherwise.
That's it. This is what the complete filter module looks like:
package Courier::Filter::Module::HeaderSimple;
use base qw(Courier::Filter::Module);
sub match {
    my ($self, $message) = @_;
    return $self->{response}
        if $message->header($self->{field}) =~ m/\Q$self->{value}\E/;
    return undef;  # otherwise.
}
1;  # a module loaded via "use" must return a true value
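The module above can be exercised without a running Courier installation. The sketch below is self-contained and makes several assumptions: "FakeMessage" is a stand-in for Courier::Message that models only "header()" and returns undef for an absent field; "HeaderSimpleSketch" is a standalone copy of the module with its own trivial constructor instead of inheriting from Courier::Filter::Module; and the extra "defined" check is an addition of this sketch that avoids an "uninitialized value" warning when the field is missing.

```perl
use strict;
use warnings;

# Stand-in for Courier::Message: only a header() accessor is modeled, and it
# is assumed to return undef for an absent field.
package FakeMessage {
    sub new    { my ($class, %headers) = @_; return bless {%headers}, $class }
    sub header { return $_[0]{ lc $_[1] } }
}

# Standalone variant of HeaderSimple with its own constructor, so it runs
# without Courier::Filter::Module installed.
package HeaderSimpleSketch {
    sub new { my ($class, %options) = @_; return bless {%options}, $class }

    sub match {
        my ($self, $message) = @_;
        my $header = $message->header($self->{field});
        return undef unless defined $header;    # absent field: implicit mismatch
        return $self->{response}
            if $header =~ m/\Q$self->{value}\E/;
        return undef;                           # otherwise
    }
}

my $module = HeaderSimpleSketch->new(
    field    => 'subject',
    value    => 'viagra',
    response => 'Go away, spammer!'
);

for my $message (
    FakeMessage->new(subject => 'Cheap viagra here'),
    FakeMessage->new(subject => 'Meeting notes'),
    FakeMessage->new(from    => 'someone@example.org'),
) {
    my $result = $module->match($message);
    print defined $result ? "explicit match: $result\n" : "implicit mismatch\n";
}
```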
You may dry-test filter modules using the "test-filter-module"
utility. See its manpage for details.
You may also switch any or all installed filter modules into "testing"
mode so you can test them without risking messages being actually rejected.
See "new()" in Courier::Filter and "new()" in Courier::Filter::Module.
The following prepared filter modules are included with this version of
Courier::Filter:
- BlankBody
- Detection of messages with blank bodies (symptom of stupid
spammers)
- DNSBL
- Checking of the calling MTA's IP address against one or
more DNS black-lists
- SPF
- SPF (Sender Policy Framework) authorization checking of the
calling MTA's IP address against the envelope sender domain (classic
inbound SPF checking)
- SPFout
- SPF authorization checking of the local system's IP address
against the envelope sender domain (so-called outbound SPF checking)
- Envelope
- Literal and reg-exp matching of one or more RFC 2821
message envelope fields
- Header
- Literal and reg-exp matching of one or more RFC 2822
message header fields
- FakeDate
- Detection of implausible and malformed "Date" and
"Resent-Date" header fields
- ClamAVd
- Malware detection using the ClamAV anti-virus scanner
- SpamAssassin
- Spam detection using SpamAssassin
- Parts
- MIMEParts (DEPRECATED)
- Size and MD5 sum matching of message (MIME multipart and
ZIP archive) parts
- SendCopy
- Pseudo-filter for sending message copies to additional
recipients
The following prepared loggers are included with this version of
Courier::Filter:
- IOHandle
- Logging to I/O handles
- Syslog
- Logging to syslog (based on the IOHandle
logger)
- File
- Logging to files (based on the IOHandle logger)
courier-filter-perl, test-filter-module, Courier::Filter,
Courier::Filter::Module, Courier::Message, Courier::Config
- The courierfilter interface
- <http://www.courier-mta.org/courierfilter.html>
- courierperlfilter
- <http://www.courier-mta.org/courierperlfilter.html>
- pythonfilter
- <http://phantom.dragonsdawn.net/~gordon/courier-patches/courier-pythonfilter/>
The latest version of Courier::Filter is available on CPAN and at
<http://www.mehnle.net/software/courier-filter>.
Support is usually (but not guaranteed to be) given by the author, Julian
Mehnle, preferably through the Courier MTA's courier-users mailing list, which
is subscribable through
<http://lists.sourceforge.net/lists/listinfo/courier-users>.
Courier::Filter is Copyright (C) 2003-2008 Julian Mehnle. All rights reserved.
Courier::Filter is free software. You may use, modify, and distribute it under
the same terms as Perl itself, i.e. under the GNU GPL or the Artistic
License.