K9 E-mail Filter and Blacklist

This page exists solely to provide links related to Robin Keir's excellent and free K9 E-mail Filter, which may be of use to our clients.

About this Blacklist

A blacklist is, technically, a list of senders from whom you do not wish to receive e-mail messages. In K9, this meaning is extended significantly: a K9 blacklist is a list of rules - relating to sender, subject, content, etc. - defining messages you do not wish to receive.

My blacklist file is provided at the link above, in the hope that it may be useful. I receive nearly 500 messages per day, of which approximately 85% are spam and 13% come from several mailing lists to which I subscribe. That leaves only 2% of my traffic as non-mailing-list "ham," or good e-mails. The blacklist I use catches 86.0% of my spam (77.7% of total message volume), leaving only 14.0% of the spam (12.7% of total volume) and a very small amount of non-spam for statistical processing by K9's Bayesian filter.
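As a rough check on how the "share of spam" and "share of total volume" figures relate, the conversion can be sketched as below. The two inputs are the April 1 percentages from the stats further down this page (90.44% spam, 77.74% blacklist matches); the function name is made up for illustration, and the calculation assumes that every message matching a blacklist rule really was spam.

```python
# Percentages of ALL mail, taken from the April 1 column of the stats on this page.
spam_share = 90.44        # mail classified as spam
blacklist_share = 77.74   # mail that matched a blacklist rule


def share_of_spam(blacklist_pct, spam_pct):
    """Blacklist catch rate as a percentage of spam.

    Assumption: every blacklist match is spam (no blacklisted ham).
    """
    return 100.0 * blacklist_pct / spam_pct


caught = share_of_spam(blacklist_share, spam_share)
left_of_spam = 100.0 - caught                   # spam the blacklist missed
left_of_total = spam_share - blacklist_share    # same spam, as a share of all mail

print(f"{caught:.1f}% of spam caught by the blacklist")
print(f"{left_of_spam:.1f}% of spam ({left_of_total:.1f}% of total) left for the Bayesian filter")
```

Under that assumption the numbers line up with the 86.0%, 14.0%, and 12.7% quoted above.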

[NOTE: The above information has been changing rapidly. My spam mix has gotten significantly less predictable, so the blacklist's catch rate is currently down from its peak of over 90% of my spam. That's the bad news; the good news is that I am getting fewer spam messages in total (down from almost 600/day at the first of the year to about 490 now), and spammers are apparently having to get a little more creative. The spam tactics that are currently popular are still no match for a Bayesian filter, but at least a few spammers appear to be getting a little more devious.]

Overall, if you train K9 properly, you should be able to reach 99.5% accuracy or better, with only about 0.1% false positives. With this blacklist and a corresponding whitelist, I am getting far better than 99.9% accuracy. These are just my statistics; your results may vary, depending on the composition of your e-mail traffic. Actual stats through 11:48 AM, 4/24/2004:

Statistic                                                  Since Thu Apr 01 2004 02:30:03 PM  Since Fri Feb 13 2004 12:45:25 PM
                                                           (23 days)                          (71 days)
Total number of emails processed                           11,116 (486/day)                   31,049 (438/day)
Number of Good emails processed                            1,063 (9.56%)                      4,511 (14.53%)
Number of Spam emails processed                            10,053 (90.44%)                    26,538 (85.47%)
Percentage of emails that matched whitelist rules          8.88%                              13.83%
Percentage of emails that matched blacklist rules          77.74%                             63.76%
Number of emails re-classified to Good                     0                                  2
Number of emails re-classified to Spam                     2                                  12
Percentage emails misidentified as Spam (false positives)  0.000%                             0.023%
Percentage emails misidentified as Good (false negatives)  0.018%                             0.039%
Overall accuracy                                           99.982%                            99.955%
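The "Overall accuracy" figures appear to be consistent with counting every manual re-classification as one error. The sketch below reproduces them on that assumption; the formula is a guess that happens to fit the reported numbers, not something confirmed from K9's documentation, and the function name is made up for illustration.

```python
def overall_accuracy(total, reclassified_to_good, reclassified_to_spam):
    """Accuracy in percent, assuming each manual re-classification is one error."""
    errors = reclassified_to_good + reclassified_to_spam
    return 100.0 * (1 - errors / total)


# Counts copied from the stats above.
since_april_1 = overall_accuracy(11116, 0, 2)
since_february_13 = overall_accuracy(31049, 2, 12)

print(f"Since April 1:     {since_april_1:.3f}%")
print(f"Since February 13: {since_february_13:.3f}%")
```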

DISCLAIMER: These stats are SLIGHTLY inflated through my whitelisting policy (though not intentionally), as three special cases were false positives not reflected in these stats:

If you want to be really picky, then, my accuracy since April 1 was actually about 99.901%. All of the above cases are now handled correctly, though; this goes to show the importance of using a whitelist if you're going to use an aggressive blacklist.

The blacklist offered makes extensive use of regular expressions, or regexes. You need to download the PCRE Regular Expression engine in order to use regexes in K9.
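To give a feel for what regex-based rules do, here is a minimal sketch in Python. It uses Python's built-in `re` module as a stand-in for PCRE, and the rule layout (a field name paired with a pattern) is purely hypothetical: it is not K9's blacklist file syntax, and the patterns and the `spam-domain.example` address are invented for illustration.

```python
import re

# Hypothetical rules: (message field, compiled regex). NOT K9's actual file format.
RULES = [
    ("subject", re.compile(r"v[i1!]agra", re.IGNORECASE)),
    ("from", re.compile(r"@([a-z0-9-]+\.)*spam-domain\.example$", re.IGNORECASE)),
    ("body", re.compile(r"\bact\s+now\b", re.IGNORECASE)),
]


def first_matching_rule(message, rules=RULES):
    """Return (field, pattern text) for the first rule that matches, or None."""
    for field, pattern in rules:
        if pattern.search(message.get(field, "")):
            return field, pattern.pattern
    return None


spam = {"from": "offers@mail.spam-domain.example", "subject": "Cheap V1agra", "body": ""}
ham = {"from": "friend@example.org", "subject": "Lunch on Friday?", "body": "See you then."}

print(first_matching_rule(spam))  # matches the subject rule first
print(first_matching_rule(ham))   # no rule matches
```

A single pattern such as `v[i1!]agra` covers several obfuscated spellings at once, which is the main reason to prefer regexes over plain string matches in a blacklist.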

Note: I have not provided my personal whitelist, because it wouldn't do anybody else any good. The people, groups, and lists whose e-mails I always want are most likely not those people, groups, and lists that interest you. The best way to use this blacklist is by developing your own whitelist, using the context menus (right click menus) in K9.
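The way a whitelist protects against an aggressive blacklist can be sketched as a simple ordering of checks. This is an illustration only: it assumes whitelist rules are consulted before blacklist rules and that anything matching neither falls through to the statistical filter. The rule layout, the addresses, and the `statistical_filter` stub are all hypothetical.

```python
import re


def matches_any(message, rules):
    """True if any (field, regex) rule matches the message."""
    return any(
        re.search(pattern, message.get(field, ""), re.IGNORECASE)
        for field, pattern in rules
    )


def classify(message, whitelist, blacklist, statistical_filter):
    """Assumed order: whitelist first, then blacklist, then the statistical filter."""
    if matches_any(message, whitelist):
        return "good"
    if matches_any(message, blacklist):
        return "spam"
    return statistical_filter(message)


whitelist = [("from", r"@lists\.example\.org$")]   # a mailing list you always want
blacklist = [("subject", r"mortgage")]             # an aggressive subject rule


def fallback(message):
    """Stand-in for a trained Bayesian filter."""
    return "good"


list_mail = {"from": "digest@lists.example.org", "subject": "Mortgage rates thread"}
junk_mail = {"from": "deals@example.net", "subject": "Lowest mortgage rates!!!"}

print(classify(list_mail, whitelist, blacklist, fallback))  # whitelist wins
print(classify(junk_mail, whitelist, blacklist, fallback))  # blacklist catches it
```

Without the whitelist entry, the mailing-list message would be caught by the subject rule and become a false positive.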

 

all content ©1999-2008 Ed Cottrell
Web Design by Topsail Consulting
Last Updated: July 25, 2008