Stopping SPAM with Bayesian Filters OS X Support
To do a better job of spotting spam, Mr Graham came up with a different technique that means he hardly ever sees junk mail anymore. "For me and all my friends spam is a solved problem."
Everything I read points to Bayesian Filters as the way to go for SPAM filtering.
If you go that route, you will need a lot of SPAM and a lot of known-good email to train the filter. If you are considering this method, start saving your SPAM now: put it in a separate folder, or ask your employees to forward all SPAM to a central address set aside for it.
a comprehensive, flexible email management solution enabling organizations to take control of their email traffic by protecting against spam and virus threats while enforcing other vital communication policies at the email gateway.
POPFile is an automatic mail classification tool. Once properly set up and trained, it will work in the background of your computer, scanning mail as it arrives and filing it however you wish. You can give it a simple job, like separating out junk e-mail, or a complicated one - like filing mail into a dozen folders. Think of it as a personal assistant for your inbox.
CRM114 - the Controllable Regex Mutilator
CRM114 is a system to examine incoming e-mail, system log streams, data files or other data streams, and to sort, filter, or alter the incoming files or data streams according to whatever the user desires. Criteria for categorization of data can be by satisfaction of regexes, by sparse binary polynomial matching with a Bayesian Chain Rule evaluator, or by other means. Accuracy of the SBPH/BCR classifier has been seen in excess of 99 per cent, for 1/4 megabyte of learning text. In other words, CRM114 learns, and it learns fast.
If I had to pick one, this would be on my short list.
SpamProbe - A Fast Bayesian Spam Filter
Fast, intelligent, automatic spam detector using Paul Graham style Bayesian analysis of word counts in spam and non-spam emails. Filtering adapts to personal tastes automatically. No manual rule creation required. Intended for use with procmail, maild [...] SpamProbe is known to compile and run on a wide range of *nix systems including GNU/Linux (RedHat and Debian), FreeBSD, Solaris, AIX, MacOS X, and Darwin. (If you compile and run SpamProbe on a system not mentioned here please notify me so that I can add it to the list!) SpamProbe requires another program to actually label and file spams. procmail, maildrop are popular systems for this purpose. You must have Berkeley DB installed on your computer before you can compile SpamProbe. The README file contains information about configuring procmail to work with SpamProbe.
Bogofilter is a mail filter that classifies mail as spam or ham (non-spam) by a statistical analysis of the message's header and content (body). The program is able to learn from the user's classifications and corrections.
The SpamBayes project is working on developing a Bayesian anti-spam filter, initially based on the work of Paul Graham. The major difference between this and other, similar projects is the emphasis on testing newer approaches to scoring messages. While most anti-spam projects are still working with the original Graham algorithm, we found that a number of alternate methods yielded a more useful response. This is documented on the background page.
See also Spambayes on Unix or GNU/Linux for help with OS X support.
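Several of the tools above describe scoring mail from word counts in spam and non-spam messages. Here is a minimal Python sketch of that idea, loosely following the approach in Paul Graham's essay. It is not the code of any tool listed here. The class name, the tokenizer, and the specific constants (doubling good counts, the 0.4 score for unseen words, the 0.01 to 0.99 clamp, the 15 most telling words) are assumptions chosen for illustration.

```python
import re
from collections import Counter

# Assumed tokenizer: runs of letters, digits, and a few punctuation marks.
TOKEN_RE = re.compile(r"[A-Za-z0-9$'-]+")


def tokenize(text):
    return [t.lower() for t in TOKEN_RE.findall(text)]


class BayesFilter:
    """Illustrative word-count spam scorer (hypothetical, not a real tool's API)."""

    def __init__(self, min_count=5):
        self.spam_counts = Counter()
        self.ham_counts = Counter()
        self.n_spam = 0
        self.n_ham = 0
        # Words seen fewer than min_count times are treated as unknown.
        self.min_count = min_count

    def train(self, text, is_spam):
        tokens = tokenize(text)
        if is_spam:
            self.spam_counts.update(tokens)
            self.n_spam += 1
        else:
            self.ham_counts.update(tokens)
            self.n_ham += 1

    def token_prob(self, token):
        b = self.spam_counts[token]
        g = 2 * self.ham_counts[token]  # doubled to bias against false positives
        if b + g < self.min_count:
            return 0.4  # assumed neutral-ish score for rarely seen words
        spam_freq = min(1.0, b / max(1, self.n_spam))
        ham_freq = min(1.0, g / max(1, self.n_ham))
        p = spam_freq / (spam_freq + ham_freq)
        return max(0.01, min(0.99, p))

    def score(self, text, top_n=15):
        # Score each distinct word, keep the ones furthest from 0.5,
        # then combine them into a single spam probability.
        probs = [self.token_prob(t) for t in set(tokenize(text))]
        probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
        spam_prod = 1.0
        ham_prod = 1.0
        for p in probs[:top_n]:
            spam_prod *= p
            ham_prod *= 1.0 - p
        return spam_prod / (spam_prod + ham_prod)


if __name__ == "__main__":
    f = BayesFilter(min_count=1)  # tiny corpus, so accept single sightings
    f.train("cheap pills buy now", True)
    f.train("meeting agenda for monday", False)
    print(f.score("buy cheap pills"))
    print(f.score("monday meeting agenda"))
```

Correcting a misclassified message is just another call to `train` with the right label, which is why these filters drift toward each user's own mail. A real corpus needs far more than a handful of messages, which is the reason to start saving SPAM and known-good mail early.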
See also Paul Graham's follow-up, Better Bayesian Filtering:
But the real advantage of individual filters is that they'll all be different. If everyone's filters have different probabilities, it will make the spammers' optimization loop, what programmers would call their edit-compile-test cycle, appallingly slow. Instead of just tweaking a spam till it gets through a copy of some filter they have on their desktop, they'll have to do a test mailing for each tweak. It would be like programming in a language without an interactive toplevel, and I wouldn't wish that on anyone.
This is a very in-depth list with practically no commentary, but for what it's worth: Server Side Spam Filtering
Can you tell me where I can obtain CRM114 to weed out spam from my e-mail inbox?
Also, please, how do I stop web-based spam? One very annoying spam comes in when I try to read my bank details. It's one of those pornographic things and I'm sick of receiving it.
Cheers
Bernard Hunt (UK)
Take a look at Armoring Spam Against Anti-Spam Filters: http://yro.slashdot.org/article.pl?sid=04/02/04/1457250&mode=flat&tid=111&tid=126&tid=95
It covers work to make filters more efficient by studying ways to trick existing methods. It looks like the war with spam is only starting...