Spam Detection Rates: What the Numbers Don’t Tell You

Anyone who has looked for an email anti-spam solution is probably familiar with spam capture rate statistics. You’ve no doubt seen claims such as “Blocks 99.9% of spam,” but what the capture rate doesn’t tell you will prove even more important to your overall satisfaction with a filter.

While some spam campaigns are innovative and can be difficult for filtering systems to catch, stopping most spam email is not too difficult. The real challenge is doing so without also filtering out the good mail that end users want to receive.

A spam filter’s claimed capture rate means nothing if you do not know its false-positive rate. False positives are good emails caught by the filter and marked as spam. A great capture rate will not be acceptable to the end user if it comes with a high false-positive rate. The cost of lost opportunities and delayed responses to legitimate mail will exceed the benefit of blocking the spam.
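To make the distinction concrete, here is a minimal sketch that computes both rates from the same set of counts. The class name FilterStats and all of the message counts are hypothetical, chosen only for illustration; they are not measurements from any real filter.

```python
from dataclasses import dataclass


@dataclass
class FilterStats:
    """Hypothetical counts for one filter over some period of time."""
    spam_total: int    # spam messages that arrived
    spam_caught: int   # spam messages the filter blocked
    ham_total: int     # legitimate messages that arrived
    ham_blocked: int   # legitimate messages wrongly marked as spam

    @property
    def capture_rate(self) -> float:
        # Share of all spam that was blocked.
        return self.spam_caught / self.spam_total

    @property
    def false_positive_rate(self) -> float:
        # Share of all legitimate mail that was wrongly blocked.
        return self.ham_blocked / self.ham_total


# Illustrative numbers only (assumed, not taken from the article).
stats = FilterStats(spam_total=10_000, spam_caught=9_990,
                    ham_total=5_000, ham_blocked=50)

print(f"capture rate:        {stats.capture_rate:.1%}")
print(f"false-positive rate: {stats.false_positive_rate:.1%}")
print(f"legitimate emails lost: {stats.ham_blocked}")
```

With these assumed counts the filter can honestly advertise a 99.9% capture rate while still having blocked 50 legitimate messages, a figure the headline number never reveals.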

This aspect of filtering is so important that false-positive rates are the single most important factor the OnlyMyEmail Anti-Spam team considers when evaluating new filtering tactics and techniques.

To further complicate matters, not all false positives are created equal. As long as users can easily train the filter to allow such emails in the future, they are fairly tolerant of a filter that blocks commercial emails such as Costco, Expedia or Bank of America advertisements. They won’t even complain much if you occasionally trap high-volume newsletters from ESPN, Motley Fool or MSNBC.

But if your filtering blocks legitimate user-to-user emails or critical automated e-commerce emails, the solution will be deemed unacceptable. Because of this distinction, capture rates must be balanced not only against false-positive rates in general, but against “critical” false-positive rates as well.
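One way to track this distinction is to report the critical false-positive rate separately from the overall rate. The sketch below assumes each wrongly blocked legitimate message has already been given a category label; the category names, the CRITICAL set and the counts are hypothetical.

```python
from collections import Counter

# Assumed labels for the kinds of mail the article treats as critical:
# user-to-user mail and automated e-commerce mail.
CRITICAL = {"personal", "transactional"}


def false_positive_breakdown(blocked_legit_categories, legit_total):
    """Return overall and critical false-positive rates.

    blocked_legit_categories: one category label per legitimate
    message that the filter wrongly blocked.
    legit_total: total number of legitimate messages received.
    """
    counts = Counter(blocked_legit_categories)
    overall = sum(counts.values())
    critical = sum(n for cat, n in counts.items() if cat in CRITICAL)
    return {
        "overall_rate": overall / legit_total,
        "critical_rate": critical / legit_total,
    }


# Illustrative data only: 50 false positives, 2 of them critical.
blocked = (["newsletter"] * 40 + ["advertisement"] * 8
           + ["personal"] * 1 + ["transactional"] * 1)
result = false_positive_breakdown(blocked, legit_total=5_000)
print(result)
```

Two filters with the same overall rate can differ sharply on the critical rate, and it is the critical rate that decides whether users find the filter acceptable.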

Finally, in order to provide a positive end-user experience, the filtering system needs to be able to accommodate individual end-user preferences with minimal human intervention. Because not all users will agree on a single definition of spam, the best solutions allow input from users as to the types of email they individually want to receive, as well as those they want blocked.

For example, if you ask a handful of users about the types of commercial emails referenced above, those from Costco, Expedia or ESPN, you will find that while many consider such emails valid, others consider them spam. Because of this, no filtering solution can provide a meaningful level of end-user satisfaction unless it can accommodate, on an individual level, which users consider which types of commercial mail to be spam and which users actually prefer to receive those kinds of messages.
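As a rough illustration of per-user preferences, the sketch below makes the delivery decision depend on both the message category and the individual recipient. The user names, category labels and the should_deliver function are all hypothetical, and a real system would learn these preferences from user feedback instead of a hard-coded table.

```python
# Hypothetical preferences: categories each user has asked to have blocked.
user_blocked_categories = {
    "alice": {"advertisement"},   # alice treats retail ads as spam
    "bob": set(),                 # bob wants to receive them
}


def should_deliver(user, category):
    """Decide delivery for one recipient.

    Mail classified as outright spam is blocked for everyone; other
    categories are blocked only for users who opted out of them.
    """
    if category == "spam":
        return False
    return category not in user_blocked_categories.get(user, set())


print(should_deliver("alice", "advertisement"))  # False
print(should_deliver("bob", "advertisement"))    # True
print(should_deliver("bob", "spam"))             # False
```

The same advertisement is delivered to one user and blocked for the other, which is exactly the outcome a single global definition of spam cannot produce.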

Some systems attempt to provide end-user control by simply offering “Whitelists” and “Blacklists.” In the real world, these prove to be nothing more than error-prone band-aids, because they cannot distinguish between legitimate emails from Support@Ebay.com, for example, and those from spammers who simply “spoof” the same address.
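The weakness is easy to see in a sketch of a naive whitelist that looks only at the From header. The function naive_whitelist_allows and the two sample messages are hypothetical; the point is simply that a sender can write any address into that header, so the check returns the same answer for both.

```python
from email.message import EmailMessage
from email.utils import parseaddr

# A whitelist keyed only on the sender address (assumed design).
WHITELIST = {"support@ebay.com"}


def naive_whitelist_allows(msg):
    """Allow the message if its From address is on the whitelist."""
    _, address = parseaddr(str(msg["From"]))
    return address.lower() in WHITELIST


# A genuine message from the whitelisted address.
genuine = EmailMessage()
genuine["From"] = "eBay Support <Support@Ebay.com>"
genuine["Subject"] = "Your recent order"

# A spoofed message: the spammer typed the same address into From.
spoofed = EmailMessage()
spoofed["From"] = "eBay Support <Support@Ebay.com>"
spoofed["Subject"] = "Urgent: verify your account"

print(naive_whitelist_allows(genuine))  # True
print(naive_whitelist_allows(spoofed))  # True
```

Because the header is the only evidence this check consults, the spoofed message is waved through alongside the genuine one.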

The ability of a spam solution to make decisions based on end-user preferences, with significant focus on not only blocking spam but on preventing false-positives, and to do so with minimal end-user effort, is what makes spam filtering such a complicated business.
