What’s An Email Address Collector?

Spam is a volume business. The spammer who sends the most spam to the most addresses wins, so spammers continually need to find lots of new email addresses.

An email address collector (a.k.a. email address extractor, harvester or scraper) is a software tool used by spammers to crawl the web looking for email addresses.

How Does It Work?

Email address harvesters work much like the web crawlers that search engines use to index the web. Crawler software starts with a given web page, visits every page linked from that page, then every page linked from each of those pages, and so on until it is stopped or runs out of links.
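
The crawling loop described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular harvester is written: the LINKS table is a made-up, in-memory stand-in for real web pages, and the crawl function and its max_pages limit are hypothetical names chosen for this example.

```python
from collections import deque

# Hypothetical in-memory "web": each page maps to the pages it links to.
LINKS = {
    "/home": ["/about", "/contact"],
    "/about": ["/home", "/team"],
    "/contact": [],
    "/team": ["/contact"],
}

def crawl(start, get_links, max_pages=1000):
    """Visit pages breadth-first until no links are left
    or max_pages pages have been visited."""
    visited = []
    seen = {start}
    queue = deque([start])
    while queue and len(visited) < max_pages:
        page = queue.popleft()
        visited.append(page)
        for link in get_links(page):
            if link not in seen:  # skip pages already queued or visited
                seen.add(link)
                queue.append(link)
    return visited

pages = crawl("/home", lambda page: LINKS.get(page, []))
print(pages)
```

The seen set is what keeps the crawler from looping forever when pages link back to each other, as /about and /home do here. A real crawler would replace the lambda with code that downloads a page and pulls out its links.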

A search engine crawler also records various pieces of data about each page, such as word frequencies, the links it contains and how old it is. An address scraper is only interested in email addresses. It searches each page for character strings containing ‘@’ and ‘.’ (and, if it’s really smart, ‘at’ and ‘dot’). When it finds these two characters in the right order (and any other criteria it applies are met), it saves the string containing them to the spammer’s database as an address.
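
As a rough sketch of that search step, the following Python uses two regular expressions: one for plain addresses and one for the spelled-out ‘at’ and ‘dot’ form. Both patterns are simplified assumptions made for illustration. Real address syntax is more permissive, and real harvesters may apply quite different rules.

```python
import re

# Simplified pattern: local part, '@', then a domain with at least one dot.
PLAIN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

# "Smart" variant: also accept the words 'at' and 'dot' set off by spaces.
SPELLED = re.compile(
    r"([A-Za-z0-9._%+-]+)\s+at\s+([A-Za-z0-9-]+)\s+dot\s+([A-Za-z]{2,})",
    re.IGNORECASE,
)

def harvest(text):
    """Return the sorted, de-duplicated addresses found in text."""
    found = set(PLAIN.findall(text))
    for local, domain, tld in SPELLED.findall(text):
        found.add(f"{local}@{domain}.{tld}")
    return sorted(found)

page = "Write to bob@example.com or alice AT example DOT org. Meet at noon."
print(harvest(page))
```

Requiring both ‘at’ and ‘dot’ in the right order is what keeps the ordinary phrase "Meet at noon" from being mistaken for an address.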

Evading Email Address Collectors

The surest way to avoid having your email address harvested is to make sure it never appears on any web page. That sounds easy enough, but it can be tough in practice.

Your address can be exposed by:

  1. Profiles and/or signatures for public forums, blog comments, social networking sites and any sites that display profiles.
  2. Contact info on web sites for work, church groups, clubs, schools and other organizations.
  3. Your own web site(s) (personal and/or business).

Each of these requires a different response.

Profiles and Signatures

Sites that include email addresses in profile information offer the least control. In the best cases the site will allow you to opt out of email address display. However, sometimes you want the address to show so people can contact you.

If the site doesn’t check the address for validity you can use obfuscation like the following examples:

  • myaddress @ mydomain . com: Just adding spaces will confound a lot of address collectors, and probably a lot of humans too, because it looks so normal that they may copy it, spaces and all.
  • myaddress AT mydomain DOT com: Most humans will figure this out; many email extractors will not, although smarter ones also look for ‘at’ and ‘dot’.
  • myaddress replace with @ mydomain replace with . com: This is more obscure and will defeat even more harvesters, and probably a few humans too.
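
To see why these forms slip past a simple harvester, here is a small Python sketch that builds the three variants and tests each against a naive address pattern. The obfuscate function is a hypothetical helper, and the NAIVE pattern is only an assumption about how a basic collector matches addresses.

```python
import re

def obfuscate(address):
    """Return the three obfuscated forms listed above."""
    local, domain = address.split("@", 1)
    name, tld = domain.rsplit(".", 1)
    return [
        f"{local} @ {name} . {tld}",
        f"{local} AT {name} DOT {tld}",
        f"{local} replace with @ {name} replace with . {tld}",
    ]

# Naive harvester pattern (an assumption about how a simple one works).
NAIVE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

forms = obfuscate("myaddress@mydomain.com")
for form in forms:
    print(form, "->", "harvested" if NAIVE.search(form) else "missed")
```

All three forms are missed by this pattern, while the unmodified address is matched. A harvester that strips spaces or looks for the words ‘AT’ and ‘DOT’ would recover the first two.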

If you are required to supply a real email address, you can either not use the service or use a throwaway address. If you need people to be able to reach you at that address, you’ll either have to pick through the spam or get a good spam filtering service.

Organization Web Sites

Keeping organizations from publishing your address is trickier. You have to know how their web presence is managed, who manages it and what their policies are regarding the privacy of contact information. You never know when the club secretary might decide that everybody’s contact info should be on the web site.

You can also use throwaway addresses for this.

Your Own Site(s)

This is where you have the most control. Email addresses can be obfuscated as described above or by using a tool such as the OnlyMyEmail Encoder. For this to work you need to be able to edit your web site or have someone competent do it for you.
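
One common encoding technique is to write every character of the address as an HTML character entity, so the page source never contains a literal ‘@’. The Python below is a sketch of that general idea only. Whether the OnlyMyEmail Encoder works this way is not stated here, and encode_address is a hypothetical name.

```python
import html

def encode_address(address):
    """Encode each character as a decimal HTML entity, e.g. '@' -> '&#64;'."""
    return "".join(f"&#{ord(ch)};" for ch in address)

encoded = encode_address("me@example.com")
print(encoded)

# A browser decodes the entities when it renders the page;
# html.unescape performs the same decoding here.
print(html.unescape(encoded))
```

Visitors see the normal address because the browser decodes it, but a harvester that scans the raw page source for ‘@’ finds nothing. A harvester that decodes entities before searching will still find the address, so this only stops the simpler tools.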

You can also use contact forms that allow communication from your web site via CGI scripts. These scripts can be vulnerable to bots that distribute spam by filling out web forms, so you need to either write or find a good one.
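
As a rough idea of what a careful form script might check, the Python sketch below screens a submission before any mail is sent. It assumes two widely used defenses: a hidden "honeypot" field that only bots fill in, and rejecting line breaks in fields that end up in mail headers. The field names (name, email, message, website) and the check_submission function are hypothetical, and a production script would need more than this.

```python
def check_submission(form):
    """Return a list of reasons to reject a contact-form submission.

    form is a dict mapping field names to submitted strings.
    """
    problems = []
    # Honeypot: 'website' is hidden from humans with CSS, so only bots fill it.
    if form.get("website", ""):
        problems.append("honeypot field filled")
    # Line breaks in header-bound fields can inject extra mail headers.
    for field in ("name", "email"):
        value = form.get(field, "")
        if "\r" in value or "\n" in value:
            problems.append(f"newline in {field}")
    if not form.get("message", "").strip():
        problems.append("empty message")
    return problems

good = {"name": "Ann", "email": "ann@example.com", "message": "Hi"}
bad = {"name": "x\nBcc: victim@example.com", "email": "a@b.c",
       "message": "buy now", "website": "http://spam.example"}
print(check_submission(good))
print(check_submission(bad))
```

The first submission passes with no problems reported. The second is rejected twice over: it fills in the hidden field and tries to smuggle a Bcc header in through the name field.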

