Friday, June 13, 2008

Greylisting Whitelist for Gmail via SPF

[update: 10/14/2009]
I just noticed that Google provides this same technique on a Google Apps help page. Here is the link:

http://www.google.com/support/a/bin/answer.py?hl=en&answer=60764
--------

In the ever-escalating arms race between spammers and hosting providers, greylisting was, at one point, an extremely effective weapon against spam. These days it is not as effective, because spamming hosts now correctly handle the temporary error condition used by a greylisting server and simply retry the delivery. But it is still a good tool and first line of defense when used correctly.

Of course, one of the biggest problems with greylisting comes from large email hosting providers, like Gmail, Yahoo, Hotmail, etc. These services use multiple servers to deliver their mail (they have to), and as such, when a message is returned as a temporary failure, it's not guaranteed that the same sending server will attempt to deliver the message a second time.
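To make the problem concrete, here is a toy sketch of a greylist in Python. It assumes the greylist is keyed on the (client IP, sender, recipient) triplet and uses a fixed delay; the class name, the 300 second delay, and the addresses are made up for illustration, and real greylisting servers differ in the details.

```python
import time


class Greylist:
    """Toy greylist keyed on the (client IP, sender, recipient) triplet."""

    def __init__(self, delay_seconds=300):
        self.delay_seconds = delay_seconds
        self.first_seen = {}  # triplet -> time of the first delivery attempt

    def check(self, client_ip, sender, recipient, now=None):
        """Return "accept" or "tempfail" for a single delivery attempt."""
        now = time.time() if now is None else now
        triplet = (client_ip, sender, recipient)
        # Remember when this triplet was first seen (no-op if already known).
        first = self.first_seen.setdefault(triplet, now)
        if now - first >= self.delay_seconds:
            return "accept"
        return "tempfail"


gl = Greylist(delay_seconds=300)
sender, rcpt = "someone@gmail.com", "me@example.org"

first_try = gl.check("209.85.128.10", sender, rcpt, now=0)       # "tempfail"
same_host = gl.check("209.85.128.10", sender, rcpt, now=600)     # "accept"
other_host = gl.check("209.85.128.77", sender, rcpt, now=600)    # "tempfail"
```

The retry from the same IP is accepted, but the retry from a different server in the pool looks like a brand new triplet and gets deferred all over again.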

As such, the only thing to do is whitelist the servers used by these large hosting providers. Gmail is particularly hard to keep up with: presumably because of Google's ever-growing infrastructure, it is difficult to know which servers belong to the Gmail cloud at any given time.

I just stumbled upon a pretty decent way, however, to easily learn all of Gmail's outgoing mail server IP addresses. This information is readily available in their SPF records for their service.

Here is a link describing the SPF record that should be added to a domain whose email is routed through Google Apps: http://www.google.com/support/a/bin/answer.py?hl=en&answer=33786

You can see that Google specifies an "include" directive pointing at aspmx.googlemail.com. So, now we can do a TXT query on this domain, which gives us:

$ host -t txt aspmx.googlemail.com
aspmx.googlemail.com descriptive text "v=spf1 redirect=_spf.google.com"

The same query against googlemail.com and gmail.com returns the same information, too. So, following the redirect:

$ host -t txt _spf.google.com
_spf.google.com descriptive text "v=spf1 ip4:216.239.32.0/19 ip4:64.233.160.0/19 ip4:66.249.80.0/20 ip4:72.14.192.0/18 ip4:209.85.128.0/17 ip4:66.102.0.0/20 ip4:74.125.0.0/16 ?all"
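
The same walk (follow the redirect, then pull out the ip4 entries) can be sketched in Python. This is a simplified sketch, not a full SPF parser: it only handles plain ip4:, include: and redirect= terms, and it takes the TXT lookup as a function argument. The canned_lookup function below is a stand-in I made up that just returns the two records shown above; a real script would swap in an actual DNS query.

```python
# TXT records copied from the host output above (a snapshot, not live DNS).
RECORDS = {
    "aspmx.googlemail.com": ["v=spf1 redirect=_spf.google.com"],
    "_spf.google.com": [
        "v=spf1 ip4:216.239.32.0/19 ip4:64.233.160.0/19 ip4:66.249.80.0/20 "
        "ip4:72.14.192.0/18 ip4:209.85.128.0/17 ip4:66.102.0.0/20 "
        "ip4:74.125.0.0/16 ?all"
    ],
}


def canned_lookup(domain):
    """Hypothetical stand-in for a DNS TXT query."""
    return RECORDS.get(domain, [])


def ip4_ranges(domain, lookup_txt, max_depth=10):
    """Collect ip4: ranges for a domain, following include: and redirect=."""
    if max_depth <= 0:  # guard against redirect/include loops
        return []
    ranges = []
    for record in lookup_txt(domain):
        if not record.startswith("v=spf1"):
            continue  # not an SPF record
        for term in record.split()[1:]:
            if term.startswith("ip4:"):
                ranges.append(term[len("ip4:"):])
            elif term.startswith("include:"):
                ranges += ip4_ranges(term[len("include:"):], lookup_txt, max_depth - 1)
            elif term.startswith("redirect="):
                ranges += ip4_ranges(term[len("redirect="):], lookup_txt, max_depth - 1)
    return ranges


gmail_ranges = ip4_ranges("aspmx.googlemail.com", canned_lookup)
print("\n".join(gmail_ranges))
```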

From this we can read off the following IP address ranges:

216.239.32.0/19
64.233.160.0/19
66.249.80.0/20
72.14.192.0/18
209.85.128.0/17
66.102.0.0/20
74.125.0.0/16
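
How you feed these ranges to your greylisting software depends on the software, but the check itself is simple. Here is a minimal sketch using Python's standard ipaddress module; the function name is mine, the test addresses are arbitrary examples, and the ranges are the snapshot listed above, which will go stale as Google changes its SPF record.

```python
import ipaddress

# Ranges taken from the SPF record above (a snapshot; re-query to refresh).
GMAIL_RANGES = [
    "216.239.32.0/19",
    "64.233.160.0/19",
    "66.249.80.0/20",
    "72.14.192.0/18",
    "209.85.128.0/17",
    "66.102.0.0/20",
    "74.125.0.0/16",
]
GMAIL_NETWORKS = [ipaddress.ip_network(r) for r in GMAIL_RANGES]


def is_whitelisted(client_ip):
    """Return True if client_ip falls inside any of the listed ranges."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in GMAIL_NETWORKS)


print(is_whitelisted("209.85.200.27"))  # inside 209.85.128.0/17
print(is_whitelisted("192.0.2.1"))      # documentation address, not listed
```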

I don't know if Google has officially published a list of all their mail servers to help those of us who use greylisting. However, the above technique seems to be a pretty slick way to determine which IP address ranges should be considered for whitelisting Gmail.

Well anyway, it was one of those "oh duh" type moments for me. Hopefully this helps someone else too.