Usage

Because SURBLs are not conventional RBLs, you can't use them directly in MTAs like Sendmail or Postfix, or in programs like SpamAssassin or procmail, to block spam sources the way you would with other RBLs. What's needed is code that can examine message bodies, extract any URI domains they contain, and compare those against an SURBL. To meet this need, Eric Kolve has kindly updated SpamCopURI, his plugin for SpamAssassin 2.63 and 2.64, adding a method that does lookups against SURBL instead of local database lookups against its independently cached copy of SpamCop data. Eric's revised SpamCopURI lets SA compare relatively raw spam URI domains against SURBL, which is exactly what we need. Message-body-aware MTAs such as Postfix and mail filtering programs such as procmail can probably also be modified to use SURBL on message body URIs. For example, Devin Carraway has written the first MTA use of SURBL we've heard of, the uribl plugin for qpsmtpd.
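The extraction step described above might be sketched roughly as follows. This is only an illustration in Python, not how SpamCopURI or the qpsmtpd plugin actually work; the names URI_PATTERN and extract_uri_hosts are hypothetical, and the pattern only catches plain http and https URIs in unencoded text.

```python
import re
from urllib.parse import urlsplit

# Hypothetical pattern: plain http/https URIs only. Real spam often hides
# URIs behind encodings and redirects that this sketch does not handle.
URI_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_uri_hosts(body):
    """Return the unique URI hosts found in a message body, in first-seen order."""
    hosts = []
    for match in URI_PATTERN.finditer(body):
        try:
            host = urlsplit(match.group(0)).hostname
        except ValueError:
            # Malformed URI (for example a broken bracketed address): skip it.
            continue
        if host:
            host = host.rstrip(".")
            if host and host not in hosts:
                hosts.append(host)
    return hosts


body = "Buy now at http://www.spamdomain.com/offer or http://10.20.30.40/x"
print(extract_uri_hosts(body))
```

Each host found this way would still need to be turned into a lookup key and checked against the list before it tells you anything about the message.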

The SpamAssassin 3 plug-in URIDNSBL, also known as "URIBL," has been updated to work with name-based RBLs through the new command urirhssub. Using urirhssub with SURBL means SA 3 users can now compare domain names in message bodies against domain names in SURBL. Preliminary results reported by Daniel Quinlan on the SA Developers list are very strong, with a 60% spam hit rate and near-zero false positives. These results are shown in the News section. Since urirhssub does not require any name resolution of the URI domain itself, it should also be significantly faster than other methods in URIBL.

The SpamCop-URI-derived SURBL can be found in sc.surbl.org. It includes both domain names and reversed-numeric addresses of SpamCop-reported spam sites in the standard formats: spamdomain.com.sc.surbl.org for name-based references, and 4.3.2.1.sc.surbl.org for numbered references. Matching references return an Address (A) Resource Record of 127.0.0.2 and a Text (TXT) Resource Record currently reading: "Message body contains SpamCop spamvertised domain." In other words, it looks like a typical RBL.
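As a rough illustration of the lookup just described, here is a minimal Python sketch. The function names surbl_query_name and is_listed are hypothetical, and the check assumes a listed entry answers with exactly 127.0.0.2 as stated above; it uses the system resolver and does not fetch the TXT record.

```python
import socket

SURBL_ZONE = "sc.surbl.org"  # zone named in the text above


def surbl_query_name(key, zone=SURBL_ZONE):
    """Build the DNS name to query for a domain or reversed-numeric key."""
    return f"{key.strip('.').lower()}.{zone}"


def is_listed(key, zone=SURBL_ZONE):
    """Return True if the zone answers with the 127.0.0.2 A record.

    Needs working DNS. Any resolution failure is treated as "not listed",
    which is an assumption of this sketch, not documented SURBL behavior.
    """
    try:
        address = socket.gethostbyname(surbl_query_name(key, zone))
    except socket.gaierror:
        return False
    return address == "127.0.0.2"


print(surbl_query_name("spamdomain.com"))  # spamdomain.com.sc.surbl.org
print(surbl_query_name("4.3.2.1"))         # 4.3.2.1.sc.surbl.org
```

Calling is_listed("spamdomain.com") would then perform the actual query; a high-volume mail server would more likely query a local mirror of the zone than a public name server.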


Raymond Dijkxhoorn has kindly set up an rsync server for the SURBL rbldnsd and BIND zone files. Please fill out our rsync request form for access, particularly if you run a high volume mail server, and see the News section for more information. Please do not get the zone files for production use from this web site.

One current SURBL difference compared with the format of other RBLs is that we combine names and numbers into our single RBL. Given how few numbered URIs percolate to the top of the SpamCop reports, this may be appropriate, assuming programs using RBLs don't mind too much. We could break out the named and numbered references into separate lists, but the numbered list would be really small. Lumping them together into a single list fits the source data better, but may not fit the spirit of (other) RBLs as neatly.

By "fitting the source data," we mean the data in SURBL matches the references actually found in spam, e.g., if http://10.20.30.40/ is reported often enough in spams, it gets into SURBL as 40.30.20.10. This is not at all related to spam mail server names being DNS resolved into numbers so they can be checked against a conventional RBL based on numeric IP addresses. If there are numeric addresses in SURBL, it's because they actually occurred in a URI in reported spam message bodies. Similarly, if the reference http://massivespammer.com/ gets reported often enough in spams, it will get into SURBL as massivespammer.com.... This is part of the rationale for treating numeric and named references identically: it accurately reflects the source data. Our data comes from URIs reported to be in spam bodies, and sc.surbl.org reflects that accurately.
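To make the mapping from a URI host to a list entry concrete, here is a small Python sketch. The function name surbl_key is hypothetical; it only shows the octet reversal for dotted-quad addresses and passes names through unchanged, without resolving anything.

```python
import ipaddress


def surbl_key(host):
    """Map a URI host to the form it would take in the list.

    Dotted-quad IPv4 addresses are reversed octet by octet; anything
    else is treated as a name and only lowercased. No DNS resolution
    of the host takes place.
    """
    try:
        address = ipaddress.IPv4Address(host)
    except ipaddress.AddressValueError:
        return host.lower()
    return ".".join(reversed(str(address).split(".")))


print(surbl_key("10.20.30.40"))         # 40.30.20.10
print(surbl_key("massivespammer.com"))  # massivespammer.com
```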
It's important to note that SURBL should be used with the basic, "registrar" domain, i.e., the domain name that would be registered at a registrar. While the current version does count frequently occurring spam subdomains (e.g., subdomain.spamdomain.com), future versions of SURBL will probably list only the basic domain (e.g., spamdomain.com). Applications using SURBL should therefore strip everything but the basic domain before trying to match. Because of the way the current data is counted, the basic domain will work as well as or better than a name that includes a subdomain, so switching to the base domain now also ensures future compatibility.
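Reducing a host name to its basic domain could be sketched in Python along the following lines. This is a deliberately naive illustration: base_domain and TWO_LEVEL_SUFFIXES are hypothetical names, and the suffix set is a tiny, incomplete sample. A real implementation would need a full list of suffixes under which registrars hand out names.

```python
# Illustrative and incomplete: suffixes where the registered name is the
# third label from the right rather than the second.
TWO_LEVEL_SUFFIXES = {"co.uk", "com.au", "co.jp"}


def base_domain(host):
    """Strip subdomains, keeping only the registrar-level domain."""
    labels = host.lower().strip(".").split(".")
    if len(labels) <= 2:
        return ".".join(labels)
    # Keep three labels when the last two form a known two-level suffix.
    keep = 3 if ".".join(labels[-2:]) in TWO_LEVEL_SUFFIXES else 2
    return ".".join(labels[-keep:])


print(base_domain("subdomain.spamdomain.com"))  # spamdomain.com
print(base_domain("www.spamdomain.co.uk"))      # spamdomain.co.uk
```

Numeric addresses should not be passed through this step, since they are matched in reversed form rather than by base domain.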


usage.html version 1.69 on 7/15/07