Frequently Asked Questions

General

DNS

SpamAssassin

SURBLs compared to other lists

General

How do I request removal of a site from a SURBL list?

To request removal from a SURBL list, please start with the SURBL Lookup page and follow the instructions on the removal form.

How can I keep up with important changes or updates to SURBLs?

To keep up with important updates, please subscribe to the SURBL Announcement list. This is a low-volume, announcement-only list. It is important that you receive announcements such as planned changes to the rsync directory structure, zone data file layout, etc.

What testpoints are in the SURBL data?

Previously, each SURBL zone file, including the bitmask-combined list multi.surbl.org, had testpoints with A records resolving to 127.0.0.2. Since about November 2005, the multi testpoints have every bitmask position that is in use set to one. So testpoints for multi.surbl.org now have an A record value of 127.0.0.126 instead of 127.0.0.2. (If a new list is added in future using the bit with value 128, then the new testpoint values in multi will be 127.0.0.254. Following tradition, the first bit is not used.)

This may work better with applications that decode the bits into individual list results, but it is a change from before and it may break other applications' use of the testpoints.
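As a rough illustration of decoding the bits into individual results, here is a minimal Python sketch. The function name decode_multi_answer is made up for this example, and the only bit-to-list mapping this FAQ itself shows is 64 for JP (in the SpamAssassin rule example), so the sketch returns raw bit values rather than list names.

```python
import ipaddress


def decode_multi_answer(address: str) -> list[int]:
    """Return the bit values set in the last octet of a multi.surbl.org A record."""
    octets = ipaddress.IPv4Address(address).packed
    if octets[:3] != bytes([127, 0, 0]):
        raise ValueError(f"not a 127.0.0.x list answer: {address}")
    last = octets[3]
    # Bit value 1 is skipped: following tradition, the first bit is not used.
    return [1 << n for n in range(1, 8) if last & (1 << n)]


print(decode_multi_answer("127.0.0.126"))  # testpoint: all six bits in use are set
print(decode_multi_answer("127.0.0.64"))   # a single bit set
```

An application that only tests whether the answer equals 127.0.0.2 would miss the new testpoint value, which is the breakage the paragraph above warns about.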

The multi.surbl.org BIND zone file contains:

test.surbl.org	604800	IN	A	127.0.0.126
	604800	IN	TXT	"multi.surbl.org permanent test point"
test.multi.surbl.org	604800	IN	A	127.0.0.126
	604800	IN	TXT	"multi.surbl.org permanent test point"
surbl-org-permanent-test-point.com	604800	IN	A	127.0.0.126
	604800	IN	TXT	"multi.surbl.org permanent test point"
2.0.0.127	604800	IN	A	127.0.0.126
	604800	IN	TXT	"multi.surbl.org permanent test point"

The multi.surbl.org rbldnsd zone file contains:

test.surbl.org	:126:multi.surbl.org permanent test point
test.multi.surbl.org	:126:multi.surbl.org permanent test point
surbl-org-permanent-test-point.com	:126:multi.surbl.org permanent test point
2.0.0.127	:126:multi.surbl.org permanent test point

Those resolve into the following Address records:

Name:     test.surbl.org.multi.surbl.org
Address:  127.0.0.126

Name:     test.multi.surbl.org.multi.surbl.org
Address:  127.0.0.126

Name:     surbl-org-permanent-test-point.com.multi.surbl.org
Address:  127.0.0.126

Name:     2.0.0.127.multi.surbl.org
Address:  127.0.0.126

But note that, of the domain names above, only the two-level domain surbl-org-permanent-test-point.com will work as the base domain for a URI in a test message for SpamAssassin. URIs with hosts such as test.multi.surbl.org won't be detected by most SURBL-using programs, because those programs are supposed to reduce host names down to a two-level domain, which would be surbl.org for those.
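The query names listed above follow a simple pattern: the domain is prepended to the zone name, and an IP address is prepended with its octets reversed (127.0.0.2 becomes 2.0.0.127). A minimal Python sketch of that pattern follows; the function name surbl_query_name is an assumption for illustration, not part of any SURBL client.

```python
import ipaddress


def surbl_query_name(host: str, zone: str = "multi.surbl.org") -> str:
    """Build the DNS name to look up for a base domain or a dotted-quad IP."""
    host = host.strip().rstrip(".").lower()
    try:
        ip = ipaddress.IPv4Address(host)
    except ValueError:
        # Not an IP address: prepend the domain name as-is.
        return f"{host}.{zone}"
    # IP addresses are queried with their octets reversed, as in 2.0.0.127.
    reversed_octets = ".".join(reversed(str(ip).split(".")))
    return f"{reversed_octets}.{zone}"


print(surbl_query_name("surbl-org-permanent-test-point.com"))
print(surbl_query_name("127.0.0.2"))
```

The sketch expects a host that has already been reduced to its base domain; it does no reduction of its own.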

What URIs should a SURBL test message have?

SURBL test URLs are:

http://surbl-org-permanent-test-point-MUNGED.com/

or:

http://127.0.0.2-MUNGED/

without the "-MUNGED"s. So if you send yourself a message with any of those unmunged testpoints as URIs, the messages should match any SURBLs you have installed.

How can I send or receive messages mentioning blacklisted sites?

If blacklisted URIs must be mentioned in a message body, then one answer is to munge the URI until it's no longer parsable as a URI. E.g.,

http://somedomain.com/

can be rewritten as:

http://somedomain-MUNGED.com/

That would require some awareness on the part of the person forwarding or discussing a listed site, but it's just as doable and humanly readable as munged email addresses, which people do all the time.

Another commonly used technique is to change the "http" to something that doesn't work such as "hxxp".
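For anyone who munges URIs often, both techniques can be combined in a small helper. This is only a sketch under simple assumptions: the function name munge_uri is invented here, it handles plain http and https URIs only, and it leaves anything else untouched.

```python
import re


def munge_uri(uri: str) -> str:
    """Defang a URI: http becomes hxxp, and -MUNGED is added to the host."""
    match = re.match(r"^(https?)://([^/:]+)(.*)$", uri, flags=re.IGNORECASE)
    if not match:
        return uri  # leave anything we do not recognise untouched
    scheme, host, rest = match.groups()
    scheme = scheme.lower().replace("http", "hxxp")
    if re.fullmatch(r"[0-9.]+", host):
        # Numeric host: append the marker to the whole address.
        host = host + "-MUNGED"
    else:
        # Domain name: insert the marker before the final label.
        name, dot, tld = host.rpartition(".")
        host = f"{name}-MUNGED{dot}{tld}" if dot else f"{host}-MUNGED"
    return f"{scheme}://{host}{rest}"


print(munge_uri("http://somedomain.com/"))
```

Either change alone is normally enough to stop a URI from being parsed; applying both just makes the munging more obvious to a human reader.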

It's good practice to use little or no filtering on your security mailing list messages and abuse contact addresses, or to have them bypass filtering altogether.

How can I report sites for possible inclusion or removal from SURBLs?

Different SURBL lists use different data sources. SC and AB use SpamCop reports as their main input. JP and OB use private traps as input. WS is mostly a manual, hand-built list. For more information about reporting please see the Lists page.

How are redirection sites handled?

Redirection sites like drs.yahoo.com, tinyurl.com, etc. take external URIs (unfortunately including those mentioned in unsolicited messages) and redirect a browser to them. Therefore redirectors can be abused if a simple web site check only looks at the initial web site, making the whole web site appear legitimate. For example the Yahoo redirection below might be (incorrectly) parsed as a legitimate yahoo domain:

http://drs.yahoo.com/covey/parr/*http://other.address/

SpamCop itself seems to disambiguate (most of) the redirection. If someone is using a redirector to send traffic to somedomain.com, SpamCop seems to detect and resolve it correctly to somedomain.com most of the time. So the data that's used as input to sc.surbl.org already has redirectors correctly handled to some extent. In other words, we're protected on the data input side by the processing that happens at SpamCop to take out the redirection in reported URIs.

SpamAssassin programs such as SpamCopURI and urirhsbl that use SURBLs are capable of handling redirections to differing degrees. SpamCopURI 0.14 uses LWP to get Location information to untangle up to four levels of redirection sites without actually visiting the sites. URIDNSBL's urirhsbl includes patterns to extract the final domains from some redirection URIs. Further development will probably improve the handling of redirection sites.
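To illustrate the pattern-based approach (not the actual patterns used by URIDNSBL or SpamCopURI, which are not reproduced here), the following Python sketch pulls out the last URL embedded in a redirector URI like the Yahoo example above. The function name final_target_host is hypothetical, and the sketch ignores percent-encoded targets and redirectors such as tinyurl.com that hide the target behind an opaque key.

```python
from urllib.parse import urlsplit


def final_target_host(uri: str) -> str:
    """Return the host of the last embedded http(s) URL, or of the URI itself."""
    lowered = uri.lower()
    # Find where the last "http://" or "https://" starts.
    start = max(lowered.rfind("http://"), lowered.rfind("https://"))
    # Position 0 means there is no embedded URL, only the outer one.
    target = uri[start:] if start > 0 else uri
    return urlsplit(target).hostname or ""


print(final_target_host("http://drs.yahoo.com/covey/parr/*http://other.address/"))
```

A checker would then look up other.address rather than the redirector's own domain.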

The big picture solution is for the redirection sites to block abusive sites on their own. In other words, they should not let abusers redirect through their sites. Some redirection sites, such as tinyurl.com, reportedly actively block and report abusers of their site. Others such as Metamark and SnipURL are using SURBLs to deny abusers access to their redirection services. Here is an Open Letter to Redirection Sites that may be used or modified to contact them.

How are randomized subdomains or host names handled?

The randomized subdomain problem is solved by extracting the base domain on both the SURBL data and message-checking client sides then comparing those base domains. In this way any random stuff added to the base domain is ignored. (The base domain is what would be registered with a name registrar.)

We've seen quite a few randomized or customized (to a username for example) host names in some of the top pill sites. There are different possible reasons for the randomization: to add chaos to the names to throw off message body checkers, or perhaps to "key" pill site web visits to specific mailings in order to build a confirmed mailing list. (Such confirmed mailing lists themselves are probably a valuable commodity to sell to other senders.) Randomization doesn't throw us off though; we catch them from the base domain part, which can't change.
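A minimal Python sketch of base domain extraction is shown below. The names TWO_LEVEL_TLDS and base_domain are assumptions for this example, and the two-entry set is deliberately tiny: a real client needs a complete list of two-level registries, and should handle numeric IP hosts separately before calling anything like this.

```python
# Hypothetical, deliberately incomplete list of two-level registries.
TWO_LEVEL_TLDS = {"co.uk", "com.br"}


def base_domain(host: str) -> str:
    """Reduce a host name to the part that would be registered with a registrar."""
    labels = host.strip().rstrip(".").lower().split(".")
    # Under a two-level registry such as co.uk, keep three labels.
    if len(labels) >= 3 and ".".join(labels[-2:]) in TWO_LEVEL_TLDS:
        return ".".join(labels[-3:])
    # Otherwise keep the last two labels.
    return ".".join(labels[-2:])


print(base_domain("x7qk2-user42.hugepills.com"))
print(base_domain("random.somedomain.co.uk"))
```

Whatever random or keyed labels are added in front, the result is the same base domain, which is what gets compared against the SURBL data.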

How does SURBL prevent Joe Jobs and other false positives?

The averaging effect of a large SpamCop reporting base seems to be very strong, and very few false positives (FPs) seem to get into sc.surbl.org. The fact that the manual SpamCop reports can be and probably are mostly hand-tuned by every SC user seems to help prevent false positives. I.e., most SC users probably make an effort to uncheck legitimate domains to prevent false reporting.

Certainly the existing SURBL whitelist could be used to prevent Joe Jobs (false reporting or detection of legitimate domains). We've already added some of the common domains like yahoo, hotmail, ebay and amazon, etc. These seldom appear above the threshold yet, however, so the law of averages and careful reporting seem to be on our side so far.

(Note that the above comments apply to the handling of SpamCop URI data that goes into sc.surbl.org. However the global whitelist applies to all SURBLs, including sc. Once a domain or IP address is whitelisted, it's excluded from all SURBLs.)

Update: Our whitelist, which we use only to exclude domains from SURBLs, not to "allow" messages, is growing but doesn't hit data from the various SURBLs too often. The goal is to keep whitehats off the lists in the first place by being careful with the input data. The whitelists are intended to be a safety backstop to make sure domains with legitimate uses don't get added.

com.br, co.uk, etc., are in the SURBL whitelist. Does that mean subdomains under those won't get listed?

No. We list country code TLDs (ccTLDs) like co.uk in our whitelist as a kludge to prevent those specific types of country code two-level domains from ever getting listed. In other words, since they are on whitelists, "co.uk" and "com.br" by themselves will never appear on our lists. But somedomain.co.uk and anotherdomain.com.br will still get added to the lists just fine.

If an unsolicited message contains a site on SURBL's whitelist, does that mean it won't be detected?

No. Our whitelist is used internally to prevent certain domains and IP addresses from getting onto the lists. In that sense our whitelists are more like internal "ignore" or exclusion lists. The whitelist entries have no direct effect on the checking of messages. If a message does include some whitelisted domains, that essentially has no effect on detection using SURBLs.

How can I locally whitelist some domains or IPs to prevent SURBL checking of them?

If you are using SpamAssassin or SpamCopURI, they have built-in support for local whitelists which will prevent those message body URI hosts from being checked:

URIDNSBL rules: (Please see the actual rules for the standard list of domains to exclude from checking.)

# Top 125 domains whitelisted by SURBL
uridnsbl_skip_domain yahoo.com w3.org msn.com com.com yimg.com
uridnsbl_skip_domain hotmail.com doubleclick.net flowgo.com ebaystatic.com aol.com
[...]

SpamCopURI also has a built-in whitelist function:

whitelist_spamcop_uri   *.yahoo.com

Other SURBL applications may have similar exclusion features. If not, their authors may want to consider adding local whitelisting.

Notes:

  1. These local whitelists prevent those specific domains from being checked. They do not provide a negative score or bypass the message around other testing, including testing of any other URIs that happen to be in the message. So if a message has a URI for yahoo.com and hugepills.com, hugepills.com will still get checked, even though yahoo.com won't be checked.

     

  2. These local whitelist entries should only apply to domains and IPs (hosts) that appear in message body URIs. They should not apply to mail server names, sender domains, message headers, etc. Therefore there's no need to add your mail server names, etc. to these whitelists. Remember that SURBLs are meant to operate on message body URIs.

     

  3. If you know of a legitimate site that should be globally whitelisted but isn't currently, you're welcome to report it to whitelist at surbl dot org for consideration. In other words some of your local whitelist entries may be appropriate for global whitelisting. If you know them to be legitimate for certain, then other SURBL users may benefit from the knowledge you choose to share with the community.

     

  4. To request removal from a SURBL list, please start with the SURBL Lookup page and follow the instructions on the removal form.
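For authors adding local whitelisting to other SURBL applications, the behaviour described in note 1 can be sketched in Python as follows. The function name hosts_to_check is made up for this example; it is not how SpamAssassin or SpamCopURI implement their skip lists.

```python
def hosts_to_check(uri_hosts, skip_domains):
    """Drop hosts under a locally skipped domain; keep the rest for lookup."""
    skip = {d.lower().rstrip(".") for d in skip_domains}
    remaining = []
    for host in uri_hosts:
        h = host.lower().rstrip(".")
        # A host is skipped if it equals a skipped domain or is a subdomain of one.
        skipped = any(h == d or h.endswith("." + d) for d in skip)
        if not skipped:
            remaining.append(h)
    return remaining


print(hosts_to_check(["www.yahoo.com", "hugepills.com"], ["yahoo.com"]))
```

Only the lookup for the skipped host is avoided. Nothing here changes the message score or exempts the message from other tests.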

Are there plans to offer a list with the SURBL domain names resolved into IP addresses?

The quick answer is probably not. One of the benefits of SURBLs is the lack of need to do forward name resolution, where in large volume mail systems the associated lookups, timeouts, etc. can incur too large a delay to be practical.

There are also some security or privacy concerns about resolving a keyed domain name, since that could give out information about the success of an unsolicited message, for example if the recipient is keyed in the full domain name as in:

http://resolving-this-confirms-specific-recipient.somedomain.com/

The concern here is that the act of performing the resolution itself could be used as a confirmation of a delivery attempt given a URI customized (keyed) to a specific recipient. Such a confirmation could be used to build additional recipient lists, even if it just helps narrow down messages (and therefore recipients) which made it through the gauntlet of other filtering methods. In other words name resolution can potentially provide useful information to the senders.

Name resolution also adds a significant performance penalty, especially on high volume mail servers. For domains that don't resolve, a timeout of tens of seconds can result. These kinds of delays can make resolution of URI message body domains impractical for busy mail servers.

That said, if an IP address does appear in an unsolicited message's web site, then it can appear in SURBLs as an IP address. The principle is to accurately record what's in message body web sites. If there's a domain name in the message then that name can get onto a SURBL. If there's a number it too can get onto the SURBL.

Creating a list of the resolved addresses is something we considered. Doing so would be too similar to existing number-based approaches such as using the sbl.spamhaus.org list with the SpamAssassin command uridnsbl, whose domain-based twin is the SURBL-using urirhsbl. Another way to use number-based lists to check message body domains resolved into numbers is the sendmail milter which the SpamHaus site mentions: http://www.five-ten-sg.com/dnsbl.html .

However the current version of the sc.surbl.org data engine is a hybrid name and number approach, where if a domain resolves into an IP address commonly used with advertised sites, then that domain will get added to sc.surbl.org probably with the first report. (Note that this still requires at least one report, but the threshold for inclusion will be radically lower for major operators who repeatedly use the same IP address for their hosting.) This hybrid approach moves sc.surbl.org much closer towards the behavior of a number-based approach, though domains will still need that initial report, whereas a numbered list would catch the entire server by its IP address.

Of course a downside of using numbers is that they can false positive any legitimate domains that happen to be hosted on the same IP address as a blacklisted site. That could be disastrous for a large web hosting company that had one bad apple. That's another major reason why we went with names and not numbers. Numbers can be overly broad, whereas names are highly specific to the advertised site. To us names are a finer tool: if 30% of the domains on a given IP address are used by senders of unsolicited messages, then we could list all of them and not affect the other 70% of domains that unfortunately happen to share the same IP address. That specificity is a strong benefit of using domain names.

Is there still code to write for the SURBL movement?

Eric Kolve has made his SpamCopURI plugin for SpamAssassin 2.63 and 2.64 work with the SURBL lists.

The SpamAssassin 3 plugin URIDNSBL now has support for SURBL using its urirhsbl and urirhssub commands (see the SpamAssassin section of the FAQ), so that release of SA is covered.

Devin Carraway has written a plugin for the Perl-based MTA qpsmtpd to compare domains from message body URIs to SURBL domain lists. Here's his announcement on perl.qpsmtpd, and a link to his uribl plugin. Devin's was the first MTA implementation.

Exim, sendmail, postfix, qmail, qmail-ldap and Exchange programs and plugins have been written to support SURBLs in those MTAs. (Please see the Links and News pages for more information.) Support of SURBL directly in other MTAs would also be useful. MTA support can reject messages directly back to the sender. MTA support has the added advantage of pre-empting other, more expensive processing of messages, for example in SpamAssassin. There is still code to write for any mail filter or MTA that doesn't support SURBLs yet.

DNS

I run a high volume mail server. How should I use SURBLs?

For systems processing more than 250,000 messages per day, please set up a local caching name server for any of the lists you are using, including SURBLs. This is considered a standard, good practice since it offloads the public name servers and improves your performance with local lookups.

A very popular and fast name server specifically meant for serving up lists is rbldnsd. SURBL zone files are available in rbldnsd format. (They are also available in BIND format, though rbldnsd is recommended as significantly leaner and faster.) There are links and instructions for using rbldnsd with rsync in the Links section.

Then arrange with the lists to get rsync access to their zone files. Since rsync only transmits differences, the zone files are kept updated in a very efficient manner. To get rsync access to SURBL zone files, please fill out our data feed access form. Other lists may have similar procedures for gaining rsync access.

Then configure your mail or SpamAssassin servers using lists to do the lookups on your local list name server. Many people run the local DNS for their lists on their mail server(s), which tends to work well since it keeps everything on the same box. If your mail server is separate from your list name server, then set up DNS on the mail server to resolve using that list name server. The following documents may be helpful in setting up local caching:

I'm using an anti-spam or anti-phishing DNS proxy, and I'm seeing legitimate sites marked as unsolicited.

There are some DNS proxy or modification services that change the responses from certain DNS queries in order to prevent users from visiting sites advertised in phishing, unsolicited messages, etc. This can cause errors when using SURBLs if the proxies return an IP address of an alternative (safe) web site. The modified IP address can have an incorrect effect on SURBL list identification depending on where the bit patterns happen to be in the modified response. The result is that legitimate sites may be misidentified, but the effect appears to be somewhat random or arbitrary.

A solution is to disable such site correction or modification features on servers or clients doing SURBL queries. Alternatively, consider using regular (non-modifying) nameservers for those systems. Often the best solution is to set up a local caching nameserver.

Note also that SURBL applications may be incompatible with DNS modification or proxy services that change the DNS query results of non-matches (NXDOMAIN results) for non-existent sites.
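One defensive measure a client can take is to accept only answers that look like list results and to ignore everything else. The Python sketch below rests on an assumption drawn from the testpoint values in this FAQ, namely that genuine answers have the form 127.0.0.x. The function name looks_like_list_answer is hypothetical, and 192.0.2.10 is a documentation address standing in for a proxy's substitute web server.

```python
import ipaddress


def looks_like_list_answer(address: str) -> bool:
    """True if an A record value has the 127.0.0.x form of a list answer."""
    try:
        octets = ipaddress.IPv4Address(address).packed
    except ValueError:
        return False
    # Assumption: real answers are 127.0.0.x with a non-zero last octet.
    return octets[:3] == bytes([127, 0, 0]) and octets[3] != 0


print(looks_like_list_answer("127.0.0.126"))
print(looks_like_list_answer("192.0.2.10"))
```

This does not fix a modifying resolver. It only keeps a substituted web server address from being decoded as if it were a bitmask.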

Note that as of 1/25/07, OpenDNS no longer modifies results for SURBL lists. It should now be safe to use OpenDNS with SURBL applications. If you find you are behind a firewall or proxy that is modifying SURBL DNS queries incorrectly, one solution is to set up a local caching nameserver. A local caching server can significantly improve performance also.

 

I'm using my provider's nameservers, and I'm seeing legitimate sites marked as unsolicited.

Some ISPs such as Verizon and Charter are reportedly modifying some DNS NXDOMAIN responses in a way that causes what look like false positives on domains that are not blacklisted. Unfortunately this breaks DNS responses for SURBLs and other blacklists. Please check with your ISP if you are seeing DNS responses modified in this way. Verizon has an opt-out procedure with instructions on switching to DNS servers that do not change NXDOMAIN responses. Others such as Charter have opt-out nameservers that reportedly do not support NXDOMAIN, in which case none of their nameservers may be compatible with DNS blacklists. One solution is to not use your provider's nameservers, for example by setting up your own local caching nameserver instead. Most operating systems have built-in support for running your own nameservers, and a local nameserver can significantly improve performance.

Is there a performance advantage from using name to name comparisons instead of name to IP address checks?

It's definitely quicker to do DNS lookups of the single, cached SURBL query than DNS lookups for Address and/or Name Server records for every web site in every incoming message. By comparing names to names in incoming URIs and SURBLs, we avoid a major performance bottleneck of URI checks that try to resolve wild domain names, whose name servers you have no control over, into IP addresses or NS records. In other words using a SURBL is much quicker.

SpamAssassin

How do I use SURBLs with SpamAssassin?

SpamAssassin 3 and later versions support SURBLs by default. See the next section for more information about enabling network tests, which include SURBL checks. If you're using a version of SpamAssassin older than 3.0, you should upgrade to the latest version to get security patches and better performance.

For SpamAssassin 2.63 and 2.64, SpamCopURI is a program that can be used with SURBLs.

Important Note: Matt Kettler says: DO NOT run SA 2.63 on a production server. Upgrade to 2.64 or 3.x because 2.63 has a MIME parsing bug that can be used to DoS your server.

For SpamAssassin 3.X, there is a suite of programs contained in the plugin URIDNSBL including some that can be used with SURBLs and some that can't:

  • urirhsbl is a program that checks message body URIs against individual SURBLs, i.e., it compares (mostly) domain names found in message bodies against (mostly) domain names in one SURBL per rule.

     

  • urirhssub is a program that checks message body URIs against the combined SURBL multi.surbl.org, i.e., it compares (mostly) domain names found in message bodies against (mostly) domain names in multiple SURBLs.

     

  • uridnsbl is the SpamAssassin 3.X program that checks message body URIs against numeric lists, by resolving the URI domain names' NS records, then checking those name server IP addresses against the list. It cannot be used with SURBLs, but can be used with lists of blacklisted name server IPs such as sbl.spamhaus.org.

 

For more information, please see SpamAssassin documentation and community.

My SpamAssassin now has SURBL support, but I'm not seeing SURBL rules hit. (Enabling network tests)

If you recently added SURBL support to SpamAssassin but are not seeing any SURBL (or other list hits), check that you have enabled network tests.

For example, if you're using spamd, make sure it's started without the -L or --local flags, which force local tests only.

If you are running Amavis, make sure amavisd.conf has $sa_local_tests_only = 0. (Uncomment this line if it was commented out before, then set the value to zero to enable network tests.)

If you are using MIMEDefang, make sure you set $SALocalTestsOnly to zero:

# If boolean true, skip SA network tests
$SALocalTestsOnly = 0;

to enable SpamAssassin network tests from your mimedefang-filter.

Also make sure that you have a recent Net::DNS installed. An outdated version of Net::DNS seems to be a common reason for RBL checks not working, especially when upgrading from an older version of SpamAssassin.

After upgrading from SpamAssassin 3.0.0 to 3.0.1 or later, non-default URIDNSBL checks including SURBL no longer work

When upgrading from SpamAssassin 3.0.0 to 3.0.1 or later, please change rules using urirhsbl, urirhssub or uridnsbl, from rule type header to rule type body. For example:

urirhssub URIBL_JP_SURBL multi.surbl.org.   A   64
body      URIBL_JP_SURBL eval:check_uridnsbl('URIBL_JP_SURBL')
describe  URIBL_JP_SURBL Contains a URL listed in the JP SURBL list
tflags    URIBL_JP_SURBL net
score     URIBL_JP_SURBL 3.0

Where body above was previously header. Here is the changelog reference:

r54022 | felicity | 2004-10-07 22:21:30 +0000 (Thu, 07 Oct 2004) | 1 line

bug 3734: uridnsbl rules work on body data, not header data, so change
the rule type from header to body
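If you have many local rules to convert, the change can be scripted. The Python sketch below is only an illustration: the function name header_to_body is invented here, it rewrites only lines of the form "header NAME eval:check_uridnsbl(...)", and you should review the result before installing it.

```python
import re

# Matches "header <NAME> eval:check_uridnsbl(" and captures the whitespace around it.
RULE = re.compile(r"^(\s*)header(\s+\S+\s+eval:check_uridnsbl\()")


def header_to_body(config_text: str) -> str:
    """Rewrite check_uridnsbl eval rules from rule type header to rule type body."""
    lines = config_text.split("\n")
    return "\n".join(RULE.sub(r"\1body\2", line) for line in lines)


old = "header    URIBL_JP_SURBL eval:check_uridnsbl('URIBL_JP_SURBL')"
print(header_to_body(old))
```

Other header rules, such as ordinary header pattern tests, do not match the expression and are left unchanged.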

SURBLs compared to other lists

Can SURBLs be used like other lists?

Generally speaking, SURBLs cannot be used like other lists and should not be used in the places where other lists are used. Most programs that use conventional RBLs for processing mail header information such as sender IP address or sender domain will not work well with the SURBL data.

SURBLs need to be used with programs that can extract domains from message body URIs, which is different from traditional list usage.
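As a rough sketch of that extraction step, the Python below collects the hosts of http and https URIs found in a message body. The names URI_PATTERN and body_uri_hosts are assumptions for this example, and the regular expression is far simpler than what real mail filters use (it ignores obfuscated, schemeless and HTML-encoded URIs).

```python
import re
from urllib.parse import urlsplit

# Simplified pattern: a scheme followed by anything up to whitespace or a quote.
URI_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def body_uri_hosts(body: str) -> list[str]:
    """Return the distinct hosts of http(s) URIs in a message body, in order."""
    hosts = []
    for uri in URI_PATTERN.findall(body):
        try:
            host = urlsplit(uri).hostname
        except ValueError:
            continue  # malformed URI, skip it
        if host and host not in hosts:
            hosts.append(host)
    return hosts


body = "Buy now at http://www.hugepills.com/offer or http://somedomain.co.uk/x?y=1"
print(body_uri_hosts(body))
```

The input is the message body only. Header fields such as the sender address or relay names are never examined, which is the difference from traditional list usage.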

How are SURBLs different from other lists?

SURBLs list web sites found in unsolicited message bodies. Those domains can be used to detect future unsolicited messages advertising the same sites.

In contrast, most other lists have the IP addresses or domain names of unsolicited message senders, open relays, open proxies, etc. Programs that use these traditional lists generally check those list entries against connections or message headers. They generally do not check message bodies, i.e., the content of messages. Most prior efforts to look at message body URIs have converted the sites they find into IP addresses for comparison with a numeric list.

We feel that SURBLs directly address the web sites seen in unsolicited messages. We also feel that there is a significant performance advantage in comparing SURBL sites to message body sites, since there is no delay needed to resolve domain names into numbers.

How are SURBLs different from SpamHaus' SBL?

SpamHaus' SBL contains IP addresses of blacklisted web servers, name servers, and mail senders. SBL can be used to check message senders and message bodies, but using SBL to check message bodies requires resolving the domain names found there into IP addresses.

However the act of resolving the domain can confirm for senders of unsolicited messages that your specific address was reachable, that you've opened their message, etc. And the name resolution can significantly increase the time it takes to process each message. (The delay is on the order of a few seconds per message; not a big deal to an end user, but a major bottleneck to the servers which typically need to handle thousands of messages quickly.)

In contrast, SURBLs contain mostly domain names that have appeared in message body URIs. Typically those are the web sites that are being advertised. Using SURBLs doesn't require resolving the domains that appear in messages. That's safer, more private and much faster.

How are SURBLs different from SpamCop's Blocking List?

SpamCop's Blocking List (SCBL) contains IP addresses of senders that have been reported to SpamCop. It does not contain information about web sites or the content of message bodies. The SpamCop BL is often used in mail servers to detect those senders when they try to connect to the server to deliver their messages.

While some SURBL lists such as SC and AB use data from SpamCop, it's the Spamvertised sites that they use and not the sender IP addresses. Those are the web sites advertised in the message bodies which have been reported to SpamCop.

The difference is between detecting senders based on message headers, as the SpamCop Block List is commonly used for, versus detecting based on URIs advertised in message bodies, which is what SURBLs are used for. This is useful because senders of unsolicited messages frequently shift the IP addresses they send from, but they tend to advertise the same sites repeatedly.


faq.html version 3.06 on 4/3/10

SURBL Data Feed Request

SURBL Data Feeds offer higher performance for professional users through faster updates and resulting fresher data. Freshness matters since the threat behavior is often highly dynamic, so Data Feed users can expect higher detection rates and lower false negatives.

Data feeds are available in three formats: rsync, DNS and RPZ.

Rsync and DNS are typically used for mail filtering and RPZ for web filtering. High-volume systems and non-filter uses such as security research should use rsync.

For more information, please contact your SURBL reseller or see the references in Links.
