Here are some brief guidelines for applications using SURBLs.
- Extract URIs from message bodies. (Extraction of URIs from message bodies should ideally include full resolution of redirections into the final target domain name. This can be a non-trivial problem.)
- Extract base (registered) domains from those URIs. This includes removing all leading host names, subdomains, www., randomized subdomains, etc. To determine the level of domain to check, use our tables of two-level and three-level TLDs. Originally these were ccTLDs, but they now also include some frequently abused hosting domains. (Note that these files are updated only rarely. Please don't retrieve them more often than once per day.) The usage is:
  - For any domain on the three-level list, check it at the fourth level.
  - For any domain on the two-level list, check it at the third level.
  - For any other domain, check it at the second level.
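The three rules above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the function name `registered_domain` and the table contents are assumptions made for the example, and a real application would load the two-level and three-level tables from the published files instead.

```python
# Placeholder tables for illustration only; real entries come from the
# published two-level and three-level TLD files.
TWO_LEVEL_TLDS = {"co.uk", "com.au"}
THREE_LEVEL_TLDS = {"users.hosting.example"}  # made-up entry


def registered_domain(hostname, two_level=TWO_LEVEL_TLDS, three_level=THREE_LEVEL_TLDS):
    """Return the part of hostname that should be checked against the SURBL."""
    labels = hostname.lower().rstrip(".").split(".")
    # Try the longest suffix first: a match on the three-level list means
    # checking at the fourth level, a match on the two-level list means
    # checking at the third level.
    for suffix_len, table in ((3, three_level), (2, two_level)):
        if len(labels) > suffix_len and ".".join(labels[-suffix_len:]) in table:
            return ".".join(labels[-(suffix_len + 1):])
    # Any other domain is checked at the second level.
    return ".".join(labels[-2:])


print(registered_domain("www.random123.domainundertest.com"))
print(registered_domain("a.b.shop.co.uk"))
```

No name resolution is involved at this stage; the function only manipulates the host name string.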
- Perform NO name resolution on the extracted domains.
- Look up the domain name in the SURBL by prepending it to the name of the SURBL, e.g., domainundertest.com.multi.surbl.org, then doing Address (A) record DNS resolution on the resulting combined name. A non-result (NXDOMAIN) indicates lack of inclusion in the list. An Address result indicates list inclusion. (Individual lists return an A record of 127.0.0.2, but their use is deprecated in favor of multi.surbl.org as described in the note below.) SURBL matches also have a TXT record associated with them containing a descriptive reason for list inclusion, but the A record is the strongly preferred response for automated use.
Using the combined list multi.surbl.org, results will be bitmasked as described in the Lists section. In this case, membership in multiple lists is encoded according to respective bit positions in the returned Address value, and programs should decode these results into their respective individual lists.
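As a rough sketch of the lookup and decoding steps just described, the code below builds the query name, performs the A record lookup, and splits the last octet of the answer into its set bits. The helper names (`query_name`, `lookup`, `set_bits`) are assumptions made for this example. The mapping from bit values to individual lists is deliberately left out, since it is defined in the Lists section and not here.

```python
import socket

SURBL_ZONE = "multi.surbl.org"


def query_name(domain, zone=SURBL_ZONE):
    """Prepend the domain under test to the SURBL zone name."""
    return f"{domain.rstrip('.')}.{zone}"


def lookup(domain, resolver=socket.gethostbyname):
    """Return the A record as a string, or None if the lookup fails.

    Assumption: any resolution failure is treated as "not listed". A
    production client would want to tell NXDOMAIN apart from timeouts
    and other temporary errors.
    """
    try:
        return resolver(query_name(domain))
    except socket.gaierror:
        return None


def set_bits(address):
    """Return the bit values set in the last octet of an A record."""
    last_octet = int(address.split(".")[3])
    return [1 << i for i in range(8) if last_octet & (1 << i)]


print(query_name("domainundertest.com"))
print(set_bits("127.0.0.6"))
```

The `resolver` parameter exists only so that the lookup can be exercised without network access; each bit value returned by `set_bits` would then be translated to a list name using the table in the Lists section.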
- Handle numeric IPs in URIs similarly, but reverse the octet ordering before comparison against the DNSBL. This is standard practice for DNSBLs. For example, http://10.20.30.40/ is checked as 40.30.20.10.multi.surbl.org. Numeric addresses should be in base-10 representation. If other representations of numeric IP addresses appear in messages, they should be converted into four reversed-order, dotted, base-10 octets before checking.
- Include a local whitelist function to exclude certain known whitehat domains or IPs from SURBL checking. This is very important since it prevents many unnecessary queries against common domains like yahoo.com, w3.org, google.com, etc., which will never be blacklisted. It's described further in the FAQ.
- An increasing number of DNS services are being deployed to protect end users against typo, malware, phishing and unsolicited message web sites by redirecting web service to an alternate site. Generally they do this by changing the IP address or NXDOMAIN responses to DNS queries. Unfortunately these changes can adversely affect SURBL DNS responses and create false positives or false negatives. One way to test for this is to make sure that responses to SURBL DNS queries are in 127/8, in other words that 127 is its first octet. While this won't fully determine correct results, it's still a recommended and good basic input verification test. A good administrative solution is to run a local caching nameserver for DNS resolution, which can also improve performance. More information is in our FAQ.
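The last three items (numeric IPs, local whitelisting, and the 127/8 sanity check) might be sketched as below. The function names are assumptions for this example, the whitelist holds only the three domains named above, and the IP conversion covers just dotted decimal and single-integer forms (decimal or 0x-prefixed hex); other representations would need additional handling.

```python
import ipaddress

# Only the examples named in the text; a real whitelist would be larger.
LOCAL_WHITELIST = {"yahoo.com", "w3.org", "google.com"}
LOOPBACK_NET = ipaddress.IPv4Network("127.0.0.0/8")


def reversed_ip(host):
    """Convert a numeric IPv4 host into reversed, dotted, base-10 octets."""
    if "." in host:
        addr = ipaddress.IPv4Address(host)
    else:
        # Single-integer form, e.g. "169090600" or "0x0A141E28".
        addr = ipaddress.IPv4Address(int(host, 0))
    return ".".join(reversed(str(addr).split(".")))


def is_whitelisted(domain, whitelist=LOCAL_WHITELIST):
    """Skip the DNS query entirely for known whitehat domains."""
    return domain.lower() in whitelist


def is_plausible_answer(address):
    """Basic input check: a SURBL answer should fall inside 127/8."""
    try:
        return ipaddress.IPv4Address(address) in LOOPBACK_NET
    except ValueError:
        return False


print(reversed_ip("10.20.30.40"))
print(is_plausible_answer("192.0.2.1"))
```

An answer that fails `is_plausible_answer` most likely comes from a resolver that rewrites responses, so it is safer to discard it than to treat it as a listing.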
Somewhat unusually, SURBL lists contain both names and numbers in the same list. For example, 22.214.171.124 and test.surbl.org both appear in SURBL lists, as do actual unsolicited message URI domains and IP addresses. IP addresses appearing in SURBLs are not the result of applying name resolution to domain names. Numbered addresses in SURBLs have actually occurred in unsolicited message URIs as numbers, for example literally as http://10.20.30.40/. Additional SURBL test points are mentioned in the FAQ. For more information about list data please also see the Lists section.
Please do not use SURBLs to check sender IP addresses. Also, do not resolve domains on SURBL lists and build blacklists from the resulting IPs. Either practice may result in unexpected false positives, for example for services hosted on shared-IP web or mail servers.
implementation.html version 1.81 12/9/11
SURBL Data Feed Request
SURBL Data Feeds offer higher performance for professional users through faster updates and resulting fresher data. Freshness matters since the threat behavior is often highly dynamic, so Data Feed users can expect higher detection rates and lower false negatives.
The main data set is available in different formats.
Rsync and DNS are typically used for mail filtering and RPZ for web filtering. High-volume systems and non-filter uses such as security research should use rsync.
For more information, please contact your SURBL reseller or see the references in Links.