Setting up a SURBL name server

Here are some instructions for setting up a SURBL name server. The information here applies to both private and public DNS mirrors with some minor differences between them. The differences are discussed in the text.

Background

Like other DNSBLs, SURBL data are provided as DNS zone files. They are most commonly served as DNS zones, both for private and public use. Applications typically access the data using DNS queries.
Private nameservers are for local, internal use by an organization.

SURBL public nameservers are generously run by a network of volunteers at ISPs, universities, Internet security companies, and other organizations for the benefit of the Internet community. Servers are generally located in large, well-connected datacenters. SURBL data are made available via free, public DNS queries for small- to medium-sized organizations of fewer than 1,000 users.

Larger organizations or commercial applications should sign up for data feed access and mirror the zone file locally as described here.

Requirements

  1. Network sizing: Current public name server traffic is about 300 kilobits per second per server, but it could increase to as much as a megabit per second. Public name servers should therefore be hosted on networks that can comfortably accommodate up to a megabit per second of traffic. For private nameservers this is generally not an issue, since the traffic is carried on an internal LAN, intranet, VPN, etc.

    The multi.surbl.org zone file is about 100 megabytes in size and may take up to an hour to rsync the first time. Later rsyncs complete within a minute and use a few tens to hundreds of kbytes to transfer file differences.
  2. Software: Use the widely-used, open-source programs rbldnsd and rsync on a fast, reliable operating system such as FreeBSD or Linux to get and serve the zone files. There is a Windows port of both programs called wrbldnsd, but running on UNIX-like operating systems has a number of advantages. If using wrbldnsd, it is not necessary to set up an ssh keypair.
  3. Server: Since rbldnsd is so efficient, the hardware requirements are minimal. Any 1 GHz Pentium 4 or later processor and 512 MB of memory are probably enough if the server is dedicated to rbldnsd and runs Linux or BSD. rbldnsd serves everything from memory after updating, using about 200 MB of memory while a new SURBL zone file is loading and about 100 MB otherwise. If serving several lists in addition to SURBLs, a dedicated server with more memory may be a good idea. Please choose reliable servers: mail filtering is only responsive and correct if the name service is reliable.
  4. Access control: For public nameservers only, include the access control list in the rbldnsd configuration as described below. An ACL is necessary to be able to prevent abuse of the servers.
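
To put the traffic figures in item 1 in terms of a hosting quota, the sustained bit rate can be converted to monthly transfer. This is only a rough sketch: it assumes constant traffic, a 30-day month, and decimal gigabytes, and the function name monthly_gb is illustrative rather than part of any SURBL tooling.

```bash
#!/bin/bash
# Convert a sustained rate in bits per second to whole gigabytes per 30-day month.
# Assumptions: constant load, 30-day month, 1 GB = 10^9 bytes.
monthly_gb() {
        local bits_per_second="$1"
        # bytes per second * seconds per day * 30 days, then bytes -> GB (integer division)
        echo $(( bits_per_second / 8 * 86400 * 30 / 1000000000 ))
}

monthly_gb 300000     # typical load quoted above: prints 97
monthly_gb 1000000    # suggested headroom:        prints 324
```

So a public server should be planned for roughly 100 GB per month today and a few hundred GB per month at the one-megabit ceiling.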

Deployment

  1. Set up rbldnsd and rsync, then request Data Feed rsync access.
  2. Here is a sample rbldnsd startup for a public nameserver:
    /usr/local/sbin/rbldnsd \
                    -u rbldns:rbldns \
                    -r /usr/local/etc/rbldnsd \
                    -b 10.10.10.10/53 \
                    -a \
                    multi.surbl.org:dnset:multi.surbl.org.rbldnsd \
                    multi.surbl.org:acl:surbl.org.acl
    

    (Replace 10.10.10.10 with your nameserver IP address.)
    multi.surbl.org is the only zone that should be served or used by applications.

    The -a option tells rbldnsd not to return authority information, which is appropriate since the servers are already known to be authoritative through the delegations from the parent zone. With -a, replies are much smaller, which significantly reduces network traffic.

    Strictly speaking, the -c and -t options are not necessary: the checking interval for new zone files is 1 minute by default, and the SURBL zone files include TTLs for all records, which override command-line -t values.

    For a private nameserver the config would look something like:

    /usr/local/sbin/rbldnsd \
                    -u rbldns:rbldns \
                    -r /usr/local/etc/rbldnsd \
                    -b 10.10.10.10/53 \
                    -a \
                    surbl.internal.example.com:dnset:multi.surbl.org.rbldnsd
    

    Here the public nameserver ACL is not used, and the zone name should be your own domain or subdomain, ideally one not visible from the outside world. The IP address would usually be on your LAN or intranet.

  3. For public nameservers only: please rsync the ACL (access control list) dataset file surbl.org.acl regularly and load it as shown in the example above, so that the servers can be protected against abuse. The syntax of acl files is described in the rbldnsd manual pages, and ACLs are supported in recent versions of rbldnsd. For private nameservers, omit the acl line unless you need your own acl file.
  4. Once access is granted, set up a cron job to rsync according to the schedule that will be provided. Make sure the files are actually updating as expected. Please use a "sleep" command of some seconds after the minute to help distribute the server load better.
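
The "sleep" delay in step 4 can be sketched as a small wrapper that cron calls instead of the sync script. This is a minimal sketch, not an official SURBL script: the function name jitter_seconds and the 30-second ceiling are assumptions, and the actual schedule is the one SURBL provides.

```bash
#!/bin/bash
# Hypothetical cron wrapper: wait a random number of seconds, then run a command.

# Print a random integer from 0 to max-1; max defaults to an assumed 30 seconds.
jitter_seconds() {
        local max="${1:-30}"
        if [ "${max}" -le 0 ]; then
                echo 0
        else
                echo $(( RANDOM % max ))
        fi
}

# When called with a command, delay and then run it in place of this shell.
if [ "$#" -gt 0 ]; then
        sleep "$(jitter_seconds 30)"
        exec "$@"
fi
```

If the wrapper were saved as, say, /usr/local/sbin/jitter.sh (a made-up path), the crontab entry would call it with the sync script as its argument. It only spreads the start time and does not replace the locking described under Locking.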

Locking

VERY IMPORTANT: Make sure the rsync cron job checks whether a previous rsync is already running. If the previous rsync is still running, the cron job should not start another one. This prevents multiple rsyncs from piling up during unusual delays, which could adversely impact the rsync servers. If multiple rsyncs are found running from the same IP, then that IP may be blocked in order to protect the servers.

  • For example, here's a tcsh script to rsync the files from a crontab:
    #!/bin/tcsh
    
    # rsync the zone files only if this program is not already running
    
    if ( -z /usr/local/etc/rbldnsd/lockfile ) then
      echo "rsync is running" > /usr/local/etc/rbldnsd/lockfile
      /usr/local/bin/rsync -aq "server.name.here::surbl/multi.surbl.org.rbldnsd" /usr/local/etc/rbldnsd/
      /usr/local/bin/rsync -aq "server.name.here::surbl/surbl.org.acl" /usr/local/etc/rbldnsd/
      echo -n "" > /usr/local/etc/rbldnsd/lockfile
    endif
    
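    It only rsyncs if the lockfile is empty. The lockfile must already exist as an empty file before the first run, because the tcsh -z test is false for a missing file.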
  • Here's a more sophisticated bash script originally by Chris Zutler that includes locking and will restart a stuck rsync process. It has been modified to remove a stale lock file, which can be left behind if rsync gets stuck, for example. The script is also available as a text file, lock.sh.
    #!/bin/bash
    
    LOCK_DIR="/tmp"
    basename=$(basename $0)
    
    print_usage() {
            echo "Usage: ${basename} [-s sleep] [-t timeout] COMMAND"
            echo "Run COMMAND with simple file locking. "
            echo "    -s sleep"
            echo "        Sleep between 0 and the number of seconds specified before running command."
            echo "    -t timeout"
            echo "        Attempt to kill the old process if the lock is older than the specified number of seconds."
            echo
            exit
    }
    
    while [ "${1}" == "-s" ] || [ "${1}" == "-t" ]; do
            opt="${1}"
            shift
            if [ -z "$(expr "${1}" + 0)" ] || [ "${1}" -lt 0 ]; then
                    echo "Invalid number of seconds."
                    print_usage
            fi
            case "${opt}" in
                    "-s") sleep="${1}" ;;
                    "-t") timeout="${1}" ;;
            esac
            shift
    done
    
    if [ -z "$*" ]; then
            print_usage
    else
            cmd=$*
    fi
    
    
    # set timeout to zero if not set
    if [ -z "${timeout}" ]; then
        timeout=0
    fi
    
    lock="${LOCK_DIR}/$(basename ${1}).$(echo ${cmd} | md5sum | cut -d" " -f1).lock"
    
    
    # remove stale lock file: only when a timeout is set and the lock is older than it
    if [ "${timeout}" -gt 0 ] && [ -f "${lock}" ]; then
            # find prints the path only if the file is older than timeout/60 minutes
            if [ -n "$(find "${lock}" -mmin +$(( timeout / 60 )) 2> /dev/null)" ]; then
                    rm "${lock}" && echo "removed stale lock ${lock}: more than $(( timeout / 60 )) minutes old"
            fi
    fi
    
    # if no lock, write lock and do work
    if (set -o noclobber; echo -n > "${lock}") 2> /dev/null; then
            echo $$          >> "${lock}"
            echo $(date +%s) >> "${lock}"
            echo $cmd        >> "${lock}"
    
            if [ "${sleep:-0}" -gt 0 ]; then
                    sleep $(expr ${RANDOM} % ${sleep})
            fi
    
            ${cmd}
    
            rm "${lock}"
    elif [ "${timeout}" -gt 0 ]; then
            read -d! lockpid locktime lockcmd < <(cat "${lock}" 2>/dev/null || echo NOFILE 0 0)
    
            if [ -n "$(expr "${locktime}" + 0)" ] && [ $(( $(date +%s)-${locktime} )) -gt "${timeout}" ]; then
                    # Try to kill grand children, children and then parent
                    pkill -P $(pgrep -d, -P ${lockpid}) > /dev/null 2>&1 || 
                    pkill -P ${lockpid}                 > /dev/null 2>&1 || 
                    (kill ${lockpid}; rm "${lock}")     > /dev/null 2>&1
            fi
    fi
    
    
  • Xia Qingran of SINA.com points out that FreeBSD has its own lockfile mechanism called "lockf", and he uses it to prevent multiple rsyncs. -s means silent; -t0 sets the wait for the lock to 0 seconds, so lockf gives up immediately if the lock is already held. Please see 'man lockf' for more information:
    		* * * * * rbldns lockf -st0 /var/run/rsync_surbl.lock /usr/local/sbin/rsync_surbl.sh
    
  • Will Yardley of Caltech mentions that a similar program flock works in Linux, for example:
    		* * * * * rbldns path/to/flock -n lock_file -c "[rsync command] >/dev/null"
    as mentioned in #26 at http://stackoverflow.com/questions/9390134/rsync-cronjob-that-will-only-run-if-rsync-isnt-already-running
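
The same non-blocking lock can also be taken inside the sync script rather than on the crontab line. The following is a minimal sketch, assuming Linux with flock from util-linux; the function name run_locked and the exit status 75 used for "already running" are illustrative choices, not part of SURBL's instructions.

```bash
#!/bin/bash
# Run a command only if no other instance currently holds the lock file.
run_locked() {
        local lockfile="$1"
        shift
        # Open the lock file on descriptor 9 for the subshell only;
        # the lock is released automatically when the subshell exits.
        (
                flock -n 9 || exit 75    # lock is held elsewhere: skip this run
                "$@"
        ) 9>"${lockfile}"
}
```

For example, run_locked /var/run/rsync_surbl.lock /usr/local/sbin/rsync_surbl.sh (reusing the paths from the lockf example) runs the sync script, or returns 75 without starting it if a previous run is still in progress.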

Public and private DNS

IP addresses and ports

The main difference between a public and private nameserver is whether public DNS queries are delegated to it. Purely private nameservers are usually set up to respond only on internal IP addresses, such as RFC 1918 spaces. Those hosting a public nameserver often also use the nameserver for their internal needs using the public IP.

It's also possible to set up a split-horizon arrangement, with BIND forwarding to an internal IP on a server that runs both BIND and rbldnsd on different IPs. In that case internal users query the internal IP and public users query the public IP. Using different internal and public IPs helps isolate the traffic and functions and makes it simpler to set up different firewall policies. Since DNS runs on port 53 and rsync on port 873, traffic can and should also be isolated by port in your firewall.

The main issue is the rare DDoS attack against the public nameservers. This hasn't happened to any major extent, but if it does, it can be useful to block the (abusive) public DNS queries while still allowing internal DNS queries and rsync of the zone files for internal use.

Domain name

For purely private nameservers, please use a list domain in your own domain instead of surbl.org. For example, surbl.internal.example.com would be used in the rbldnsd configuration options:

surbl.internal.example.com:dnset:multi.surbl.org.rbldnsd

The mapping above would tell rbldnsd to serve the contents of the multi.surbl.org rbldnsd zone file using a different domain name: surbl.internal.example.com. The local mail filtering application would need to be configured to query the data using surbl.internal.example.com as the list name.

You should use an internal domain that's not visible to the outside world. This will also prevent anyone outside from querying your internal mirror, as would using non-routable network numbering, appropriate firewall policies, etc. For a private nameserver, the internal domain name should match in all of the places appropriate to your particular system:

  1. rbldnsd configuration
  2. BIND forwarding
  3. mail filter configuration

Testing and delegation

  1. Test the name service using dig on test.surbl.org.multi.surbl.org:
    		dig test.surbl.org.multi.surbl.org @your.server.here
    
    
    Also test some web sites seen in recent unsolicited messages, for example, domain_to_test.com.multi.surbl.org. (For private nameservers use your internal surbl domain name instead of multi.surbl.org.)
  2. Wait 20 minutes, then check the SOA record of multi.surbl.org, for example with dig:
    		dig multi.surbl.org @your.server.here soa
    
    to confirm that the zone files are updating and have a recent serial number compared to any of the other public nameservers. If the zone file serial numbers are not updating then the cron job or rsync are not working correctly. (For private nameservers use your internal surbl domain name instead of multi.surbl.org.)
  3. For internal use, configure the name resolution on your mail filters to use your new local mirror IP for resolving multi.surbl.org. The configuration of this is highly operating system and nameserver dependent. A sample forwarder statement for BIND is described in our rbldnsd with BIND on FreeBSD document.
  4. If you are hosting a public nameserver, then please let us know when the server is ready. We will add its public-facing IP to the delegation from the parent zone, causing public DNS traffic to flow to it.
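
The serial number comparison in step 2 can be scripted. The sketch below is only illustrative: the function names are made up, it assumes dig is installed, and it uses a plain integer comparison that ignores serial number wraparound.

```bash
#!/bin/bash
# Extract the serial (third field) from one line of `dig +short <zone> soa` output.
soa_serial() {
        awk 'NF >= 7 { print $3; exit }'
}

# Ask one server for the SOA serial of a zone.
server_serial() {
        local zone="$1" server="$2"
        dig +short +time=5 +tries=1 "${zone}" soa "@${server}" | soa_serial
}

# Compare the local serial with a reference serial from another nameserver.
serial_status() {
        local mine="$1" reference="$2"
        if [ -z "${mine}" ] || [ -z "${reference}" ]; then
                echo "unknown"      # one of the queries returned nothing
        elif [ "${mine}" -ge "${reference}" ]; then
                echo "current"
        else
                echo "behind"       # cron job or rsync may not be working
        fi
}
```

A check might then look like serial_status "$(server_serial multi.surbl.org your.server.here)" "$(server_serial multi.surbl.org other.server.here)", where other.server.here stands for any other public nameserver. For a private nameserver, query your internal surbl domain name on your own server instead.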

Monitoring and performance

Most rbldnsd servers can run unattended without problems. The code is very stable, simple, fast and reliable. Much of the world's public and private DNSBL infrastructure runs on rbldnsd and rsync.

For public nameservers, please add the stats collector described on the SURBL Zones mailing list at: http://lists.surbl.org/mailman/private/zones/2011-January/000593.html and send us an email with the source IP you'd be sending stats from.

To see some general statistics about how much traffic is being generated and how much ham and spam is being detected, install a monitoring script that feeds into a monitoring and graphing program such as MRTG.

Here are some follow up discussions with additional ideas and updates:

Unrelated to rbldnsd, there are many stats programs for SpamAssassin that will show the relative performance of the different rules, including those using SURBLs. Links to many of the stats programs can be found at the SpamAssassin site.

Notes

  1. rsync uses TCP port 873. DNS uses UDP and TCP port 53.
  2. It is possible to use BIND and rbldnsd on the same server IP address by using port forwarding in BIND for the zones being served by rbldnsd. This is described in some of the documents below.
  3. It is also possible to use BIND to serve the zones directly, but rbldnsd is much more CPU- and memory-efficient than BIND.
  4. It is also possible to run BIND and rbldnsd on different IP addresses on the same server, or to run them on entirely separate servers. In such cases BIND forwarding is not needed, since the other IP address can be used directly as the name server for the SURBL zones.

    In other words, when rbldnsd is running on its own IP address, set it up as an entirely independent name server, even if it happens to run on a server also running BIND (on a different address). For public nameservice, SURBL will delegate DNS traffic for the SURBL zones to the IP address rbldnsd is bound to.

    If using BIND 8 and a separate IP address for rbldnsd, be sure to set the listen-on option in order to tell BIND not to listen on the address that you intend to run rbldnsd on. This is explained in the NJABL rbldnsd document and in SURBL's rbldnsd with BIND on FreeBSD document.

  5. Some large installations install rbldnsd directly on their mail filtering servers or on name servers dedicated to serving just list zones. When doing this, be sure to also configure the local mail filtering applications and servers to use the local rbldnsd servers for name resolution of the lists.
  6. Consider using your local rbldnsd service to serve all of the other lists you use locally. If your mail servers process a high volume of messages, then the other lists will appreciate the reduced traffic to their public name servers.
  7. Note that there is a Windows port of rbldnsd and rsync. Here are instructions for setting up a Windows SURBL DNS mirror. We recommend only Unix for public nameservers, but private nameservers have been successfully set up under Windows.
  8. If you have a mix of public and private list service, it's probably good practice to separate them onto different IPs or servers so they can have separate firewall access policies or rbldnsd configurations. In other words, you probably don't want to be a public DNS server for lists that aren't publicly delegated to you.
  9. It's a common and recommended practice to have your local rsync client also act as an internal rsync server that distributes the files to all the internal machines that need them, if there are multiple internal uses.
  10. Please sign up a role account for the SURBL zones mailing list in order to keep informed about any important changes. Message volume is very low at a few per year. All SURBL users should also sign up a role account for the announcement list, which is also low-volume, in order to get any
  11. For more information about setting up rsync, rbldnsd, etc, please see some of the documents at "Mirroring zone files locally" on our Links page.

SURBL Data Feed Request

SURBL Data Feeds offer higher performance for professional users through faster updates and resulting fresher data. Freshness matters since the threat behavior is often highly dynamic, so Data Feed users can expect higher detection rates and lower false negatives.

The main data set is available in different formats:

Rsync and DNS are typically used for mail filtering and RPZ for web filtering. High-volume systems and non-filter uses such as security research should use rsync.

For more information, please contact your SURBL reseller or see the references in Links.

Sign up for SURBL Data Feed Access.