News & Notes

How you can help

The success of SURBLs will depend on several things which you may be able to help with:
  1. Please report any false positives to us so that we can review and remove them as appropriate. For further list removal information please start with the Lookup page.

  2. Use multi.surbl.org instead of individual lists since it reduces the number of DNS queries needed if you're using more than one list. Current versions of SpamAssassin use multi.surbl.org by default, as should all SURBL-compatible applications.

  3. If you run a high-volume mail server (e.g., one processing more than a few hundred thousand messages per day), please set up rbldnsd and then fill out our Data Feed request form to request access to the SURBL zone files. BIND zone files are also available by rsync. Please see Links for some references and instructions on using rbldnsd and rsync. Please do not use public name servers for processing large volumes of mail; this applies to any DNS blacklists you may be using.

  4. If you are using SpamAssassin, please upgrade to version 3.1 or later, since those versions handle SURBLs most correctly. The latest version usually gives the best overall performance and is therefore recommended.

  5. Please consider helping to port or write applications such as MTA filters or mail filter plug-ins to use SURBLs. Our Implementation Guidelines provide an overview of the functionality needed. The Links page lists some of the existing applications.

  6. If you have any information about ccTLDs that are not in our two-level-tld list, please let us know at our contacts.
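As a rough illustration of point 2 above, a single A-record answer from a combined list can say which of several lists matched, which is why one query can replace several. The sketch below decodes the last octet of such an answer as a bitmask. The bit assignments in ASSUMED_LIST_BITS and the function name lists_from_answer are assumptions made for illustration and do not come from this page; please confirm the actual values against the SURBL documentation before relying on them.

```python
# Assumed bit assignments for the last octet of a combined-list answer.
# These values are illustrative assumptions, not taken from this page.
ASSUMED_LIST_BITS = {
    2: "sc",
    4: "ws",
    8: "ph",
    16: "ob",
    32: "ab",
    64: "jp",
}


def lists_from_answer(address):
    """Return the list names encoded in a 127.0.0.x style answer."""
    octets = address.split(".")
    if len(octets) != 4 or octets[:3] != ["127", "0", "0"]:
        raise ValueError("unexpected answer: %r" % address)
    mask = int(octets[3])
    # Report every list whose bit is set in the final octet.
    return [name for bit, name in sorted(ASSUMED_LIST_BITS.items()) if mask & bit]


if __name__ == "__main__":
    # One answer, several lists: 6 = 2 + 4 under the assumed assignments.
    print(lists_from_answer("127.0.0.6"))
```

Under these assumed assignments, one lookup that returns 127.0.0.6 carries the same information as separate queries against two individual lists.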

News

Update on using URIDNSBL with SURBL

Thanks to Justin Mason for adding the urirhsbl command to URIBL, it is now possible to use SURBL with SpamAssassin 3 and URIDNSBL to perform our intended name-to-name comparisons of domains in spams. Here's Justin's description of how to use the new command:
'The current SVN trunk now contains URIBL support for RHSBL lookups using the 'urirhsbl' command:
urirhsbl NAME_OF_RULE rhsbl_zone lookuptype

Specify a RHSBL-style domain lookup. "NAME_OF_RULE" is the name of the rule to be used, "rhsbl_zone" is the zone to look up domain names in, and "lookuptype" is the type of lookup (TXT or A). Note that you must also define a header-eval rule calling "check_uridnsbl" to use this.

An RHSBL zone is one where the domain name is looked up, as a string; e.g. a URI using the domain "foo.com" will cause a lookup of "foo.com.uriblzone.net". Note that hostnames are stripped from the domain used in the URIBL lookup, so the domain "foo.bar.com" will look up "bar.com.uriblzone.net", and "foo.bar.co.uk" will look up "bar.co.uk.uriblzone.net".'

Here is a sample rule to use urirhsbl with SURBL from the URIBL config file:
urirhsbl	URIBL_SC_SURBL	sc.surbl.org.	A
header		URIBL_SC_SURBL	eval:check_uridnsbl('URIBL_SC_SURBL')
describe	URIBL_SC_SURBL	Contains a URL listed in the SC SURBL blocklist
tflags		URIBL_SC_SURBL	net
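The name construction Justin describes (strip the hostname, keep the registered domain, append the zone) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names registered_domain and rhsbl_query_name and the tiny SAMPLE_TWO_LEVEL_TLDS set are hypothetical, and a real implementation would load the full two-level-tld list and handle IP addresses in URIs separately.

```python
# Tiny illustrative subset of two-level TLDs (an assumption for this sketch);
# a real deployment would load the full two-level-tld list instead.
SAMPLE_TWO_LEVEL_TLDS = {"co.uk", "org.uk", "com.au"}


def registered_domain(hostname, two_level_tlds=SAMPLE_TWO_LEVEL_TLDS):
    """Strip leading host labels, keeping only the registered domain."""
    labels = hostname.lower().rstrip(".").split(".")
    # Under a two-level TLD such as co.uk, keep three labels instead of two.
    if len(labels) >= 3 and ".".join(labels[-2:]) in two_level_tlds:
        return ".".join(labels[-3:])
    return ".".join(labels[-2:])


def rhsbl_query_name(hostname, zone):
    """Build the name to look up for a URI hostname in an RHSBL zone."""
    return registered_domain(hostname) + "." + zone.strip(".")


if __name__ == "__main__":
    for host in ("foo.com", "foo.bar.com", "foo.bar.co.uk"):
        print(host, "->", rhsbl_query_name(host, "uriblzone.net"))
```

The three hostnames in the usage lines are the ones from the description above, so the printed names can be compared directly with the lookups it lists.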

SpamAssassin Corpus Test Results

  1. Here are some test results from Daniel Quinlan on 6 April 2004 showing a 60% hit rate on spam from the preceding four days and 0% false positives on ham (non-spam messages) up to ten months old:
    "Justin added a 'urirhsbl' test to the URIBL module, so I retested on my last 4 days of spam (the ham here ranges from 0 to 10 months old) using SURBL and it exceeded my highest expectations.
      OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
         6491     1497     4994    0.231   0.00    0.00  (all messages)
      100.000  23.0627  76.9373    0.231   0.00    0.00  (all messages as %)
       13.804  59.8530   0.0000    1.000   1.00    0.01  T_URIBL_SC_SURBL
       14.374  61.5230   0.2403    0.996   0.99    1.00  URIBL_SBL
        0.277   0.8016   0.1201    0.870   0.55    1.00  URIBL_DSBL
    
    I went ahead and promoted T_URIBL_SC_SURBL to URIBL_SC_SURBL"
    Here T_URIBL_SC_SURBL refers to sc.surbl.org; URIBL_SBL presumably refers to sbl.spamhaus.org (a more conventional numeric RBL); and URIBL_DSBL probably refers to a dsbl.org RBL.

  2. Here are results on additional lists from Justin Mason on 25 June 2004:
    OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
     121405    22516    98889    0.185   0.00    0.00  (all messages)
    100.000  18.5462  81.4538    0.185   0.00    0.00  (all messages as %)
     13.453  70.3766   0.4925    0.993   1.00    1.00  SURBL_WS
      3.807  20.3811   0.0334    0.998   0.50    1.00  SURBL_SC
      2.650  14.2565   0.0071    1.000   0.50    1.00  SURBL_AB
      0.019   0.0933   0.0020    0.979   0.50    1.00  SURBL_PH
     12.624  67.6275   0.1001    0.999   0.50    1.00  SURBL_OB
    
    Note that the spam detection rates for the relatively fast-moving AB and SC lists tend to be much higher when the corpus is restricted to their respective time windows, that is, the most recent 7 days for AB and 3 days for SC. Detection rates of 60% have been seen for them in actual production systems.

  3. As of 15 September 2005, the default scores for SpamAssassin 3.1.x SURBL rules are commensurately high for SC, JP and AB:
    score URIBL_AB_SURBL 0 3.306 0 3.812
    score URIBL_JP_SURBL 0 3.360 0 4.087
    score URIBL_OB_SURBL 0 2.617 0 3.008
    score URIBL_PH_SURBL 0 2.240 0 2.800
    score URIBL_SC_SURBL 0 3.600 0 4.498
    score URIBL_WS_SURBL 0 1.533 0 2.140
    
  4. Here are results of the 6 May 2006 SpamAssassin mass checks:
      SPAM%     HAM%     S/O    RANK   SCORE  NAME
     181939    52229    0.777   0.00    0.00  (all messages)
    77.6959  22.3041    0.777   0.00    0.00  (all messages as %)
    28.8009   0.0000    1.000   1.00    0.00  URIBL_SC_SURBL
    34.2378   0.0134    1.000   1.00    0.00  URIBL_WS_SURBL
    31.9854   0.0115    1.000   1.00    0.00  URIBL_JP_SURBL
    15.9889   0.0000    1.000   0.98    0.00  URIBL_AB_SURBL
    29.9463   0.0479    0.998   0.96    0.00  URIBL_OB_SURBL
     0.3028   0.0038    0.988   0.67    0.00  URIBL_PH_SURBL
    19.7803   0.0383    0.998   0.95    0.00  URIBL_SBL
    38.1606   0.2585    0.993   0.85    0.00  URIBL_BLACK
     0.0264   0.0000    1.000   0.50    0.00  URIBL_RED
     0.4353   0.7946    0.354   0.45    0.00  URIBL_GREY
    
    Of particular relevance are the low false positives of some of the SURBL lists such as SC, AB and PH as shown in the low HAM% numbers. (Note that PH is important to use and score highly in order to detect phishes. It doesn't detect a large percentage of spams, but it likely detects many phishes.) The last three rules use uribl.com lists.

    Note that the SC and JP rules are the highest-ranked (best-performing) of all SpamAssassin rules.

  5. The SpamAssassin Rule QA site has current (weekly) scores of rule hits on spam and ham corpora. Spam hits are good, but ham hits are very bad. The goal is to maximize the former while minimizing the latter. Ham hits make a rule much less useful, so minimizing them is arguably the first priority.

    Here's a snapshot as of 2 June 2008:

      SPAM%    HAM%    S/O   RANK   SCORE   NAME
    52.3073  0.0103  1.000   1.00    0.00   URIBL_SC_SURBL
    45.3383  0.0159  1.000   0.99    0.00   URIBL_AB_SURBL
    69.8813  0.0286  1.000   0.99    0.00   URIBL_JP_SURBL
    34.9355  0.0404  0.999   0.96    0.00   URIBL_WS_SURBL
     1.4461  0.0024  0.998   0.92    0.00   URIBL_PH_SURBL
    54.2485  0.1110  0.998   0.90    0.00   URIBL_OB_SURBL
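
The S/O column in the tables above is not defined on this page, but the published figures are consistent with it being the spam share of a rule's hits, computed from the two percentage columns as SPAM% / (SPAM% + HAM%). The sketch below assumes that definition (the function name so_ratio is hypothetical) and checks it against the URIBL_SBL and URIBL_GREY rows quoted earlier.

```python
def so_ratio(spam_pct, ham_pct):
    """Spam share of a rule's hits, assuming S/O = SPAM% / (SPAM% + HAM%)."""
    total = spam_pct + ham_pct
    if total == 0:
        raise ValueError("rule hit no messages, so the ratio is undefined")
    return spam_pct / total


if __name__ == "__main__":
    # URIBL_SBL row from the 6 April 2004 results (table shows 0.996).
    print(round(so_ratio(61.5230, 0.2403), 3))
    # URIBL_GREY row from the 6 May 2006 mass checks (table shows 0.354).
    print(round(so_ratio(0.4353, 0.7946), 3))
```

A value near 1.000 means nearly all of a rule's hits were on spam, which is why even a small HAM% pulls a rule's ratio and rank down noticeably.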
    

Some SpamCopURI + SURBL results

Here is some additional data from a small, live SA installation (about 4000 messages/day) showing the relative hit rates of several different technologies:
rule                    spam hitrate
BAYES_99                0.91598
DCC_CHECK               0.61738
RAZOR2_CHECK            0.45551
SPAMCOP_URI_RBL         0.37685
RCVD_IN_BL_SPAMCOP_NET  0.36902
RCVD_IN_SORBS           0.30930
1 or more BIGEVIL       0.28707
RCVD_IN_SPAMHAUS_XBL    0.25700
RCVD_IN_DSBL            0.20964
RCVD_IN_DYNABLOCK       0.19646
RCVD_IN_NJABL           0.16928
RCVD_IN_SBL             0.16310
"SPAMCOP_URI_RBL" is SpamCopURI + SURBL.
news.html version 3.06 on 10/18/09