No-Cost Ad-Blocking

PAC files are even better than HOSTS files for blocking Web site ads.

By Sheryl Canter

 

You can block distracting, privacy-invading Web site ads with tools that come with your computer - no need to buy a thing. One way is through the HOSTS file. A lesser-known but more powerful method is through a Proxy Auto-Configuration (PAC) file - a feature of all modern browsers.

 

Both HOSTS and PAC files block ads by redirecting requests meant for ad servers, but HOSTS files can block only entire sites, while PAC files can block individual URLs within a site. This is important because some companies that serve ads also serve other content.

 

HOSTS files have other limitations as well. Because each ad server must be listed separately, a HOSTS file can grow long enough to slow the system. PAC files can use regular expressions, so they can be much shorter. HOSTS files are bypassed on systems using proxy servers, but PAC files are not. Plus, if your computer is running a Web server, there are problems in redirecting ads with a HOSTS file that don't exist with a PAC file.

What are PAC Files?

PAC files were introduced by Netscape with the release of JavaScript in 1996, and all modern browsers support them. A PAC file is a text file, saved with the extension .pac, that defines the JavaScript function FindProxyForURL(url, host). If your browser is configured to use a PAC file, FindProxyForURL is called for every URL accessed, even if JavaScript is turned off. The function's return value tells the browser how to access the URL: through a proxy server, through a SOCKS server, or by direct connection. The idea of using PAC files to block Web ads was conceived by John R. LoVerso (http://www.schooner.com/~loverso/no-ads/), while he was immersed in finding and documenting security flaws in the first release of JavaScript.
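
To make the shape of a PAC file concrete, here is a minimal sketch rather than a finished ad blocker. The host name www.example.com and the /ads/ path are hypothetical placeholders, and the localhost port is simply an address where no real proxy is expected to be listening. It uses only ordinary JavaScript string operations, and it shows how a PAC file can block one part of a site while leaving the rest alone.

```javascript
// Minimal PAC file sketch. The host and path are hypothetical examples.
// Possible return values include:
//   "DIRECT"           connect without a proxy
//   "PROXY host:port"  use the HTTP proxy at host:port
//   "SOCKS host:port"  use the SOCKS server at host:port
function FindProxyForURL(url, host) {
    // Block only the ad directory of a site that also serves real content.
    if (host === "www.example.com" && url.indexOf("/ads/") !== -1) {
        return "PROXY localhost:3421";
    }
    // Everything else is fetched normally.
    return "DIRECT";
}
```

With this file in place, http://www.example.com/ads/banner.gif would be sent to the local address, while the other pages on www.example.com would load as usual.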

 

It's good to understand how PAC files work, but you don't have to start from scratch. I wrote a PAC file for my own use that's heavily commented for the curious: http://www.sherylcanter.com/articles/pac-file.zip. Open the file in WordPad for editing; Notepad won't show the line breaks.

 

PAC files support some special functions, two of which are useful for blocking ads:

 

      dnsDomainIs(host, domain)

            Detects whether the URL host name belongs to a given DNS domain.

 

      shExpMatch(str, shexp)

            Checks whether str (could be the URL or host name) matches a shell expression.
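
As a rough illustration of how these two functions might be used, consider the sketch below. The domain .ads.example.net and the /banners/ directory are hypothetical. Browsers supply dnsDomainIs and shExpMatch to PAC files automatically; the simplified stand-ins at the top are my own assumptions, included only so the sketch can also be tried outside a browser (for example, in Node.js), and they are skipped when the real functions exist.

```javascript
// Simplified stand-ins for testing outside a browser. These are
// assumptions for illustration; a browser uses its own built-in versions.
if (typeof dnsDomainIs !== "function") {
    var dnsDomainIs = function (host, domain) {
        // True when host ends with the given domain string.
        return host.length >= domain.length &&
               host.substring(host.length - domain.length) === domain;
    };
}
if (typeof shExpMatch !== "function") {
    var shExpMatch = function (str, shexp) {
        // Escape regex metacharacters, then translate the * and ? wildcards.
        var pattern = shexp.replace(/[.+^${}()|[\]\\]/g, "\\$&")
                           .replace(/\*/g, ".*")
                           .replace(/\?/g, ".");
        return new RegExp("^" + pattern + "$").test(str);
    };
}

function FindProxyForURL(url, host) {
    // Block every host in a (hypothetical) ad network's domain.
    if (dnsDomainIs(host, ".ads.example.net"))
        return "PROXY localhost:3421";
    // Block any URL that contains a /banners/ directory, on any site.
    if (shExpMatch(url, "*/banners/*"))
        return "PROXY localhost:3421";
    return "DIRECT";
}
```

The first test works on the host name alone, so it blocks a whole domain much as a HOSTS file would; the second looks at the full URL, which is what lets a PAC file block part of a site.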

 

In Internet Explorer, shell expressions support only the ? and * wildcards. To use full regular expressions, assign a regular expression literal to a variable:

 

var re_demark = /(\.|-|=|\/|_)(cash|estat|hit|iplace|goto|pop)(\.|-|=|\/|_)/i;

 

Then use the test method of the regular expression object:

 

if  (re_demark.test(url))

      return "PROXY localhost:3421";

 

Notice that the blocked sites are redirected to port 3421 of localhost, so as not to conflict with any Web servers that might be running on port 80. Redirecting to an unused port like 3421 causes no problems for IE or Mozilla, but Opera will complain that there is no proxy at that address.
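
Put together, the two fragments above might sit inside a PAC file like this. This is a sketch, not the full file offered for download: the regular expression and the return value come from the fragments, while the final "DIRECT" fallback for everything that doesn't match is my assumption of the simplest reasonable default.

```javascript
// Matches ad-related keywords set off by ., -, =, / or _ (case-insensitive).
var re_demark = /(\.|-|=|\/|_)(cash|estat|hit|iplace|goto|pop)(\.|-|=|\/|_)/i;

function FindProxyForURL(url, host) {
    // Send matching URLs to the local "black hole" port.
    if (re_demark.test(url))
        return "PROXY localhost:3421";

    // Assumed default: fetch everything else normally.
    return "DIRECT";
}
```

A URL such as http://www.example.com/pop/banner.gif matches because "pop" is enclosed by slashes, while http://www.example.com/index.html does not match and is fetched directly.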

 

The solution is to run a small, single-purpose Web server that responds to ad requests with a transparent bitmap. This also eliminates the unsightly error messages that appear when the ads aren't found, and prevents delays in browsers that take a while to time out when content isn't found.

 

BlackHoleProxy, written by Larry Wang, does just this. It can be used with HOSTS or PAC files, and you can download it for free, with source code, at http://s91363763.onlinehome.us/BlackHoleProxy/.
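
For the curious, the general idea of such a server can be sketched in a few lines of JavaScript for Node.js. This is not BlackHoleProxy's code, and I am not suggesting it works the same way internally; it is only an independent illustration, assuming Node.js is installed. The function name createBlackHoleServer is made up for this example, and the sketch answers plain HTTP requests only.

```javascript
var http = require("http");

// A 1x1 transparent GIF, stored as base64 and decoded once at startup.
var TRANSPARENT_GIF = Buffer.from(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7", "base64");

// Build (but do not start) a server that answers every request,
// whatever the URL, with the transparent image.
function createBlackHoleServer() {
    return http.createServer(function (request, response) {
        response.writeHead(200, {
            "Content-Type": "image/gif",
            "Content-Length": TRANSPARENT_GIF.length
        });
        response.end(TRANSPARENT_GIF);
    });
}

// To start it on the port used in the PAC file, uncomment this line:
// createBlackHoleServer().listen(3421, "127.0.0.1");
```

Binding to 127.0.0.1 keeps the server reachable only from your own machine, which is all a PAC file that returns "PROXY localhost:3421" needs.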

 

BlackHoleProxy provides crucial features that similar utilities do not (such as the ability to change the port), but there is no install program, and options are set through the command line. For easy access, create shortcuts for the command line options you'll need, plus another shortcut to the documentation, and then create a folder for these in your Start menu (see Figure 1).

 

To use BlackHoleProxy with an ad-blocking HOSTS file, you must set it to port 80 by launching it with this command line:

 

BlackHoleProxy -port 80

 

Once you have BlackHoleProxy loaded, you can configure your browser to use the PAC file. You'll find PAC support in the network or connection settings. You specify the file using a syntax like this:

 

file://C:/PacFiles/ad-block.pac

 

If you're using Internet Explorer, you have to change two additional settings. First, open the Internet Options dialog to the Security tab, select "Local intranet", click the "Sites" button, and uncheck the box labeled "Include all sites that bypass the proxy server."

 

Second, you must turn off the auto proxy caching mechanism, since it prevents you from blocking some of a server's content while allowing the rest. Microsoft didn't provide an interface to this setting, but you can use a clever .REG file written by Bill Talcott (http://www.schooner.com/~loverso/no-ads/IE-auto-proxy-cache.reg) not only to change the option, but also to add a checkbox for it on the Advanced page of the Internet Options dialog.

 

Last but not least, don't forget to clear your browser's cache after setting up your HOSTS or PAC file, or the ads will be retrieved from your cache.

 

 

Sheryl Canter is a contributing editor to PC Magazine. Her Web site is at www.SherylCanter.com.