Linux squid proxy url rewriting or redirection

From Svendsen Tech PowerShell Wiki

You may occasionally want to rewrite or redirect a URL that is requested through a Squid proxy server.

There are products like "AdZap" that have pretty complex squid redirect programs, but you can make simple, working examples quite easily.

There's some (brief) information about rewriting here, with no example: http://www.squid-cache.org/Doc/config/rewrite/

There's also something called "Squirm" that can be used to rewrite or redirect URLs.



Redirecting a URL

In this example case, I want to redirect "http://foo.example.com/some_stuff?here" to "http://bar.example.com/some_stuff?here".

I also demonstrate redirecting any request going to "<anything>.forbidden-example.com" to an intranet site. It also catches the "root domain" (forbidden-example.com without "www." or "wiki." or similar).

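A Squid redirect program reads one line per request from STDIN. Each line carries several whitespace-separated fields, and the URL is the first of them (assuming helper concurrency is not enabled, in which case a channel ID comes first). Whatever the program prints to STDOUT is used as the URL that the Squid proxy itself requests from its parent proxy, an intranet or the Internet. Strictly speaking this is a rewrite rather than an HTTP redirect, because the client never sees the new URL. I hope my terminology and understanding are correct so that I am not misinforming anyone.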
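To make the input format concrete, here is a minimal sketch that splits one such line into named fields. The sample line and the field order after the URL (client address/FQDN, user, method) are assumptions based on the classic helper input format without concurrency, and parse_request_line is a hypothetical helper name, so check the documentation for your Squid version.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split one helper input line into named fields.
# Assumed field order: URL, client_ip/fqdn, user, method.
# Any further fields on the line are ignored.
sub parse_request_line {
    my ($line) = @_;
    chomp $line;
    my ($url, $client, $user, $method) = split ' ', $line;
    return { url => $url, client => $client, user => $user, method => $method };
}

# Hypothetical sample line; "-" stands for an unknown value.
my $req = parse_request_line(
    "http://foo.example.com/some_stuff?here 192.0.2.10/- - GET\n"
);

print "URL:    $req->{url}\n";
print "Client: $req->{client}\n";
print "User:   $req->{user}\n";
print "Method: $req->{method}\n";
```

The redirect program below only needs the first field, which is why it takes element 0 of the split and ignores the rest.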

Example Redirect Program

Here's the example Perl redirect program for the example outlined above.

#!/usr/bin/perl
use strict;

# Turn off buffering to STDOUT
$| = 1;

# Read from STDIN
while (<>) {
    
    my @elems = split; # splits $_ on whitespace by default
    
    # The URL is the first whitespace-separated element.
    my $url = $elems[0];
    
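    # Handle foo.example.com links and translate them to bar.example.com
    # with the rest of the URL intact (if present). Warnings are not enabled,
    # so an undefined $1 silently interpolates as an empty string.
    # The trailing "$" keeps look-alike hosts such as
    # foo.example.com.evil.test from matching and losing their path.
    if ($url =~ m#^http://foo\.example\.com(/.*)?$#i) {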
        
        $url = "http://bar.example.com${1}";
        
        print "$url\n";
        
    }
    
    # An elsif to demonstrate handling more than one case.
    # This very inclusively matches forbidden-example.com itself and any
    # host name under it, at any depth. The last part, "(/.*)?", is not
    # needed for the redirect below, but I added it because it might be
    # useful to some. That part would be in $1 in this example.
    elsif ($url =~ m#^https?://(?:[^./]+\.)*forbidden-example\.com(/.*)?#i) {
        
        # Redirect them to some intranet site...
        print "http://intranet/thats-forbidden.html\n";
        # If you want to handle what's in $1, you could do something like this (beware of injection...):
        # print "http://intranet.example.com/forbidden.pl?article=$1\n"
        
    }
    
    else {
        
        # Unmodified URL
        print "$url\n";
        
    }
    
}

This is simple Perl and can be adapted, although knowing some Perl helps. In the first if statement, the part of the m#regex# enclosed in parentheses, "(/.*)", is a capture group, and the question mark after it makes the group optional. If the group matches, the matched text is stored in $1, which I've written as ${1} (equivalent) in the assignment to the scalar variable $url. The "i" in "m#regex#i" makes the regular expression match letters case-insensitively.

If nothing follows "http://foo.example.com" in the URL, $1 is undefined and interpolates as an empty string, so the rewrite still works. With warnings enabled, this would have produced a "Use of uninitialized value" warning.
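If you would rather run with warnings enabled, one option is to give the optional capture a default value. The following is a minimal sketch, with the rewrite wrapped in a hypothetical rewrite_url function so it can be tried without Squid. The sample URLs are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the rewritten URL, or the original URL if nothing matches.
sub rewrite_url {
    my ($url) = @_;

    if ($url =~ m#^http://foo\.example\.com(/.*)?$#i) {
        # $1 is undefined when nothing follows the host name,
        # so fall back to an empty string to avoid a warning.
        my $rest = defined $1 ? $1 : '';
        return "http://bar.example.com$rest";
    }

    return $url;
}

# Hypothetical sample requests.
for my $url (
    "http://foo.example.com/some_stuff?here",
    "http://foo.example.com",
    "http://www.example.org/",
) {
    print rewrite_url($url), "\n";
}
```

In a real redirect program the loop over sample URLs would be replaced by the while loop over STDIN shown earlier.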

If the URL matches neither the if nor the elsif condition, it is printed back unchanged.
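The commented-out line in the elsif branch warns about injection when $1 is placed into a query string. One way to reduce that risk is to percent-encode the captured text first. Here is a sketch using only core Perl. escape_component is a hypothetical helper name, the sample path is made up, and the forbidden.pl URL is the placeholder from that comment.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Percent-encode everything except letters, digits and "-", ".", "_", "~".
# This works on single-byte characters; text containing wide characters
# would need to be encoded to UTF-8 first.
sub escape_component {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf("%%%02X", ord($1))/ge;
    return $text;
}

# Stands in for the value captured in $1.
my $path = "/wiki/Some page?x=1&article=evil";

print "http://intranet.example.com/forbidden.pl?article=",
      escape_component($path), "\n";
```

Encoding the value keeps characters such as "&" and "=" from being read as extra query parameters by the script on the intranet side. That script should still validate whatever it receives.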

What to Add in squid.conf

Somewhere in your squid.conf, add the following line:

redirect_program /path/to/the/script/above/redirect_program.pl

This assumes the file is called redirect_program.pl, and of course you need to use the correct path. Don't forget to make the file executable with "chmod 755 /path/to/the/script/above/redirect_program.pl" or similar.
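On newer Squid versions the same setting is, as far as I know, spelled url_rewrite_program, with redirect_program being the older name. Below is a sketch of the equivalent squid.conf fragment, using the same placeholder path. The url_rewrite_children line and its value of 5 are illustrative assumptions and not part of the setup described above.

```
# Newer name for the redirect_program directive (same placeholder path).
url_rewrite_program /path/to/the/script/above/redirect_program.pl

# Number of helper processes Squid may start (illustrative value).
url_rewrite_children 5
```

Newer releases also document a different reply syntax for helpers, so it is worth checking the url_rewrite_program documentation for the version you run.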

Reload the Squid Configuration

You can reload the Squid configuration with:

/path/to/executable/for/squid -k reconfigure