I'm looking for a way to script a transparent forward proxy such as the ones that users point their browsers to in proxy settings.

I've discovered a distinct tradeoff in forward proxies between scriptability and robustness. For example, there are countless proxies developed in Ruby and Python that let you inspect each request and response and log, modify, or filter at will; however, these either fail to proxy everything needed or crash after 20 minutes of use.

On the other hand, I suspect that Squid and Apache are quite robust and stable; however, for the life of me I can't determine how to develop dynamic behavior through scripting. Ultimately I would like to set quotas and dynamically filter on those quotas. Part of me feels that mixing mod_proxy and mod_perl could allow interesting dynamic proxies, but it's hard to know where to begin or whether it's even possible.

Please advise.

+2  A: 

If you're looking for a Perl solution, take a look at HTTP::Proxy.

Not sure of any mod_perl solutions though. CPAN does bring up Apache::Proxy and Googling brings up MyProxy. However note, both of these are a bit old so YMMV but you may find them a useful leg up.

/I3az/

draegtun
+1  A: 

I've been working on an HTTP library in Python, written with proxy servers specifically in mind as a use case. It isn't very mature at this point (it certainly needs more testing, and unit tests), but it's complete enough that I find it useful. I don't know whether it would meet any of your needs.

The library is called httpmessage; the Google Code site is found here. There is an example of writing a proxy server on the examples page.

I'm happy to receive feedback and/or bug fixes.

Matt Anderson
+3  A: 

Squid and Apache both have mechanisms to call external scripts for allow/deny decisions per-request. This allows you to use either for their proxy engines, but call your external script per request for processing of arbitrary complexity. Your code only has to manage the business logic, not the heavy lifting.

In Apache, I've never used mod_proxy in this way, but I have used mod_rewrite, which can also proxy requests. Its RewriteMap directive lets you pass the decision to an external script:

MapType: prg, MapSource: Unix filesystem path to valid regular file

Here the source is a program, not a map file. To create it you can use a language of your choice, but the result has to be an executable program (either object-code or a script with the magic cookie trick '#!/path/to/interpreter' as the first line).

This program is started once, when the Apache server is started, and then communicates with the rewriting engine via its stdin and stdout file-handles. For each map-function lookup it will receive the key to lookup as a newline-terminated string on stdin. It then has to give back the looked-up value as a newline-terminated string on stdout or the four-character string ``NULL'' if it fails (i.e., there is no corresponding value for the given key).
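A minimal sketch of such a prg map program, in Python, might look like the following. It only illustrates the stdin/stdout protocol quoted above; the quota table, the allow/deny values, the script path, and the map name in the comments are all assumptions made for illustration, not anything mod_rewrite prescribes.

```python
#!/usr/bin/env python3
import sys

# Hypothetical quota table: client key -> remaining requests.
# A real deployment would likely load this from a database or file.
REMAINING = {"10.0.0.5": 2}


def lookup(key, remaining):
    """Return 'allow' or 'deny' for a known key, or 'NULL' if unknown."""
    if key not in remaining:
        return "NULL"  # the four-character "no value" answer
    if remaining[key] <= 0:
        return "deny"
    remaining[key] -= 1
    return "allow"


def serve(in_stream, out_stream, remaining):
    """Answer one newline-terminated key per line until EOF."""
    for line in iter(in_stream.readline, ""):
        out_stream.write(lookup(line.strip(), remaining) + "\n")
        # Flush after every answer: the rewriting engine waits for it.
        out_stream.flush()


# When started by Apache, the script would end with:
#     serve(sys.stdin, sys.stdout, REMAINING)
#
# Assumed wiring in the server config (names and path are hypothetical):
#     RewriteMap quota "prg:/path/to/quota_map.py"
#     RewriteCond ${quota:%{REMOTE_ADDR}} =deny
#     RewriteRule ^ - [F]
```

Keeping the decision in lookup() and the I/O loop in serve() makes the business logic easy to exercise with in-memory streams before attaching it to the server.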

With Squid, you can get similar functionality via the external_acl_type directive:

This tag defines how the external acl classes using a helper program should look up the status.
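As a rough sketch, an external ACL helper for a request-count quota could look like this in Python. It assumes the helper is configured to receive only the client address (%SRC) on each line and runs without helper concurrency, so each answer is a bare OK or ERR; the quota limit, the helper name, and the script path in the comments are hypothetical.

```python
#!/usr/bin/env python3
import sys

MAX_REQUESTS = 1000  # hypothetical per-client quota


def decide(src, counts, limit=MAX_REQUESTS):
    """Count one request for src; 'OK' while within quota, else 'ERR'."""
    counts[src] = counts.get(src, 0) + 1
    return "OK" if counts[src] <= limit else "ERR"


def serve(in_stream, out_stream, limit=MAX_REQUESTS):
    """Read one lookup per line and answer each with OK or ERR."""
    counts = {}
    for line in iter(in_stream.readline, ""):
        fields = line.split()
        # An empty line carries no client address; refuse it.
        answer = decide(fields[0], counts, limit) if fields else "ERR"
        out_stream.write(answer + "\n")
        # Flush after every answer so Squid is not left waiting.
        out_stream.flush()


# When started by Squid, the script would end with:
#     serve(sys.stdin, sys.stdout)
#
# Assumed squid.conf wiring (names and path are hypothetical):
#     external_acl_type quota_check ttl=0 %SRC /path/to/quota_helper.py
#     acl within_quota external quota_check
#     http_access allow within_quota
```

The ttl=0 in the assumed config is there so that Squid asks the helper on every request instead of caching an earlier answer, which matters when the decision depends on a running count.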

g'luck!

J.J.
I went with squid and external_acl_type, worked like a charm. Thank you.
jrhicks
A: 

I'd use Squid, which can execute other programs to change requests on the fly.

dvyjones