What is the difference between a proxy server and a reverse proxy server?
My understanding, from an Apache perspective, is that a proxy means that if site x proxies for site y, then requests for x return y.
A reverse proxy means that the response from y is adjusted so that all references to y become x, so that the user cannot tell that a proxy is involved.
A proxy server proxies (and optionally caches) outgoing network requests to various not-necessarily-related public resources across the Internet. A reverse proxy captures (and optionally caches) incoming requests from the Internet and distributes them to various internal private resources, usually for high-availability (HA) purposes.
Still, the difference is not very clear to me. Google and Wikipedia didn't help much. Can someone explain with an example?
The difference is primarily in deployment. Web forward and reverse proxies share the same underlying features: they accept HTTP requests in various formats and provide a response, usually by accessing the origin or contact server.
Fully featured servers usually have access control, caching, and some link mapping features.
A forward proxy is a proxy that is accessed by configuring the client. The client needs protocol support for proxy features (redirection, proxy auth, etc.). The proxy is transparent to the user experience, but not to the application.
A reverse proxy is a proxy that is deployed as a web server and behaves like a web server, with the exception that instead of locally composing the content from programs and disk, it forwards the request to an origin server. From the client perspective it IS a web server, so the user experience is completely transparent.
In fact, a single proxy instance can run as a forward and reverse proxy at the same time for different client populations.
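One place the deployment difference shows up on the wire is the HTTP request line. A client configured for a forward proxy sends the full URL (absolute-form) to the proxy, while a client talking to a reverse proxy sends an ordinary path (origin-form), exactly as it would to any web server. Here is a minimal Python sketch of that difference; the helper name `build_request` and the example URL are illustrative assumptions, not part of any particular proxy's API.

```python
from urllib.parse import urlsplit


def build_request(url: str, via_forward_proxy: bool) -> str:
    """Return the text of a bare HTTP/1.1 GET request for `url`.

    Hypothetical helper, for illustration only.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # Forward proxy: the client knows about the proxy and sends the full URL.
    # Reverse proxy (or plain web server): the client sends only the path.
    target = url if via_forward_proxy else path
    return f"GET {target} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


if __name__ == "__main__":
    print(build_request("http://www.example.net/index.html", via_forward_proxy=True))
    print(build_request("http://www.example.net/index.html", via_forward_proxy=False))
```

The second form is indistinguishable from a request to an ordinary web server, which is why the client needs no proxy support in the reverse case.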
That's the short version, I can clarify if people want to comment.
The previous answers were accurate, but perhaps too terse. I will try to add some examples.
First of all, the word proxy describes someone or something acting on behalf of someone else.
In the computer realm, we are talking about one server acting on behalf of another computer.
For the purposes of accessibility, I will limit my discussion to web proxies. However, the idea of a proxy is not limited to web sites.
FORWARD proxy
Most discussion of web proxies refers to the type of proxy known as a "forward proxy."
The proxy event in this case is that the "forward proxy" retrieves data from another web site on behalf of the original requester.
A tale of 3 computers (part I)
For an example, I will list three computers connected to the internet.
- X = your computer, or "client" computer on the internet
- Y = the proxy web site, proxy.example.org
- Z = the web site you want to visit, www.example.net
Normally, one would connect directly from X --> Z.
However, in some scenarios, it is better for Y --> Z on behalf of X, which chains as follows: X --> Y --> Z.
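To make the X --> Y --> Z chain concrete, here is a minimal sketch of how the client X could be configured to send its requests through Y, using Python's standard `urllib.request`. The port 8080 and the helper name `make_opener_via_proxy` are assumptions for illustration; the hostnames are the placeholders from the list above.

```python
import urllib.request

# Assumption: Y accepts proxy traffic on port 8080 (no port is given above).
PROXY_Y = "http://proxy.example.org:8080"


def make_opener_via_proxy(proxy_url: str) -> urllib.request.OpenerDirector:
    """Build an opener on X that sends plain-HTTP requests through Y."""
    # The mapping says: for the "http" scheme, talk to this proxy instead of
    # connecting to the target site directly.
    proxy_handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(proxy_handler)
```

Calling `make_opener_via_proxy(PROXY_Y).open("http://www.example.net/")` would then ask Y to fetch Z on X's behalf, provided a proxy is actually listening at that address. The point is that the configuration lives on the client side.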
Reasons why X would want to use a forward proxy server:
Here is a (very) partial list of uses of a forward proxy server.
1) X is unable to access Z directly because
a) Someone with administrative authority over X's internet connection has decided to block all access to site Z.
Examples:
- The Storm Worm virus is spreading by tricking people into visiting familypostcards2008.com, so the system administrator has blocked access to the site to prevent users from inadvertently infecting themselves.
- Employees at a large company have been wasting too much time on myspace.com, so management wants access blocked during business hours.
- A local elementary school disallows internet access to the playboy.com web site.
- A government is unable to control the publishing of news, so it controls access to news instead, by blocking sites such as wikipedia.org. See TOR or FreeNet.
b) The administrator of Z has blocked X.
Examples:
- The administrator of Z has noticed hacking attempts coming from X, so the administrator has decided to block X's IP address (and/or netrange).
- Z is a forum web site. X is spamming the forum. Z blocks X.
REVERSE proxy
A tale of 3 computers (part II)
For this example, I will list three computers connected to the internet.
- X = your computer, or "client" computer on the internet
- Y = the reverse proxy web site, proxy.example.com
- Z = the web site you want to visit, www.example.net
Normally, one would connect directly from X --> Z.
However, in some scenarios, it is better for the administrator of Z to restrict or disallow direct access, and force visitors to go through Y first.
So, as before, we have data being retrieved by Y --> Z on behalf of X, which chains as follows: X --> Y --> Z.
What is different this time compared to a "forward proxy" is that the user X does not know he is accessing Y.
A reverse proxy is typically less visible than a "forward proxy" and requires no configuration or special knowledge by the client, X.
The client X probably thinks he is visiting Z directly (X --> Z), but the reality is that Y is the invisible go-between (X --> Y --> Z again).
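As a rough sketch of what Y does in this arrangement: X sends an ordinary request naming Z's public hostname, and Y quietly maps it to wherever the real server lives. The private address `10.0.0.5:8080`, the `ROUTES` table, and the function name `upstream_url` below are all invented for illustration; real reverse proxies express the same idea in their own configuration syntax.

```python
from typing import Dict, Optional

# Assumption: the private origin address is made up for this sketch.
ROUTES: Dict[str, str] = {
    "www.example.net": "http://10.0.0.5:8080",
}


def upstream_url(host_header: str, path: str,
                 routes: Dict[str, str] = ROUTES) -> Optional[str]:
    """Return the internal URL Y would fetch for a request, or None if unknown."""
    # Normalise the Host header: drop an optional port and ignore case.
    host = host_header.split(":", 1)[0].lower()
    origin = routes.get(host)
    if origin is None:
        return None  # Y does not front this site
    return origin + path


if __name__ == "__main__":
    # X believes it is asking Z for /index.html; Y decides where that really goes.
    print(upstream_url("www.example.net", "/index.html"))
```

Nothing here requires any setup on X, which is the practical difference from the forward proxy case.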
Reasons why Z would want to set up a reverse proxy server:
- 1) Z wants to force all traffic to its web site to pass through Y first.
  - a) Z has a large web site that millions of people want to see, but a single web server cannot handle all the traffic. So Z sets up many servers, and puts a reverse proxy on the internet that will send users to the server closest to them when they try to visit Z. This is part of how the Content Distribution Network (CDN) concept works.
    - Examples:
      - Apple Trailers uses Akamai
      - jQuery.com hosts its JavaScript files using the CloudFront CDN.
      - etc.
  - b) The administrator of Z is worried about retaliation for content hosted on the server, and does not want to expose the main server directly to the public.
    - Example:
      - Owners of spam brands such as "Canadian Pharmacy" appear to have thousands of servers, while in reality having most websites hosted on far fewer servers. Additionally, abuse complaints about the spam will only shut down the public servers, not the main server.
In the above scenarios, Z has the ability to choose Y.
Links to topics from the post:
Content Delivery Network
- Lists of CDNs
forward proxy software (server side)
- cgi-proxy
- phproxy (discontinued)
- glype
- Internet censorship wiki: List of Web Proxies
reverse proxy software for HTTP (server side)
- Apache mod_proxy
- squid
- nginx (written by Russians, used on hulu.com, spam sites, etc.)
- HAProxy
- lighttpd
- perlbal (written for LiveJournal)
- pound
- varnish cache (written by a FreeBSD kernel guru)
reverse proxy software for TCP (server side)
- balance
- delegate
- pen
- pure load balancer (web site defunct)
- python director
Here's an example of a reverse proxy (as a load balancer).
A client surfs to website.com, and the server it hits has a reverse proxy running on it. The reverse proxy happens to be 'pound' (look it up on Google). Pound takes the request and sends it to one of the three application servers sitting behind it. In this example Pound is a load balancer, i.e., it is balancing load between three application servers. The application servers serve up the website content back to the client.
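The "sends it to one of the three application servers" step can be sketched in a few lines of Python. This assumes a simple round-robin policy, which is only one possible way to spread load and is not a claim about how Pound itself chooses a backend; the server addresses and the class name `RoundRobinBalancer` are hypothetical.

```python
import itertools
from typing import Iterable

# Assumption: three hypothetical application servers behind the reverse proxy.
APP_SERVERS = [
    "http://10.0.0.11:8000",
    "http://10.0.0.12:8000",
    "http://10.0.0.13:8000",
]


class RoundRobinBalancer:
    """Hand out backends in a fixed rotation, one per incoming request."""

    def __init__(self, backends: Iterable[str]) -> None:
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backends)

    def pick(self) -> str:
        """Return the backend that should handle the next request."""
        return next(self._cycle)


if __name__ == "__main__":
    balancer = RoundRobinBalancer(APP_SERVERS)
    for request_number in range(1, 5):
        print(request_number, "->", balancer.pick())
```

The fourth request wraps around to the first server again. A production balancer would also track which backends are healthy, which this sketch leaves out.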