How do services like TinyURL or Metamark work?
Do they simply associate the tiny URL key with a [virtual?] web page that merely provides an "HTTP redirect" to the original URL, or is there more "magic" to it?

[original wording] I often use URL shortening services like TinyURL, Metamark, and others, but every time I do, I wonder how these services work. Do they create a new file that will redirect to another page or do they use subdomains?

+22  A: 

No, they don't use files. When you click on a link like that, an HTTP request is sent to their server with the full URL, like http://bit.ly/duSk8wK (not a real one). They read the path part (here duSk8wK), which maps to a record in their database. In that record they find a description (sometimes), your name (sometimes) and the real URL. Then they issue a redirect, which is an HTTP 302 response with the target URL in the `Location` header.
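To make the lookup concrete, here is a minimal sketch in Python. The `LINKS` dictionary stands in for the service's database, and `resolve`, the example target URL and the 404 fallback for unknown keys are my own assumptions for illustration, not how any particular service is implemented.

```python
from urllib.parse import urlsplit

# Hypothetical in-memory stand-in for the service's database:
# short key -> original URL.
LINKS = {"duSk8wK": "http://example.com/some/long/page"}


def resolve(short_url, links=LINKS):
    """Return (status, headers) for a request to a shortened URL."""
    # The path part of http://bit.ly/duSk8wK is "/duSk8wK".
    key = urlsplit(short_url).path.lstrip("/")
    target = links.get(key)
    if target is None:
        # Unknown key: nothing to redirect to (assumed behaviour).
        return 404, {}
    # Temporary redirect with the real URL in the Location header.
    return 302, {"Location": target}
```

Calling `resolve("http://bit.ly/duSk8wK")` gives the status code and the header a server would send back; no HTML page is involved at any point.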

This direct redirect is important. If you were to use files, or first load HTML and then redirect, the browser would add TinyURL to the history, which is not what you want. Also, the site that is redirected to will see the referrer (the site you originally came from) as being the site the TinyURL link is on (e.g., twitter.com, or your own site, wherever the link is). This is just as important, so that site owners can see where people are coming from. This, too, would not work if a page were loaded that then redirects.

PS: there are more types of redirect. HTTP 301 means a permanent redirect. If that were used, the browser could cache the redirect and stop requesting the bit.ly or TinyURL site, and those sites want to count the hits. That's why HTTP 302 is used, which is a temporary redirect. The browser will ask TinyURL.com or bit.ly again each time, which makes it possible to count the hits for you (some tiny URL services offer this).

Abel
Very nice and very explained. Thanks :-)
Nathan Campos
Considering it's just a map, could you shed a little light on the lifetime of each shortened URL?
Kaushik Gopal
Actually, I think bit.ly uses HTTP 301 instead of 302 (last I heard).
KennyCason
Since bit.ly won't let you change where one of their URLs points to, 301 makes sense. No need to remember the bit.ly version and recheck it.
Joost Schuur
@KennyCason / @Joost Schuur: it is indeed HTTP 301 that is used, but with a timestamp. This turns it into a `Moved`, not a `Moved Permanently`. This is a subtle difference. With the timestamp added, the browser should check whether the resource has changed once this timeout is reached. Others, like is.gd, use a normal `301 Moved Permanently` and the browser doesn't need to re-check (but often will). Finally, services like url4.eu do not redirect at all, but show you an advertisement first. With the 301, the services can still count *unique visitors*, but not all hits.
Abel
+7  A: 

Others have answered how the redirects work, but you should also know how they generate their tiny URLs. You'll mistakenly hear that they create a hash of the URL in order to generate the unique code for the shortened URL. This is incorrect in most cases; they aren't using a hashing algorithm (where you could potentially have collisions).

Most of the popular URL shortening services simply take the ID in the database of the URL and then convert it to either Base 36 [a-z0-9] (case insensitive) or Base 62 [a-zA-Z0-9] (case sensitive).
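As a rough sketch of that conversion, assuming the digits come first in the alphabet (0-9, then a-z, then A-Z for Base 62), it could look like this in Python. The names `BASE36`, `BASE62` and `encode` are made up for illustration, and a real service may order its alphabet differently.

```python
import string

# Assumed alphabets: digits first, then lowercase, then uppercase.
BASE36 = string.digits + string.ascii_lowercase
BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode(number, alphabet=BASE36):
    """Convert a non-negative database ID into a short code."""
    if number < 0:
        raise ValueError("ID must be non-negative")
    if number == 0:
        return alphabet[0]
    base = len(alphabet)
    digits = []
    while number:
        # Peel off the least significant digit each round.
        number, remainder = divmod(number, base)
        digits.append(alphabet[remainder])
    return "".join(reversed(digits))
```

With this alphabet, `encode(20103)` yields `fif`, which matches the ID and code used in the example further down.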

A simplified example of a TinyURL Database Table:

ID       URL                           VisitCount
 1       www.google.com                        26
 2       www.stackoverflow.com               2048
 3       www.reddit.com                        64
...
 20103   www.digg.com                         201
 20104   www.4chan.com                         20
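If you want to play with a table like this, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The table name `urls`, the column names and the helper `add_url` are my own choices for illustration; the point is only that the database hands out the next ID, and that ID is what later gets encoded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE urls ("
    " id INTEGER PRIMARY KEY,"  # assigned by the database
    " url TEXT NOT NULL,"
    " visit_count INTEGER NOT NULL DEFAULT 0)"
)
# A few of the rows from the table above.
conn.executemany(
    "INSERT INTO urls (id, url, visit_count) VALUES (?, ?, ?)",
    [
        (1, "www.google.com", 26),
        (2, "www.stackoverflow.com", 2048),
        (3, "www.reddit.com", 64),
        (20103, "www.digg.com", 201),
        (20104, "www.4chan.com", 20),
    ],
)


def add_url(conn, url):
    """Store a new long URL and return the ID the database assigned."""
    cursor = conn.execute("INSERT INTO urls (url) VALUES (?)", (url,))
    return cursor.lastrowid
```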

Web frameworks that allow flexible routing make handling the incoming URLs really easy (Ruby on Rails, ASP.NET MVC, etc.).

So, on your webserver you might have a route action that looks like (pseudo code):

Route: www.mytinyurl.com/{UrlID}
Route Action: RouteURL(UrlID);

This routes any incoming request that has any text after your domain, www.mytinyurl.com, to the associated method, RouteURL, and supplies the text after the forward slash in the URL to that method.

So, let's say you requested: www.mytinyurl.com/fif

"fif" would then be passed to your method, RouteURL(String UrlID). RouteURL would then convert "fif" from Base 36 to its base 10 equivalent, 20103, and a database lookup would be made to find whatever URL is stored under the ID 20103 (in this case, www.digg.com). You would also increase the visit count for Digg by one before redirecting to that URL.

This is a really simplified example but you should be able to get the general idea.
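For the curious, a minimal Python sketch of what RouteURL could do is below. It assumes Base 36 codes, uses a plain dictionary (`URLS`) instead of a real database, and returns a (status, headers) pair instead of talking to a web framework; the name `route_url`, the 404 for unknown codes and the added `http://` scheme are all assumptions of mine.

```python
# Hypothetical stand-in for the database table: ID -> row.
URLS = {
    20103: {"url": "www.digg.com", "visits": 201},
    20104: {"url": "www.4chan.com", "visits": 20},
}


def route_url(url_id, table=URLS):
    """Decode a Base 36 code, count the visit, and build a redirect."""
    try:
        # int() with base 36 accepts the digits 0-9 and letters a-z.
        row_id = int(url_id, 36)
    except ValueError:
        return 404, {}
    row = table.get(row_id)
    if row is None:
        return 404, {}
    # Count the hit before sending the visitor on.
    row["visits"] += 1
    # The stored URLs have no scheme, so one is added here (assumption).
    return 302, {"Location": "http://" + row["url"]}
```

`route_url("fif")` decodes to 20103, bumps the Digg counter and returns a 302 pointing at www.digg.com.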

Jericho
Good additions, esp. the part on the encoding.
Abel
Really, it's that easy. I hate when I don't think of the simple things.
Nathan Feger
+3  A: 

I know this was more of a general question, but the source code to tr.im, my favorite URL shortening service, is on GitHub if you want to take a look.

Jorge Israel Peña
I will check! ;-)
Nathan Campos