I'm writing a URL shortener similar to tinyurl, and I'm wondering how to keep track of URLs that have already been shortened by my service. For example, tinyurl generates the same tiny URL for the same long URL regardless of who creates it. How can this be achieved in a scalable way? Bitly does something similar, though it generates a new short URL per person. However, it is still able to track the aggregate (total number of) clicks for the long URL. How?

Thanks,

+1  A: 

They store the URLs in their database, associated with the short URL(s). How else would it be done?
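
To make that concrete, here is a minimal sketch of one way such a schema could look. It is not how tinyurl or Bitly actually implement it. SQLite stands in for the production database, and the table names (long_urls, short_urls), the owner column, and the helper functions are all hypothetical. The UNIQUE constraint on the long URL gives an index, so checking whether a URL is already shortened is an indexed lookup. Per-user short URLs all point at the same long_urls row, so aggregate clicks are a SUM over them.

```python
import sqlite3
import string

ALPHABET = string.digits + string.ascii_letters  # 62 symbols for short codes


def encode(n: int) -> str:
    """Turn a non-negative row id into a base-62 short code."""
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))


def decode(code: str) -> int:
    """Inverse of encode(); raises ValueError on characters outside ALPHABET."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS long_urls (
            id  INTEGER PRIMARY KEY,
            url TEXT NOT NULL UNIQUE      -- UNIQUE also creates the lookup index
        );
        CREATE TABLE IF NOT EXISTS short_urls (
            id      INTEGER PRIMARY KEY,
            long_id INTEGER NOT NULL REFERENCES long_urls(id),
            owner   TEXT,                 -- NULL means a shared, tinyurl-style link
            clicks  INTEGER NOT NULL DEFAULT 0
        );
        CREATE INDEX IF NOT EXISTS short_by_long ON short_urls(long_id, owner);
    """)
    return db


def shorten(db: sqlite3.Connection, url: str, owner=None) -> str:
    """Return the existing short code for (url, owner), creating one if needed."""
    db.execute("INSERT OR IGNORE INTO long_urls(url) VALUES (?)", (url,))
    long_id = db.execute(
        "SELECT id FROM long_urls WHERE url = ?", (url,)).fetchone()[0]
    # "IS" is SQLite's NULL-safe equality, so owner=None matches shared links.
    row = db.execute(
        "SELECT id FROM short_urls WHERE long_id = ? AND owner IS ?",
        (long_id, owner)).fetchone()
    if row is None:
        cur = db.execute(
            "INSERT INTO short_urls(long_id, owner) VALUES (?, ?)",
            (long_id, owner))
        return encode(cur.lastrowid)
    return encode(row[0])


def resolve(db: sqlite3.Connection, code: str):
    """Look up the long URL for a short code and count the click."""
    short_id = decode(code)
    row = db.execute(
        "SELECT l.url FROM short_urls s JOIN long_urls l ON l.id = s.long_id "
        "WHERE s.id = ?", (short_id,)).fetchone()
    if row is None:
        return None
    db.execute("UPDATE short_urls SET clicks = clicks + 1 WHERE id = ?",
               (short_id,))
    return row[0]


def total_clicks(db: sqlite3.Connection, url: str) -> int:
    """Aggregate clicks across every short URL that points at this long URL."""
    return db.execute(
        "SELECT COALESCE(SUM(s.clicks), 0) FROM short_urls s "
        "JOIN long_urls l ON l.id = s.long_id WHERE l.url = ?",
        (url,)).fetchone()[0]
```

Calling shorten(db, url) twice with no owner returns the same code (the tinyurl behaviour), while shorten(db, url, owner="alice") yields a separate code that still counts toward total_clicks(db, url) (the Bitly behaviour described in the question).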

ceejayoz
Well then, how do you look up the long URL efficiently if we assume there are over a million URLs in the DB? The more URLs there are, the slower the lookup.
java_pill
java_pill, it seems like the long URL lookup is exactly as efficient as the short URL lookup you have to do every time someone clicks on a link. They're 1-to-1, right? If that's not efficient enough, your system won't work anyway.
bmb
java_pill, also "over a million" is *not* a number that scares most DBAs. Modern DBs with good indexing can handle much more.
bmb
@java_pill A million URLs is tiny. Bit.ly shortened 2.1 billion URLs in November, says TechCrunch - http://www.techcrunch.com/2009/12/14/bit-ly-pro-google-suck-it/ - and it's all entirely within the capability of a (properly, professionally scaled) database. Facebook, with its hundreds of millions of users and billions of connections, also runs off a SQL database.
ceejayoz