views:

472

answers:

3

I have heard the 'shard' technique mentioned several times with regard to solving scaling problems for large websites. What is this 'shard' technique and why is it so good?

+9  A: 

Karl Seguin has a good blog post about sharding.

From the post:

Sharding is the separation of your data across multiple servers. How you separate your data is up to you, but generally it’s done on some fundamental identifier.

Oded
+2  A: 

In brief, imagine seperating your users_tbl across several servers. So Users 1-5000 and on Server 1, Users 5000-10000 on Server 2; etc. If your data model is sufficiently abstract in code, it's often not a huge change in code.

Of course this approach becomes difficult if all your queries are similar to "SELECT COUNT(*) FROM users_tbl GROUP BY userType" but when your where is "WHERE userid = 5" then it makes more sense.

Tom Ritter
+2  A: 

As 'sharding' is part of the architecture principles for large websites, you may be interested in listening to 'eBay's Architecture Principles with Randy Shoup' here.

Spear