Does it use any of the standard ones like Oracle, DB2, or SQL Server, or does it have something of its own?

Considering the type of data (text + images + videos) that they have to manage, it would be interesting to know how they deal with it.

Is this information publicly available? Any links would also be helpful.

+1  A: 

They use Cassandra and MySQL. See here about Cassandra: http://www.facebook.com/note.php?note_id=24413138919

SQLMenace
Yep. This appears to confirm the use of MySQL: http://www.facebook.com/MySQLatFacebook and Facebook *wrote* Cassandra.
Frank Farmer
+1  A: 

MySQL

Source: http://www.datacenterknowledge.com/archives/2008/04/23/facebook-now-running-10000-web-servers/

They may have migrated since, but I doubt it.

glowcoder
They migrated some of their stuff to Cassandra.
SQLMenace
Truth be told, I'd find it hard to believe they have only one database type! Do you know how much has been migrated? The article (from '08) says they have 1800 MySQL databases. Surely that has grown with their userbase, which I'd guess puts them at roughly 4k by now.
glowcoder
If you see my answer, they also utilize Hadoop for some things as well.
Thomas Owens
+2  A: 

They use Apache Cassandra for some of their storage (document database), and heavy use of memcached to make it scale well.
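
To illustrate how a cache such as memcached can take read load off a database, here is a minimal sketch of the common cache-aside pattern. It is not Facebook's code: the class name `CacheAsideStore` is hypothetical, and plain Python dicts stand in for both memcached and the database so the example runs on its own.

```python
class CacheAsideStore:
    """Cache-aside reads in front of a slower backing store (illustrative only)."""

    def __init__(self, database):
        self.database = database  # dict standing in for the database (assumption)
        self.cache = {}           # dict standing in for memcached (assumption)
        self.db_reads = 0         # counts how often the backing store is hit

    def get(self, key):
        # Serve from the cache when possible.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: read from the backing store and remember the result.
        self.db_reads += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def put(self, key, value):
        # Write to the backing store, then invalidate the cached copy
        # so the next read fetches the fresh value.
        self.database[key] = value
        self.cache.pop(key, None)


store = CacheAsideStore({"user:1": {"name": "Alice"}})
first = store.get("user:1")   # miss: goes to the backing store
second = store.get("user:1")  # hit: served from the cache
```

Repeated reads of the same key only touch the backing store once, which is the effect that makes this combination scale for read-heavy workloads.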

Mikael Svenson
They run primarily on MySQL. Cassandra powers only a small percentage of their databases.
mattbasta
Seems reasonable, and that's why I said they use Cassandra for some of the storage, not all. MySQL + memcached is popular indeed.
Mikael Svenson
+1  A: 

According to Wikipedia's Hadoop page and the PoweredBy page on the Hadoop site, they use Hadoop. However, the text on the Hadoop page reads:

We use Hadoop to store copies of internal log and dimension data sources and use it as a source for reporting/analytics and machine learning.

That makes me think that their user profiles are not stored in Hadoop.

Thomas Owens
A: 

I'm fairly sure they used to use MySQL; however, they now use some sort of NoSQL database for heavier transactions. The number of transactions Facebook has to handle is sometimes too much for a relational database. You see, relational databases must adhere to the ACID principles (atomicity, consistency, isolation, durability). It is costly to maintain ACID on a large scale. NoSQL variants don't adhere to as strict a set of rules as relational databases do.
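
As a rough illustration of what the atomicity part of ACID buys you, here is a small sketch using Python's built-in sqlite3 module as a stand-in relational database. The `accounts` table and the `transfer` and `balances` helpers are made up for the example and have nothing to do with Facebook's schema.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move amount between two rows atomically; return True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Raising here undoes both UPDATEs together.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn):
    """Return {id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()
```

Either both updates are applied or neither is. Keeping that guarantee across many machines is the part that gets expensive at scale.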

Polaris878
+1  A: 

I've discussed this extensively with some sysops from Facebook in the past.

Facebook primarily uses MySQL for structured data storage. For instance, wall posts, user information, etc. are all stored in MySQL. They replicate this between their various data centers.

For blob storage (photos, video, etc.), Facebook makes use of a custom solution that involves a CDN (fbcdn) externally and NFS internally.

For a few kinds of document storage and write-heavy applications (such as inbox search), Cassandra is used. Contrary to popular belief, Cassandra is NOT the primary database at Facebook. In fact, it isn't anywhere close to being the primary database platform; it is used for very specific scenarios where the NoSQL paradigm fits best.

Hope this helps

EDIT:

I should also note that this is by no means the full extent of technologies that FB uses, but it does represent the vast majority of storage that they take advantage of.

mattbasta
+3  A: 

It should be no surprise that a site as high-scale as Facebook uses a variety of data management technologies. Each database product has its strengths, and Facebook needs all of them.

They have also changed their data management from time to time, as they find solutions that meet their needs.

According to "Exploring the software behind Facebook, the world’s largest site" (2010/6/18), they use:

  • MySQL
  • Memcached
  • Haystack for photo retrieval
  • Cassandra
  • Hadoop and Hive
  • Scribe for high-speed distributed logging
Bill Karwin
+1  A: 

If you are interested in what technologies Facebook uses, follow their engineering "blog": http://www.facebook.com/Engineering

There is lots of good stuff in there.

Brent Baisley