Friday, May 09, 2008

Scaling Facebook's databases

Impressive numbers from Facebook on their architecture: 1,800 MySQL servers (900 master/slave pairs) holding a heavily partitioned data set, all managed by just two DBAs.
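The talk did not go into the details of the partitioning, but the simplest reading of "heavily partitioned" is sharding by some key, probably user id (as a commenter below also guesses), across the 900 master/slave pairs. A minimal sketch of that idea; the shard count comes from the numbers above, but the modulo scheme and host names are made up for illustration:

    # Minimal illustration of sharding by user id across master/slave pairs.
    # The shard count matches the 900 pairs mentioned above; the host naming
    # and the modulo scheme are made up for illustration, not Facebook's actual setup.

    NUM_SHARDS = 900  # one shard per master/slave pair

    def shard_for_user(user_id):
        """Map a user id to the shard (master/slave pair) holding its data."""
        return user_id % NUM_SHARDS

    def master_host(shard):
        return "db%03d-master.example.com" % shard

    def slave_host(shard):
        return "db%03d-slave.example.com" % shard

    # Writes go to the shard's master; reads can be served from its slave.
    uid = 31337
    shard = shard_for_user(uid)
    print("user %d -> shard %d (write to %s, read from %s)"
          % (uid, shard, master_host(shard), slave_host(shard)))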

Failures are handled by promoting a slave to master and then quickly bringing up a new slave. It all must be heavily scripted and automated to be managed by just two people. Very nice.
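The scripts themselves were not described, but, just as a sketch, a promotion using standard MySQL replication commands might look something like this (host names, credentials, and the recovery policy here are assumptions, not Facebook's actual tooling):

    # Rough sketch of a scripted slave-to-master promotion using standard
    # MySQL replication commands; assumptions for illustration only.
    import MySQLdb

    def promote_slave(host, user, passwd):
        """Make the surviving slave writable and detach it from the dead master."""
        conn = MySQLdb.connect(host=host, user=user, passwd=passwd)
        cur = conn.cursor()
        cur.execute("STOP SLAVE")                # stop applying the old master's binlog
        cur.execute("RESET SLAVE")               # forget the old master entirely
        cur.execute("SET GLOBAL read_only = 0")  # start accepting writes
        conn.close()

    def attach_new_slave(slave_host, master_host, user, passwd):
        """Point a freshly provisioned slave at the newly promoted master."""
        conn = MySQLdb.connect(host=slave_host, user=user, passwd=passwd)
        cur = conn.cursor()
        cur.execute(
            "CHANGE MASTER TO MASTER_HOST=%s, MASTER_USER=%s, MASTER_PASSWORD=%s",
            (master_host, user, passwd))
        cur.execute("START SLAVE")
        conn.close()

The hard part, presumably, is everything around this: detecting the failure, making sure the slave has caught up on the replication log, and repointing the application at the new master, which is where all that scripting and automation must come in.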

Please see also the garbled but tolerable video from the panel, "Scaling MySQL -- Up or Out?", from the 2008 MySQL Conference where these numbers were disclosed.

I have to say, YouTube looked pretty bad sitting up there on the panel refusing to provide any numbers while Facebook was being so forthcoming. But Cuong Do Cuong's talk last year, "YouTube Scalability", at the Google Scalability Conference has some nice details on YouTube's architecture and lessons learned, though no specific numbers.

Please see also comments on the numbers from James Hamilton and Om Malik.

4 comments:

Amit said...

Hi Greg,

In case you or others have not seen it, there is a video presentation by Facebook's data team on next.yahoo, "Big Data: Viewpoints from the Facebook Data Team":
http://tinyurl.com/6hvfkf


Quite interesting, including how they evolved to Hadoop and are building on top of it.

Regards
Amit

Greg Linden said...

Thanks, Amit. That link seems to be broken, but here is a link that seems to work:

http://next.yahoo.net/archives/79/big-data-viewpoints-from-the-facebook-data-team

Anonymous said...

Very interesting that Facebook has that many MySQL instances. They must store content there as well as user information. I guess they partition the data by user id. I wonder how they render the dynamic content, e.g., if a person changes her status, does that get propagated to her friends in real time, or is it delayed until the next batch processing run?

Anonymous said...

http://next.yahoo.net/archives/79/big-data-viewpoints-from-the-facebook-data-team

is a brilliant talk.