How Facebook moved 30 petabytes of Hadoop data

Derrick Harris | GigaOM | July 27, 2011

For anyone who didn’t know, Facebook is a huge Hadoop user, and it does some very cool things to stretch the open source big-data platform to meet Facebook’s unique needs. Today, it shared the latest of those innovations — moving its whopping 30-petabyte cluster from one data center to another.

Facebook’s Paul Yang detailed the process on the Facebook Engineering page. The move was necessary because Facebook had run out of both power and space to expand the cluster (very likely the largest in the world) and had to find it a new home. Yang writes that there were two options, physically migrating the machines or replicating the data to a new cluster, and Facebook chose replication because it minimized downtime.

Once it made that decision, Facebook’s data team undertook a multi-step process to copy over the data, working to ensure that any file changes made during the copy were accounted for before the new system went live. Perhaps not surprisingly, the sheer size of Facebook’s cluster created problems:...
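That two-phase pattern, a long bulk copy while the source stays live followed by a short sync of whatever changed, is easier to see in code. The sketch below only illustrates the general idea on ordinary local directories; it is not Facebook's replication system and not HDFS-specific. The directory names, function names, and the use of file modification times to detect changes are all assumptions made for the example.

```python
import shutil
from pathlib import Path

def snapshot(root: Path) -> dict[str, float]:
    """Record each file's modification time, keyed by path relative to the root."""
    return {
        str(p.relative_to(root)): p.stat().st_mtime
        for p in root.rglob("*") if p.is_file()
    }

def copy_file(src_root: Path, dst_root: Path, rel: str) -> None:
    """Copy one file, creating parent directories on the destination as needed."""
    dst = dst_root / rel
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src_root / rel, dst)

def bulk_copy(src_root: Path, dst_root: Path) -> dict[str, float]:
    """Phase 1: copy everything while the source stays live; return what was seen."""
    seen = snapshot(src_root)
    for rel in seen:
        copy_file(src_root, dst_root, rel)
    return seen

def delta_sync(src_root: Path, dst_root: Path, seen: dict[str, float]) -> list[str]:
    """Phase 2, run during the brief cutover window: re-copy anything that is new
    or was modified after the bulk copy recorded it."""
    changed = [
        rel for rel, mtime in snapshot(src_root).items()
        if rel not in seen or mtime > seen[rel]
    ]
    for rel in changed:
        copy_file(src_root, dst_root, rel)
    return changed

if __name__ == "__main__":
    # Placeholder paths standing in for the old and new clusters.
    src, dst = Path("old_cluster"), Path("new_cluster")
    seen = bulk_copy(src, dst)             # long-running copy; source still accepting writes
    changed = delta_sync(src, dst, seen)   # short final pass before switching over
    print(f"bulk-copied {len(seen)} files, re-synced {len(changed)} changed files")
```

The reason for the second pass is that the source only has to stop taking writes for the duration of the final delta sync, not for the entire bulk copy, which at this scale would take far longer.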