It’s Facebook. When the social networking giant becomes a public company this Friday, May 18, IT departments should look past the awe of its market capitalization and look instead at how the company implemented big data.
Wikipedia defines big data as “data sets that grow so large and complex that they become awkward to work with using on-hand database management tools.”
Facebook entered social networking at a time when MySpace and Friendster were the dominant players. How did the company add hundreds of millions of users each year while its competitors’ networks choked under heavy data loads?
Facebook users recently began spending more minutes on mobile devices than on the desktop, and the data load on Facebook’s servers just keeps growing.
Facebook initially managed big data analytics with Hadoop/Hive, a complicated system that could not meet real-time goals, so the company built a real-time analytics pipeline. As of September 2011, the site handled 200,000 events per second. The interaction users will know best is the “Like,” which is central to Facebook’s analytics. Facebook stores these events in memory, which acts as the short-term primary data store; long-term data is stored on disk.
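The pattern described above, buffering high-rate events in memory and periodically persisting aggregates to disk for long-term storage, can be sketched minimally. This is an illustrative toy, not Facebook's actual system; the class and file names are hypothetical:

```python
import json
import os
import tempfile
from collections import Counter

class EventAggregator:
    """Counts events (e.g. "like" clicks) in memory and flushes totals to disk.

    Memory acts as the short-term primary store; the JSON file on disk
    plays the role of the long-term store. (Hypothetical sketch.)
    """

    def __init__(self, path, flush_every=100_000):
        self.path = path
        self.flush_every = flush_every
        self.counts = Counter()   # in-memory short-term store
        self.pending = 0

    def record(self, event_type):
        self.counts[event_type] += 1
        self.pending += 1
        if self.pending >= self.flush_every:
            self.flush()

    def flush(self):
        # Merge in-memory counts into the on-disk totals (long-term store).
        totals = Counter()
        if os.path.exists(self.path):
            with open(self.path) as f:
                totals.update(json.load(f))
        totals.update(self.counts)
        with open(self.path, "w") as f:
            json.dump(dict(totals), f)
        self.counts.clear()
        self.pending = 0

# Usage: record a burst of "like" events, then flush remaining counts.
path = os.path.join(tempfile.gettempdir(), "event_totals.json")
if os.path.exists(path):
    os.remove(path)
agg = EventAggregator(path, flush_every=1000)
for _ in range(2500):
    agg.record("like")
agg.flush()
```

At real-world scale the in-memory layer would be distributed across many servers and flushes would be incremental rather than whole-file rewrites, but the division of labor, fast memory for fresh events and durable disk for history, is the same.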
A simplistic model for a complicated system.
While Facebook’s valuation will be the main focus in the days and weeks ahead, the company’s real contribution to technology, real-time analytics of big data, deserves special attention. Its role in monetizing consumer “likes” for advertisers will take a back seat for now, but the infrastructure to do so is already in place.