Open-source software has historically had a hard time competing with its big-budget proprietary rivals, but as Bob Gourley explains in an interview with Federal Computer Week, “Hadoop is totally unique in many ways.”
There is currently no proprietary match for Hadoop in the realm of Big Data analysis across clusters of commodity computers. Microsoft recently came to the same conclusion when it abandoned work on its Hadoop alternative to focus on integrating the open-source software with its server and cloud platforms. Hadoop is already competing with, and beating, commercial software at solving Big Data problems. It is critical to private companies like Twitter, Facebook, and Yahoo, as well as to federal agencies such as the NSA, which adopted Hadoop two years ago to store, share, and analyze massive amounts of unstructured data.
The NSA is using Hadoop to do what it had previously thought impossible, and it is not alone within government. As the Government Big Data Solutions Award highlighted at this year’s Hadoop World, numerous innovators and agencies are leveraging the open-source software to provide taxpayers with better and more agile service. The winner of the award, the General Services Administration’s USASearch, uses Hadoop and Hive to go from utilizing only the most essential data to storing and analyzing anything that might be useful for its 550 government agency clients.
But as with all open-source software, Hadoop comes with some challenges. Assembling a Hadoop stack, then installing, configuring, and using the software on your own, is a hands-on, complicated process. Users must also program their own applications to perform analysis. That’s why companies such as Cloudera offer distributions of Apache Hadoop, support, and applications to facilitate enterprise-grade deployments.