Hortonworks Leads a Fast and Growing Herd of Hadoop
Hadoop, the big-data technology, has transformed businesses’ ability to cost-effectively store and process large volumes of data for analysis. Numerous companies have invested in supporting Hadoop, and some produce commercial versions of the open source technology. At last year’s Hadoop Summit, Hortonworks had just started to establish itself as one of these providers. Now, at the 2012 Hadoop Summit, with a new CEO, Rob Bearden, a new head of marketing, John Kreisa, and other hires, it is moving fast to build on its Hadoop momentum.
Hadoop is one of the leading big-data technologies, and according to our benchmark research on the topic, almost one-third of organizations plan to use it, challenging approaches such as in-memory databases and data warehouse appliances. Hortonworks addresses the growing demand to analyze large volumes of transactions, which has been the Achilles’ heel of the business analytics industry. Significant preparation has been required to develop predefined views and cubes for high-end analytic processing, because older tools were not architected to handle such large volumes of data. Even today, businesses routinely use old flat files in larger-scale environments like predictive analytics; our research shows flat files are the second-most common technology in use.
Hortonworks has released Hortonworks Data Platform (HDP) 1.0, which is built on Hadoop 1.0 (the 0.20.205 code line), a proven release. It includes many core components: HDFS for storage; MapReduce for distributed processing; HBase as a nonrelational database; Pig for scripting; Hive for query; Oozie for workflow and scheduling; Ambari and ZooKeeper for management and monitoring; and HCatalog, WebHDFS and Sqoop for data integration, along with Talend Open Studio for Big Data. The partnership with Talend was announced earlier this year.
Hortonworks provides a 100-percent open source data platform closely aligned with the core Apache code line. On top of Apache’s Hadoop code, Hortonworks has built enterprise capabilities, including a simple installation and configuration process and better cluster deployment. These help reduce the staffing and training required to use Hadoop, which are obstacles for more than three-quarters of organizations. The Hortonworks Management Center helps keep track of Hadoop performance through system dashboards. On the high-availability side, Hortonworks supports intelligent failover and restart of the NameNode, MapReduce and core operating system daemons. Hortonworks wraps this technology with its technical support and access to expert resources to help organizations get to full deployment. Its cluster-level subscription supports developers and production deployments. The company is now working on commercialization of Hadoop 2.0, which advances the technology in the areas of reliability, performance and scalability.
One Hortonworks feature I like is the use of Apache HCatalog, a centralized metadata service that links table definitions to the underlying data. It helps provide consistent metadata, allowing applications to share data as tables in and out of HDFS that can be processed for any level of operational or analytical need. Competitor Teradata Aster has also announced support for HCatalog and has created SQL-H, as I recently analyzed, which can help in querying and retrieving data from Hadoop.
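To make the idea of a shared metadata service more concrete, here is a minimal sketch in Python. It is not the HCatalog API; the class names, the table name web_logs, the path and the columns are all hypothetical, chosen only to illustrate how two different tools might resolve one table definition from a central catalog instead of each hard-coding its own copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableDef:
    """Hypothetical table definition: name, storage location and column schema."""
    name: str
    location: str
    columns: tuple  # (column_name, type_name) pairs


class ToyCatalog:
    """Toy in-memory stand-in for a shared metadata service (NOT the HCatalog API)."""

    def __init__(self):
        self._tables = {}

    def register(self, table):
        # Store the definition once, keyed by table name.
        self._tables[table.name] = table

    def describe(self, name):
        # Every caller gets the same definition back.
        return self._tables[name]


catalog = ToyCatalog()
catalog.register(TableDef(
    name="web_logs",             # made-up table name
    location="/data/web_logs",   # made-up storage path
    columns=(("ts", "string"), ("url", "string"), ("bytes", "int")),
))


def scripting_tool_columns(cat):
    """A stand-in for a scripting tool that needs the column names."""
    return [col for col, _ in cat.describe("web_logs").columns]


def query_tool_select(cat):
    """A stand-in for a query tool that builds a SELECT from the same definition."""
    table = cat.describe("web_logs")
    cols = ", ".join(col for col, _ in table.columns)
    return f"SELECT {cols} FROM {table.name}"
```

Because both stand-in tools read from the same catalog entry, a change to the registered schema would be visible to each of them without editing either one, which is the consistency benefit described above.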
The recent Hadoop Summit brought together dozens of technology providers that support the Hadoop ecosystem with applications, integration, infrastructure and systems management. The top benefit of big-data technologies like Hadoop is their ability to retain and analyze more data. To achieve this, one of Hortonworks’ partners, Datameer, has built direct analytics support for analysts that does not require prestaging of the data, and it introduced its new business infographics at the Summit. Another partner, Splunk, has made its software easier to use for searching and retrieving machine data with Hadoop, as I have already assessed. Such partnerships add value to these efforts and provide businesses with new insights.
Hortonworks is now a key ecosystem provider for Hadoop. It has become a significant challenger to competitor Cloudera, which had a head start. Now it will be a race to see who can build the largest community of developers and technology partners to advance adoption. Hortonworks has even partnered with Microsoft as Microsoft works to enter the Hadoop and big-data technology market. Now Hortonworks needs to gain production deployments and take Hadoop beyond the early-adopter big-data environments into mainstream information management efforts in organizations across the world.
If you are considering what is possible with Hadoop and how it can support your enterprise needs, take a look at Hortonworks.
Mark Smith – CEO & Chief Research Officer