While Hadoop has been the poster child for big data, the open source platform’s future looked bleak in Gartner’s recent Hadoop Adoption Study, where the analyst firm stated that demand “looks fairly anemic over at least the next 24 months”. Many people latched on to that statement and headlines portrayed adoption as declining.
In reality, the market is not only healthy, but accelerating. Although not in the headline, the Gartner report also uncovered that nearly half of respondents either have adopted, are in the process of adopting, or are planning to invest in Hadoop.
This, along with the fact that Gartner, which specialises in advising traditional enterprise IT organisations, is paying such close attention to Hadoop, is a positive indicator of adoption and is actually quite impressive for such a young technology. It signals that it is becoming mainstream in the enterprise IT landscape.
Perhaps the most direct signal of Hadoop adoption is found in the financial results of Hortonworks, the only publicly traded Hadoop distributor. In its most recent earnings report, the company beat revenue estimates by more than 30%. In Q2 it grew total revenue by 154% and subscription revenue by 178%, and added 119 customers – up from 105 in Q1.
The company has also tracked how quickly new customers expand their clusters. In Q2 the average annual net expansion rate was more than 100%, meaning Hortonworks customers are, on average, more than doubling their cluster footprint each year. Meanwhile, a Barclays report estimated that Hortonworks is on a path to be the fastest company ever to reach $100 million in revenue.
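To make the metric concrete: a net expansion rate is simply year-over-year growth expressed as a percentage, so a rate above 100% means the footprint has more than doubled. The sketch below is a hypothetical illustration of the arithmetic, not Hortonworks' reporting methodology, and the node counts are invented:

```python
def net_expansion_rate(prior_year: float, current_year: float) -> float:
    """Annual net expansion rate as a percentage.

    A rate of exactly 100% means the footprint doubled over the year;
    anything above 100% means it more than doubled.
    """
    if prior_year <= 0:
        raise ValueError("prior-year footprint must be positive")
    return (current_year - prior_year) / prior_year * 100.0

# Hypothetical customer: a cluster growing from 20 to 45 nodes in a year
rate = net_expansion_rate(20, 45)  # 125.0 – more than doubled
```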
When you combine the Gartner data and Hortonworks financial information it’s obvious that Hadoop is growing quickly – and much, if not most, of that growth is in the enterprise.
While that growth is impressive, there are also inhibitors. Hadoop is still a young technology – it is changing quickly and tackling the intimidating problem of how to store, process, query and manage data at grand scale.
Not surprisingly, Hadoop isn’t devoid of challenges. The Gartner report highlights several – the top two being obtaining the right skills and determining how to get value from the platform.
Companies often get started by simply moving their data into Hadoop, creating what the industry has taken to calling a data lake. These initiatives don’t always have a clear path or plan for how the data will be used, once it is in Hadoop, to create value.
The conventional wisdom is that with all this data available in one place, good things will happen. While an important first step, a data lake is not enough for predictable and sustainable value creation.
To create real value, organisations need to do something with the data and that will require leveraging not just Hadoop’s storage capabilities, but also its processing capabilities, at scale.
There are two basic scenarios that can deliver real business value by leveraging Hadoop storage and processing capabilities.
The first scenario is best described as operational efficiency, and involves the recreation or migration of an existing process to Hadoop to achieve cost savings, improved flexibility and generally better performance.
Data warehouse and mainframe offloading is a prime example. By offloading to Hadoop some of the workloads that are expensive and likely inefficient in traditional repositories, customers can defer upgrades, reduce costs and gain flexibility for future analytics.
This is a low-risk way to start using Hadoop, with a measurable ROI and known requirements. In addition, the offload can be phased, with each phase delivering value.
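Because each offload phase has known costs and known savings, the ROI case reduces to a simple payback calculation. The sketch below is a hypothetical illustration, not Syncsort's or Hortonworks' methodology, and all cost figures are invented:

```python
import math

def payback_months(migration_cost: float, monthly_savings: float) -> int:
    """Months until cumulative savings cover the one-off migration cost
    of a single offload phase."""
    if monthly_savings <= 0:
        raise ValueError("offload must produce positive monthly savings")
    return math.ceil(migration_cost / monthly_savings)

# Hypothetical phase: $120k to migrate an ETL workload off the warehouse,
# saving $15k/month in warehouse capacity and licence costs
payback_months(120_000, 15_000)  # 8 months to break even
```

Phasing matters because each phase starts paying for itself independently, rather than the whole programme waiting on one large migration to complete.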
The second scenario is business transformation, where organisations can identify a use case that meets a specific business objective that was otherwise unattainable. ComScore, for example, is using Hadoop to monetise its unique data streams by providing analytics on mobile advertising campaigns, which requires specific business insights and an ability to source and process the required data.
When successful, these projects can create game-changing value for companies, but they tend to be larger, more complicated and longer-term in nature.
Still, these initiatives represent the real potential of the platform and form the basis for both the market hype and the aggressive adoption rates.
A thoughtful sequencing of Hadoop initiatives can often determine whether the platform gains momentum or gets stuck in pilot mode. Many of the most successful customers start with an operational efficiency scenario to achieve an initial payback and to build skills and a core infrastructure. They can then move quickly to business-transformation use cases with less risk and at a low marginal cost.
Now for the other challenge cited by Gartner: a lack of skills. Many companies claim they can’t implement Hadoop due to a lack of talent – a result of the purported skills gap.
What perhaps has gone unnoticed here is that with Hadoop, big data professionals don’t need to continually develop new skills – there are a variety of tools available that make it easy to leverage existing skillsets. These tools are improving rapidly and cover multiple domains.
There are some who will look to the platform itself to solve the skills gap. While Hadoop is evolving rapidly, it is unlikely that the platform alone will close all the usability gaps. The broad partner ecosystem created around the major distributions is recognition of this fact.
If there is a skill concern, look for add-on tooling that can help address the gap. Hadoop is indeed a platform and as a result there are billions being invested in tools and solutions to make it more digestible.
With more companies looking to make the most of their Hadoop investments, and with the technology evolving rapidly to meet changing market needs, the platform is sure to see continued success and adoption.
Despite how its survey findings were portrayed, Gartner has echoed Hadoop’s foreseeable future success, noting it as “healthy, and growing [with] an enormous amount of upside adoption potential”.
As Hadoop continues to make its way further into the mainstream, companies that leverage the storage and compute power of the platform, for operational efficiency and transformational use cases, and leverage add-on tooling appropriately, will be best positioned to generate real value.
Those that set their sights no higher than a data lake, or that ignore the broader ecosystem, will move more slowly and struggle to tap into the real power of the platform.
Sourced from Josh Rogers, president, Syncsort