Mapping Session results: Hadoop platforms
March 30, 2012. Posted by David Card in Uncategorized.
Last week at GigaOM’s Structure:Data event, we hosted an invitation-only session for 40+ GigaOM Pro clients, where we mapped out the near-term outlook for the market for Hadoop platforms. This Mapping Session acted as a test bed for a new research approach we are developing at Pro. We wanted to tap the collective intelligence of thought leaders in our membership and within our analyst ranks to better assess which market forces will be the most critical over the next 12 to 24 months, and we also wanted to examine which companies are in the best position to exploit them. The results of our Mapping Session included the following observations:
- Lacking apps. Participants identified a potentially powerful market force that prior GigaOM Pro analysis had perhaps underplayed: the availability of applications for Hadoop that would convince senior business and IT executives of the platform’s ROI.
- Integration. Participants reaffirmed that integration was the other most important market trend in the sector.
- Early leaders. Participants rated Cloudera and MapR as somewhat better positioned to ride the trends, especially integration, than IBM or Hortonworks. None of the four companies has any current advantage in applications availability, so that is a huge opportunity.
Figure 1 illustrates our GigaOM Pro analysis of what we call the Hadoop platform market disruption vectors — that is, the six competitive areas in which companies can drive or leverage market share gains and revenue growth.
Mapping Session attendees confirmed our analysis that integration with existing databases, data warehouses, business analytics and business visualization tools will be the most important vector for Hadoop platform suppliers. Data analysis and business intelligence won’t be useful if they are locked away in a Hadoop silo separate from existing processes and data running on legacy IT infrastructure. Successful Hadoop platforms will provide connectors that bring Hadoop data into traditional data warehouses as well as pull existing data out of a data warehouse into a Hadoop cluster for analysis.
In the near term, those platforms available as appliances that integrate Apache code with servers and storage have a distinct advantage in deployment. Likewise, companies that build SQL interfaces to Hadoop will gain ground in access if their rivals fail to produce familiar business-analysis user interfaces or rely on IT departments to code directly to MapReduce or in Pig.
We don’t think code optimization for reliability and performance, tools or services in support of traditional data security and compliance, or especially total cost of ownership are nearly as critical for differentiation as the other vectors. At least not until the market matures a bit.
Mappers rate applications as powerful vector
We presented our disruption vector analysis during the Mapping Session, and the room’s consensus was that integration remained the most powerful market force. But while discussing external forces and the costs of Hadoop implementation, we identified a parallel, equally important vector.
Participants decried the industry’s focus on infrastructure optimization and cost minimization at the expense of proving business value. Overall ROI should trump total cost of ownership, ran the argument, but it’s difficult to prove Hadoop platform ROI right now. That’s partly due to a dearth of packaged applications aimed at specific business functions, which, in turn, is because transactional apps are largely easier to package than analytics apps. Think about the early days of relational database deployment in contrast with the apps-driven ERP and CRM wave. The market for Hadoop platforms today is much more like the former.
That line of thinking produced a revised set of disruption vectors, as illustrated below in Figure 2. Applications availability will drive ROI analysis by business and IT management, and it will potentially prove the benefits of Hadoop investment over competing technology investments. That is, if such applications emerge. Mapping Session participants rated applications availability on par with integration as a trend that companies could lead or follow to success.
Our mappers continued to rate deployment as a critical market force, with some added nuance in terms of deployment and support options. As we wrote in our “A near-term outlook for big data” report, “the best Hadoop platforms from a deployment perspective will give customers the choice of running on premise or in the cloud and as software only or in a purpose-built appliance.” Participants agreed that access or accessibility through business intelligence interfaces was a key disruption vector, but it, along with security, was slightly demoted in terms of relative importance against the top three.
The participants deemed performance worth its own measure, but we continue to believe security and performance will be differentiators beyond the 24-month horizon. For example, as big data analysis becomes more oriented to real-time decision making, the batch-oriented Hadoop will face competitive threats from alternative technologies. That will increase the importance of performance as a competitive disruption vector.
Early advantage from integration; apps to come
The next step in the Mapping Session process was to identify and analyze a handful of Hadoop platform suppliers for their strategic and tactical alignment with the disruption vectors. Participants scored the companies on a 1 to 5 scale of relative competitiveness for each vector, and when we applied the vector importance weighting, the results produced the chart below in Figure 3. The distance from the center indicates each company’s relative position versus the others. A high score in a less critical vector is less important than an average score in the most critical ones.
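For readers who want to see how importance weighting changes a ranking, here is a minimal Python sketch. It assumes a plain weighted average, which may not be the aggregation GigaOM Pro actually used. The weights, vendor names and scores are hypothetical placeholders, not session results.

```python
from typing import Dict

def weighted_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return the weighted average of per-vector scores (1 to 5 scale)."""
    total_weight = sum(weights.values())
    return sum(scores[vector] * weight for vector, weight in weights.items()) / total_weight

# Hypothetical importance weights: higher means a more critical vector.
weights = {
    "integration": 3.0,
    "applications": 3.0,
    "deployment": 2.0,
    "access": 1.0,
    "security": 1.0,
    "performance": 1.0,
}

# Hypothetical vendor with a top score in a less critical vector (performance)
# but a weak score in a critical one (integration).
vendor_a = {"integration": 1, "applications": 3, "deployment": 3,
            "access": 3, "security": 3, "performance": 5}

# Hypothetical vendor with an average score in the critical vector
# and a weak score in the less critical one.
vendor_b = {"integration": 3, "applications": 3, "deployment": 3,
            "access": 3, "security": 3, "performance": 1}

score_a = weighted_score(vendor_a, weights)
score_b = weighted_score(vendor_b, weights)
print(f"Vendor A: {score_a:.2f}  Vendor B: {score_b:.2f}")
```

Under these placeholder numbers, the vendor with an average integration score outranks the one with a top performance score, which is the effect the weighting is meant to capture.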
Mapping Session participants rated Cloudera and MapR as better positioned to take advantage of market forces in the near term than IBM or Hortonworks. Cloudera and MapR both did better than average on integration, partly due to OEM deals with Oracle (Cloudera) and EMC (MapR). IBM professional services seem to focus on integrating its Hadoop offering with its own installed technologies, according to panelists. On the other hand, IBM got relatively high marks for accessibility, as its BigSheets data analytics package is something of “an Excel for Hadoop.” Hortonworks’ best score came from its deployability, which matched Cloudera’s and MapR’s in the eyes of the participants.
All the companies received equally low marks from the mappers in the other key disruption vector of applications availability. But panelists were optimistic that the landscape could change dramatically in a short time. Eight months from now, a few critical third-party apps focused on industries like health care or functions like internal search could completely rewrite the Hadoop platform script.
Continue the discussion
That’s why the GigaOM Pro analysis of the Hadoop platform market is a process. We welcome your input and feedback. Continue the discussion by leaving a comment below.
Also, note that this Mapping Session is an input to GigaOM Pro’s research process. Look for a Sector RoadMap report on Hadoop platforms that will crystallize our take on the market in the near future.
Mapping session panelists
Jo Maitland, Research Director, GigaOM Pro
George Gilbert, GigaOM Pro analyst and Partner at TechAlpha
Julie Lockner of Enterprise Strategy Group
Carl Brooks of Tier1 Research, a division of 451 Research
Feature image courtesy of Flickr user sarahemcc