Making Hadoop More Consumable

Now that we are past the buzz of Big Data, we are starting to see consistent use cases emerge, and more and more customers are moving into production with Big Data projects. After hundreds of customer engagements, IBM has identified five key Big Data use cases. These use cases, which include fraud, deeper understanding of the customer, new data exploration and data warehouse augmentation, have emerged as some of the more popular ways customers are leveraging Big Data.

Real use cases, real value. So why isn’t everyone installing Hadoop? Well, there are also some real challenges. First, Hadoop skills are in short supply. Many clients are having to train their own staff and adapt to the new technology, and this has slowed things down quite a bit. Second, the open source software, which continues to improve, is simply not as complete a solution as what we are used to with data warehousing and analytics. The task of actually assembling servers and loading software seemed like a challenge we had already conquered, but here we find ourselves again making integration a delivery milestone. In addition, once you get the systems all racked, stacked and loaded, managing them presents another challenge. And then there are security and high availability, all those enterprise-class characteristics we have come to love in our appliances.

All this before we even get to actually leverage the data and see how this technology can provide some real value.

IBM has been thinking about these challenges and how we can help ease the implementation pain. InfoSphere BigInsights now provides three ways for customers to leverage Hadoop. There is a free download available on the web so you can take it for a test drive. If you also want to learn something about Hadoop, be sure to check out Big Data University, a free resource to help you understand and leverage the technology.

If you are looking for something more enterprise class, there is an enterprise edition of BigInsights, which includes enhanced security, visualization, built-in analytics and a development platform. This week IBM announced BigInsights version 2.1, which now includes GPFS for additional performance and high availability. Also announced was Big SQL, an ANSI SQL interface to Hadoop that makes Hadoop a lot more consumable for the many clients with deep SQL skills.

Also announced this week is one more way to make Hadoop more consumable: PureData System for Hadoop. This system is the latest in the PureSystems lineup and takes direct aim at simplifying Hadoop for the enterprise. The new PureData System will leverage the enterprise edition of BigInsights and directly address the need for a more complete solution for customers wishing to leverage Hadoop.

PureData for Hadoop, like the other PureSystems, is an expert integrated system that helps accelerate time to value, making the initial install and implementation as much as 8 times faster. Because it’s integrated by design, the system is up and running in hours, not weeks, and management is simplified with a single console view for both hardware and software.

PureData for Hadoop leverages the value-added features of BigInsights, and it can also help accelerate time to insight with built-in visualization and analytic accelerators for social data, machine data and text analytics.

Most important, PureData for Hadoop is designed to enable the use cases we see emerging as key for Hadoop. For example, in data warehouse augmentation, many customers are leveraging Hadoop to offload the data warehouse or to serve as a queryable archive. This can bring significant cost savings for cold data or compliance data, for example. PureData for Hadoop has built-in archiving tools that simplify the movement of data from the data warehouse to Hadoop.
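To make the "cold data" idea concrete, here is a minimal sketch of how rows might be chosen for offload. It is not the PureData archiving tooling and uses no IBM API. The function name `split_hot_cold`, the `last_accessed` field, the sample rows and the 365-day threshold are all assumptions made purely for illustration.

```python
from datetime import date, timedelta


def split_hot_cold(rows, today, cold_after_days=365):
    """Partition rows into (hot, cold) lists by last-access date.

    Hypothetical helper: rows not accessed within `cold_after_days`
    are treated as candidates for archiving to Hadoop.
    """
    cutoff = today - timedelta(days=cold_after_days)
    hot, cold = [], []
    for row in rows:
        # Rows last touched before the cutoff are considered cold.
        (cold if row["last_accessed"] < cutoff else hot).append(row)
    return hot, cold


# Illustrative sample data, not taken from any real warehouse.
rows = [
    {"order_id": 1, "last_accessed": date(2013, 3, 1)},
    {"order_id": 2, "last_accessed": date(2010, 6, 15)},
    {"order_id": 3, "last_accessed": date(2011, 1, 10)},
]

hot, cold = split_hot_cold(rows, today=date(2013, 4, 1))
print("stay in warehouse:", [r["order_id"] for r in hot])
print("archive to Hadoop:", [r["order_id"] for r in cold])
```

In practice the definition of "cold" would come from your own retention and compliance policies rather than a single age threshold.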

And because the system can be installed and leveraged so quickly, it also makes a great environment for new data exploration where previously untapped data can now be used to enhance existing analytics or discover new capabilities.

PureData System for Hadoop is architected for high availability and offers integration with best-in-class Guardium software for enterprise-class security.

IBM is meeting customers with Hadoop solutions no matter what phase of the project they are starting from, all designed to deliver more value, accelerate time to value and meet the challenges of consuming this technology in the enterprise.

Editor’s Note:

If you are interested in learning more about the recent announcement, don’t miss the following events:

  • Video Chat: Big Data Management After Launch Chat on April 5th at 2pm ET
  • Twitter Chat: Exploring the Enterprise Challenges of Leveraging Hadoop, use #bigdatamgmt on April 10th at 12pm ET
  • Webinar: Big Data at the Speed of Business on April 30th at 10am ET and 9pm ET

Nancy Kopp-Hensley

About Nancy Kopp-Hensley

Nancy Kopp-Hensley has been in the data warehousing and BI industry for over 19 years. Nancy worked in the early days of enterprise data warehousing and executive reporting. In 2004 Nancy led the team that brought the first IBM data warehouse appliance to market. From the field, Nancy moved into the development organization, focusing on data warehouse solutions and database technology. Today Nancy works in Product Management and Strategy in the Netezza organization on the overall IBM data warehouse product and business strategy.