IBM PureData – IBM PureFlex on steroids for big data workloads


A couple of months after presenting the IBM PureFlex and IBM PureApplication systems, IBM has now unveiled a new member of the IBM PureSystems family, named the IBM PureData System. Let's take a look at what IBM has around the corner.

Big data

It is no mystery at this point that the volume of data is increasing at an insane pace. Besides volume, other inherent characteristics of big data, such as velocity, variety and veracity, have become more important in the last few years, which has made handling this type of data workload more complex than ever before.

Current data-intensive workloads can be categorized in 2 main groups:

Transactional Databases (OLTP & Batch)

– Optimized memory cache for focused queries & updates

– Very low latency

– Shared everything architecture

Analytic Data Warehouses

– Parallel processing of broad scope queries

– Very high I/O bandwidth

– Shared nothing architecture
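To make the "shared nothing" idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration of the general scatter-gather pattern, not how PureData or Netezza is actually implemented: the function names, the sample rows and the hash-based distribution are all assumptions made up for this example. Each "node" owns a disjoint slice of the rows, aggregates only its own slice, and a coordinator merges the small partial results.

```python
from collections import defaultdict


def partition_rows(rows, num_nodes, key):
    """Hash-distribute rows so that each node owns a disjoint slice (hypothetical helper)."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[hash(row[key]) % num_nodes].append(row)
    return nodes


def local_sum(node_rows, group_key, value_key):
    """Aggregate one node's rows without touching any other node's data."""
    totals = defaultdict(int)
    for row in node_rows:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


def merge_partials(partials):
    """Coordinator step: combine the per-node partial sums."""
    merged = defaultdict(int)
    for partial in partials:
        for group, total in partial.items():
            merged[group] += total
    return dict(merged)


def scatter_gather_sum(rows, num_nodes, dist_key, group_key, value_key):
    """Run a broad-scope SUM ... GROUP BY across all nodes and merge the results."""
    nodes = partition_rows(rows, num_nodes, dist_key)
    partials = [local_sum(node, group_key, value_key) for node in nodes]
    return merge_partials(partials)


if __name__ == "__main__":
    # Made-up sample data, purely for illustration.
    sales = [
        {"id": 1, "region": "north", "amount": 10},
        {"id": 2, "region": "south", "amount": 5},
        {"id": 3, "region": "north", "amount": 7},
        {"id": 4, "region": "east", "amount": 3},
    ]
    print(scatter_gather_sum(sales, 3, "id", "region", "amount"))
```

In a real warehouse the local aggregation would run in parallel on separate machines; the loop above runs the nodes one after another just to keep the sketch short.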

These types of workloads demand a new type of system capable of reducing complexity, accelerating time to value and improving IT economics, while accelerating the transformation to cloud computing.

IBM PureData

The fundamental objective of the IBM PureData System is to significantly reduce the time, money and effort spent deploying and running data systems.

Previous IBM PureSystems were focused on the Infrastructure as a Service (PureFlex) and Platform as a Service (PureApplication) cloud categories. The new IBM PureData System is IBM's response for the Data as a Service cloud category, while maintaining the core characteristics of previous IBM PureSystems.

For managing the different workloads of transactional databases or analytics warehouses, IBM PureData will come in 2 flavors:

IBM PureData for Transactions

These systems are designed for managing the workload of a transactional data system.

Some of its key features are:

– Highly reliable and scalable databases deployed in minutes (one step) using patterns of expertise.

– Supports all existing DB2 applications with no changes required

– Supports Oracle Database applications with minimal change required

– Easy, non-disruptive upgrade to larger systems

– Self-balancing/tuning/optimizing/monitoring/healing (it won't make coffee though, maybe in future releases).

The PureData System for Transactions comes in 3 standard configurations (the systems can be upgraded to the next size without requiring downtime).

IBM PureData for Analytics

In the analytics department there will be 3 sub-categories:

1. For Analytics

These systems are optimized exclusively for handling data workloads.

– Powered by Netezza technology

– Up to 10-100x faster than traditional systems

– Data load ready in < 4 hours

– No database indexes or storage administration

– Peta-scale user data capacity

– Runs advanced algorithms in minutes (instead of days)

– Integrated hardware acceleration

2. For operational analytics

These systems are optimized for a mix of interactive and analytic queries.

– Up to 80% of queries are interactive look-ups

– Up to 1000s of concurrent queries / second

– Data load ready in hours

– Up to 50x faster maintenance with integrated system support

– Available in 5 size configurations, with capacity up to Petabytes

– Big storage savings due to adaptive data compression

– Dynamic self-tuning

– Automated “multi-temperature” data management

3. For Hadoop analytics

These are ready-to-use Hadoop systems for big data analytics.

– Up to 30% faster processing with Adaptive MapReduce

– Complete Hadoop system deployed in hours

– Up to 216TB in a single rack

– Integrated analytics to accelerate insights from Big Data

– Built-in security and high availability


IBM claims that these systems beat the competition in performance, availability, cost, time to value, customer satisfaction and technology leadership, which is a very big statement indeed.

Since the PureApplication System is designed and optimized to run a mix of application and database servers, and the PureData System is designed exclusively for data (no applications run on this system), they will make a beautiful couple.

Of course we'll have to wait until we can get our hands on one of these boxes and see them performing in real-life situations before we can determine if they live up to IBM's expectations.

If so, I believe some very interesting challenges await us.


Hernan Scavetta

About Hernan Scavetta

Hernan Scavetta is a Senior IT Specialist with 15 years of experience in the design, deployment and support of multi-platform IT infrastructure services. During his career he has worked for large companies in the Retail, Energy, Telecommunications and Healthcare sectors. He holds VMware and Microsoft certifications. Additionally, he has been teaching VMware installation and configuration courses since VMware VI3 at the Service Desk Institute. Currently, he is a VMware Subject Matter Expert for one of the top accounts of the GDCA. You can contact Hernan at