A couple of months after presenting the IBM PureFlex and IBM PureApplication systems, IBM has unveiled a new member of the IBM PureSystems family: the IBM PureData System. Let's take a look at what IBM has around the corner.
It is no mystery at this point that the volume of data is growing at an insane pace. Besides volume, other inherent characteristics of big data, such as velocity, variety and veracity, have become more important in the last few years, making these data workloads more complex to handle than ever before.
Current data-intensive workloads can be categorized in 2 main groups:
Transactional Databases (OLTP & Batch)
- Optimized memory cache for focused queries & updates
- Very low latency
- Shared everything architecture
Analytic Data Warehouses
- Parallel processing of broad scope queries
- Very high I/O bandwidth
- Shared nothing architecture
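To make the contrast between the two groups more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the sample rows, the three "nodes" and every function name are hypothetical, and nothing here reflects how IBM actually implements either kind of system. The analytic query scans disjoint partitions in parallel and merges partial results (shared nothing), while the transactional lookup is a single keyed read from an in-memory cache.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample data: (order_id, region, amount)
ROWS = [
    (1, "EU", 120.0),
    (2, "US", 80.0),
    (3, "EU", 45.5),
    (4, "ASIA", 200.0),
    (5, "US", 19.5),
    (6, "EU", 10.0),
]

NUM_NODES = 3  # assumed number of shared-nothing nodes


def partition(rows, num_nodes):
    """Split rows by order_id so that each node owns a disjoint slice."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[row[0] % num_nodes].append(row)
    return nodes


def local_totals(node_rows):
    """A node scans only its own rows and returns partial sums per region."""
    totals = {}
    for _, region, amount in node_rows:
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def analytic_query(nodes):
    """Broad-scope query: all nodes work in parallel, then partials are merged."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(local_totals, nodes))
    merged = {}
    for partial in partials:
        for region, amount in partial.items():
            merged[region] = merged.get(region, 0.0) + amount
    return merged


def transactional_lookup(cache, order_id):
    """Focused query: one keyed read from an in-memory cache."""
    return cache.get(order_id)


nodes = partition(ROWS, NUM_NODES)
cache = {row[0]: row for row in ROWS}

print(analytic_query(nodes))           # total amount per region
print(transactional_lookup(cache, 4))  # the single row for order 4
```

The analytic path touches every row, so its cost is dominated by how fast the nodes can read their own data; the transactional path touches one row, so its cost is dominated by latency. That difference is why the two groups end up on different architectures.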
Workloads of this type demand a new kind of system, one capable of reducing complexity, accelerating time to value and improving IT economics, while also accelerating the transformation to cloud computing.
The fundamental objective of the IBM PureData System is to significantly reduce the time, money and effort spent deploying and running data systems.
Previous IBM PureSystems focused on the Infrastructure as a Service (PureFlex) and Platform as a Service (PureApplication) cloud categories. The new IBM PureData System is IBM's response to the Data as a Service cloud category, while maintaining the core characteristics of previous IBM PureSystems:
These systems are designed for managing the workload of a transactional data system.
Some of its key features are:
- Highly reliable and scalable databases deployed in minutes (one step) using patterns of expertise.
- Supports all existing DB2 applications with no changes required
- Supports Oracle Database applications with minimal change required
- Easy, non-disruptive upgrade to larger systems
- Self-balancing/tuning/optimizing/monitoring/healing (it won't make coffee though; maybe in future releases).
The PureData System for Transactions comes in 3 standard configurations (the systems can be upgraded to the next size without requiring downtime):
In the analytics department there will be 3 sub-categories:
1. For analytics
These systems are optimized exclusively for handling data workloads.
- Powered by Netezza technology (Learn more about the Netezza technology)
- Up to 10-100x faster than traditional systems
- Data load ready in < 4 hours
- No database indexes or storage administration
- Peta-scale user data capacity
- Runs advanced algorithms in minutes (instead of days)
- Integrated hardware acceleration
2. For operational analytics
These systems are optimized for a mix of interactive and analytic queries.
- Up to 80% of queries are interactive look-ups
- Up to 1000s of concurrent queries / second
- Data load ready in hours
- Up to 50x faster maintenance with integrated system support
- Available in 5 size configurations, with capacity up to Petabytes
- Big storage savings due to adaptive data compression
- Dynamic self-tuning
- Automated “multi-temperature” data management
3. For Hadoop analytics
These are ready-to-use Hadoop systems for big data analytics.
- Up to 30% faster processing with Adaptive MapReduce
- Complete Hadoop system deployed in hours
- Up to 216TB in a single rack
- Integrated analytics to accelerate insights from Big Data
- Built-in security and high availability
IBM claims that these systems beat the competition in performance, availability, cost, time to value, customer satisfaction and technology leadership, which is a very big statement indeed.
Since PureApplication System is designed and optimized to run a mix of application and database servers, and PureData System is designed exclusively for data (no applications run on this system), they will make a beautiful couple. Take this example:
If so, I believe some very interesting challenges lie ahead of us.