Looking behind the curtain: How virtual applications really work

In my last post (“The circle of life”), I discussed how IBM PureApplication System manages the complete lifecycle of virtual applications. At one point in that entry, I started to get a bit ahead of myself by asking the questions:

“So how does IBM PureApplication System do all of this?  How can it possibly understand what is necessary to facilitate all of the various application components and then monitor, manage, and scale the application throughout the complete application lifecycle?”

I’ve demonstrated virtual application patterns many times to many customers. They are universally amazed at the simplicity and quick time-to-value that these patterns of expertise offer. Getting a complete system up and running in a few minutes, one that is optimized, highly available, and able to scale elastically based upon demand, is a pretty impressive sight! After the initial amazement sinks in, questions like the two mentioned above start to emerge as customers consider what this might mean for their very own applications. It all seems a bit too magical and they want to know at a deeper level how this works. After all, if they are going to trust their applications (essentially their business) to this system, they need to know a little more about how it accomplishes it all before they give over the control that they have so carefully managed up to this point.

So, in this post I’d like to pull the curtain aside and uncover the magic a bit to explain how virtual applications really work – at least at a high level.  It is not my intent to provide a “how-to” guide for creating virtual application pattern types – many people are simply interested in a basic understanding of how it works.  If you are interested in creating your own content, this post should give you a good overview but you will need to reference other sources (including our Plug-in Development Kit) to dig deeper.  Therefore, my goal is to provide an overview of the intricate dance between IBM PureApplication System and the plug-ins that contain much of the expertise.  I hope to remove the mystery and assure you that while this appears mysterious – it is really just a way to simplify the process and solidify the expertise into an easy-to-use, consistent, responsive and seemingly magical solution.

So what is a virtual application pattern again?

Before I get into the specifics of the how, it is probably best to give a brief recap of the what — simply to get everybody on the same page regarding virtual application patterns.  A virtual application pattern is one of the “patterns of expertise” in IBM PureApplication System.  It is a radically simplified way to define and manage an application.  The definition is in terms that make sense to the application owner rather than a systems architect.  With virtual applications, you don’t have to build the infrastructure from the ground up – allocating hardware, provisioning the operating system and software, configuring the network, installing monitoring agents, and all the other steps that were previously required to create the necessary environment to host your application.  You simply specify your application components with functional and non-functional requirements of the application (basically your service level agreements).   IBM PureApplication System takes care of the details to launch, configure, provide high availability, monitor, manage, and scale your application to meet changing demand – all in an integrated and simplified common user interface.

Impressive – how does it do all of that?

So how does IBM PureApplication System do all of this?  Is all of this intelligence built into IBM PureApplication System itself?  In a manner of speaking, it both is and isn’t.  IBM PureApplication System provides the structure and orchestrates the solution.  But the detailed intelligence for each specific middleware component is included within the virtual application pattern type installed in the system – and the pattern type is composed of a collection of plug-ins that really do the heavy lifting.  Working together, IBM PureApplication System and the pattern type plug-ins provide the necessary intelligence to host a virtual application.  In this way, IBM PureApplication System can provide expert capabilities for a wide variety of applications and these capabilities can be easily extended and enhanced with new expertise.

So what are pattern types and plug-ins?

The next logical questions are: “What exactly are pattern types and plug-ins?  What do they do and how do they do it?”  As I’m sure you already deduced, the pattern types and the plug-ins do quite a bit.  They encapsulate the expertise so that the users who are building patterns from these elements don’t need to know or manage all the details.

A pattern type represents a versioned collection of middleware components that have been tested and verified to work together to deliver a particular solution. For example, the web application pattern type includes components and functionality from several different products that have been optimized to work together for online transaction processing web applications. The actual logic and metadata to support the various elements are packaged in plug-ins, so a simple way to view a pattern type is as a collection of plug-ins that are guaranteed to work well together for a specific type of application.

A plug-in is the basic packaging unit that encapsulates all of the binaries, model transformation logic, node parts and parts (more on this later), and metadata to support some aspect of a virtual application. Plug-ins can encapsulate a lot of information. For example, any visual elements that are presented to a user in the virtual application builder when creating a virtual application pattern are provided by metadata and icons included in one or more plug-ins. The intelligence to convert these elements, defined in terms of the application, into middleware configurations in a topology model is provided by transformation elements included in plug-ins, including such details as required bit architectures, memory, and storage. The binary images necessary to install and run the middleware components can also be packaged in plug-ins. Finally, the logic necessary to react to changes in the configuration or dependencies is also included in plug-ins. Basically, the plug-in includes the detailed definition and logic necessary to take a virtual application element from concept to reality in the cloud. IBM PureApplication System is then the higher-level intelligence (the orchestrator and integrator, if you will) that controls the integration, visualization, virtualization, execution, and lifecycle management of all the intelligence provided by the plug-ins and pattern types.

Pattern types and plug-ins are delivered as compressed .tar files and installed in the system using the web-based user interface, the REST API, or the command-line API provided in IBM PureApplication System.

A little more about the pattern type…

A pattern type archive includes a patterntype.json file that defines basic information about the pattern type such as name, version, description, prerequisites, and license information.  The archive can also include a license folder with license documentation in various locales and a locale folder to store translated strings in the form of message.json files for use by the plug-in elements. The archive can optionally include plug-ins and the associated files used by the plug-ins.

A little more about the plug-in…

A plug-in archive must include a config.json file that defines the details of the plug-in and reference to the content.  The content of the plug-in in the archive includes any number of additional files such as images, .json files for various purposes, locale files, transformation logic, OSGi bundles, and scripts.

That’s the high-level explanation of pattern types and plug-ins from a packaging perspective – but that doesn’t quite remove the mystery, does it?  So, let’s break down the capabilities provided in a plug-in a little further.

Definition of a plug-in: config.json file

Each plug-in includes at least one mandatory file, the config.json file.  This file is the basic definition of the plug-in.  Along with standard identifying information such as name and version, the definition also includes the primary and secondary pattern types where this plug-in will appear.  Although it is true that pattern types can optionally bundle a set of plug-ins with the pattern type archive – the specification of the relationship between the plug-in and the pattern type is from the perspective of the plug-in definition.  Having the relationship specified from the plug-in facilitates the ability to extend pattern types with new capabilities and share plug-ins between multiple pattern types.

The config.json file also includes information about the various packages and parameters that are part of the plug-in. Packages group the files that the plug-in delivers and can specify information about the required environment, including OS, bit-level, and memory. The file packages are specified with part and nodepart elements. Parts are typically used to install, configure, and start middleware or other software. Nodeparts are also software elements but are generally used to install, configure, and start system-level software or lay down common scripts that can be used by parts (for example, scripts used in configuring firewall settings).
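To make this a little more concrete, here is a minimal sketch of the kind of information a config.json carries, written as a Python dictionary and serialized to JSON. The key names (patterntypes, primary, secondary, packages, requires, and so on), the archive paths, and all values are my own illustrative assumptions based only on the description above. This is not the actual schema, so please consult the Plug-in Development Kit for the real format.

```python
import json

# Hypothetical plug-in definition. Key names and values are assumptions
# for illustration only, not the documented config.json schema.
plugin_config = {
    "name": "example-plugin",
    "version": "1.0.0.0",
    "patterntypes": {
        "primary": {"example-patterntype": "1.0"},
        "secondary": [{"other-patterntype": "1.0"}],
    },
    "packages": {
        "EXAMPLE_PKG": [
            {
                # Environment this package variant needs.
                "requires": {"arch": "x86_64", "memory": 512},
                "parts": [{"part": "parts/example.tgz"}],
                "nodeparts": [{"nodepart": "nodeparts/firewall.tgz"}],
            }
        ]
    },
}


def part_archives(config):
    """Collect every part and nodepart archive named in the packages."""
    found = []
    for variants in config["packages"].values():
        for variant in variants:
            found += [p["part"] for p in variant.get("parts", [])]
            found += [n["nodepart"] for n in variant.get("nodeparts", [])]
    return found


# The text that would be written to a config.json file.
config_text = json.dumps(plugin_config, indent=2)
```

Note that the plug-in names its pattern types, not the other way around, which is what lets one plug-in be shared by several pattern types.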

Visual elements seen in the Virtual Application Builder: metadata.json file

Plug-ins can provide an appmodel/metadata.json file that specifies information used in the Virtual Application Builder such as the type of element (component, link, or policy), name, label, description, and icons elements.  Also included are attributes that can be exposed along with the attribute type, default values, constraints, and how these attributes should be rendered on the user interface.
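As a rough illustration of how attribute metadata could drive the builder, the sketch below defines one element with a single attribute and a helper that applies the default value and checks a range constraint. The key names (id, type, label, attributes, min, max) and the heapSize attribute are hypothetical, chosen for readability rather than taken from the real metadata.json format.

```python
# Hypothetical metadata entry. Key names are assumptions for illustration,
# not the documented metadata.json schema.
element = {
    "id": "exampleComponent",
    "type": "component",  # one of: component, link, policy
    "label": "Example component",
    "description": "An illustrative element",
    "attributes": [
        {"id": "heapSize", "type": "number", "default": 256,
         "min": 128, "max": 1024},
    ],
}


def resolve_attributes(element, user_values):
    """Apply defaults, then check each value against its range constraint."""
    resolved = {}
    for attr in element["attributes"]:
        value = user_values.get(attr["id"], attr["default"])
        if not attr["min"] <= value <= attr["max"]:
            raise ValueError(f"{attr['id']}={value} is out of range")
        resolved[attr["id"]] = value
    return resolved
```

Calling resolve_attributes(element, {}) falls back to the default, while a value outside the declared range is rejected before it ever reaches deployment.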

Transformation elements: Creating a topology

After a user has created a pattern in the Virtual Application Builder, the user next wants to deploy that pattern into the cloud.  The first step in this provisioning process is to convert the components, policies, and links of the virtual application pattern (along with attributes and values) into a topology representation, which will be refined and eventually provisioned to specific virtual machine instances working together in the final solution.

IBM PureApplication System works with transformers provided by the plug-ins for each element in the pattern.  The transformers provide fragments of an unresolved topology document (a JSON object document) that IBM PureApplication System assembles together into a complete topology document.  These transformers can be provided by the plug-ins as templates of the intended JSON fragment (PureApplication System embeds Apache Velocity as a template engine), Java classes, or a mixture of both. Whenever possible, using templates is preferable.  The JSON fragments include information such as packages to be installed, parameters, system requirements for memory and disk, and default values.  IBM PureApplication System assembles these fragments together to generate an unresolved topology from the virtual application model.  The next step is to convert the unresolved topology to a resolved topology.  The resolve step uses information from the config.json and knowledge of the cloud group to which this is being deployed — selecting appropriate package parts based on the cloud characteristics (such as machine and bit architecture).  At this point, the totals for CPU, memory, and disk requirements are also generated in preparation for provisioning.  The provisioning phase considers the resolved topology document, provisions any required shared-services, and uses the current state of the cloud environment to generate a final topology prior to leveraging the infrastructure as a service (IaaS) layer to deploy the images and generate a deployment document.
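The assemble and resolve steps can be pictured with a small sketch. Everything here is a simplification under my own assumptions: the fragment layout, the package names, the "arch" key, and the helper names assemble and resolve are all hypothetical, and the real topology documents carry far more detail.

```python
def assemble(fragments):
    """Combine per-element fragments into one unresolved topology."""
    return {"vm-templates": list(fragments)}


def resolve(unresolved, package_variants, cloud_arch):
    """Pick the package variant that fits the cloud and total the requirements."""
    resolved = {"vm-templates": [],
                "totals": {"cpu": 0, "memory": 0, "disk": 0}}
    for template in unresolved["vm-templates"]:
        parts = []
        for package in template["packages"]:
            # Raises StopIteration if no variant matches the architecture.
            variant = next(v for v in package_variants[package]
                           if v["arch"] == cloud_arch)
            parts.extend(variant["parts"])
        resolved["vm-templates"].append(
            {"name": template["name"], "parts": parts})
        for key in resolved["totals"]:
            resolved["totals"][key] += template[key]
    return resolved


# Hypothetical fragments, as two transformers might produce them.
fragments = [
    {"name": "web", "packages": ["APP_SERVER"],
     "cpu": 1, "memory": 1024, "disk": 10},
    {"name": "db", "packages": ["DATABASE"],
     "cpu": 2, "memory": 2048, "disk": 50},
]
# Hypothetical package variants, as a config.json might describe them.
package_variants = {
    "APP_SERVER": [{"arch": "x86_64", "parts": ["appserver-x64.tgz"]},
                   {"arch": "ppc64", "parts": ["appserver-ppc.tgz"]}],
    "DATABASE": [{"arch": "x86_64", "parts": ["database-x64.tgz"]}],
}

resolved = resolve(assemble(fragments), package_variants, "x86_64")
```

The point of the split is that transformers only describe what an element needs, while the choice of concrete parts waits until the target cloud group is known.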

Activating the virtual machine images

A common base image is used by IBM PureApplication System to provision a virtual machine based on the requirements of the topology document.  This base image includes a startup process that is the primary actor during provisioning.  Provisioning of each virtual machine proceeds first with nodeparts and then parts.  Nodeparts and parts represent various elements of middleware and system configuration that must be installed to host the various application components.  Nodeparts are processed prior to parts because — as mentioned earlier — they typically involve lower level settings or features that can be shared by parts.   Actually, to be yet a little more specific, parts are processed by a special nodepart that is the last nodepart to run — called the workload agent.

For each virtual machine, the startup process begins by downloading the topology document and getting the vm-template for this node.  For each nodepart in the vm-template, the startup processing will download the nodepart.tgz file, extract the content, and invoke the setup/setup.py script if present. After all nodeparts are downloaded and set up, any installation scripts found in the common/install directory are invoked in ascending order, by name.  These scripts are only invoked once on provisioning the machine.  Finally, any scripts in the common/start directory are invoked in ascending order.  Start scripts are invoked on each start or restart of the virtual machine.
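The ordering described above can be sketched as a small planning function. It only computes the order in which scripts would be invoked and does not download or run anything. The function name, the argument shapes, and the path layout are hypothetical, and I am assuming that setup scripts, like install scripts, are not re-run on a restart, which the description above does not state explicitly.

```python
def startup_plan(nodeparts, install_scripts, start_scripts, first_boot):
    """Return script paths in the order the startup process would invoke them.

    nodeparts: list of (name, has_setup_script) in vm-template order.
    install_scripts, start_scripts: file names found in common/install
    and common/start after all nodeparts are extracted.
    """
    plan = []
    if first_boot:
        # Each nodepart's setup/setup.py runs right after extraction, if present.
        for name, has_setup in nodeparts:
            if has_setup:
                plan.append(f"{name}/setup/setup.py")
        # Install scripts run once, in ascending order by name.
        plan += [f"common/install/{s}" for s in sorted(install_scripts)]
    # Start scripts run on every start or restart, also in ascending order.
    plan += [f"common/start/{s}" for s in sorted(start_scripts)]
    return plan
```

A common convention that follows from the name-based ordering is to prefix script names with a number so that the intended sequence is obvious from the file listing.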

After all other nodeparts have been processed, the workload agent nodepart is given control to process all parts. Parts contain the binaries and lifecycle scripts associated with roles and dependencies for software components. Parts can have associations and dependencies with other parts that require careful configuration and coordination as parts are started and stopped. To manage this complexity, parts use roles to coordinate the startup of all virtual machines and software in the deployment. Roles have states and lifecycle scripts that facilitate changes between states.

In a manner similar to the startup processing for nodeparts, the workload agent processes all parts in its vm-template in the topology document by downloading the part.tgz file and extracting its content.  For each part, the mandatory install.py script is executed to install the part.  All parts are installed serially.

Lifecycle management of roles

Following installation of a part, any role lifecycle scripts included in the part are executed. All role scripts are optional, run on an independent thread, and appear under {role}/ in the part. A role is a managed entity within a virtual application instance. As a role changes, its state is reflected in the virtual application instance view. The role begins in the INITIAL state. The first optional role lifecycle script to run is {role}/install.py, which results in transitioning the role state to INSTALLED. If provided, the {role}/configure.py script then configures the role, moving the state to CONFIGURING, and finally the {role}/start.py script moves it to STARTING and, if successful, to RUNNING. A role can move to TERMINATED (if stopped) or FAILED (if an error is encountered) at any point.

In short, if all goes well, the role state progression is:

INITIAL → INSTALLED → CONFIGURING → STARTING → RUNNING
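For readers who think in code, the progression above is a small state machine. The sketch below is my own illustration of that idea and not the workload agent's implementation. The Role class and its method names are hypothetical, and only the state names come from the description above.

```python
# The successful progression, in order.
HAPPY_PATH = ["INITIAL", "INSTALLED", "CONFIGURING", "STARTING", "RUNNING"]


class Role:
    """Tracks the lifecycle state of one role (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.state = "INITIAL"

    def advance(self):
        """Move to the next state on the successful path."""
        if self.state not in HAPPY_PATH[:-1]:
            raise RuntimeError(f"cannot advance from {self.state}")
        self.state = HAPPY_PATH[HAPPY_PATH.index(self.state) + 1]

    def fail(self):
        """An error can occur at any point."""
        self.state = "FAILED"

    def stop(self):
        """A role can be stopped at any point."""
        self.state = "TERMINATED"
```

Four calls to advance() take a new role from INITIAL to RUNNING, while fail() or stop() can interrupt the progression from any state.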

The workload agent will react to changes in role dependencies by invoking the {role}/{dependency}/changed.py script, and changes in peers by invoking the {role}/changed.py script, if either exists.  This approach ensures that any changes are communicated so that appropriate action can be taken by the corresponding role.  There are many optional lifecycle scripts. Scripts can also be associated with configuration changes or actions the plug-in chooses to expose in the virtual application console.  All scripts can leverage a series of maestro utilities (you can see how the name emphasizes the orchestration theme) to perform various tasks such as downloading and extracting files or decoding encoded strings.
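The choice between the two changed.py scripts can be sketched as follows. This is a simplified illustration under my own assumptions: the function names and the event representation are hypothetical, and only the two script path patterns come from the description above.

```python
def changed_script(role, dependency=None):
    """Path of the script for a dependency change, or for a peer change
    when no dependency is given."""
    if dependency is not None:
        return f"{role}/{dependency}/changed.py"
    return f"{role}/changed.py"


def scripts_to_invoke(events, existing_scripts):
    """Map change events to scripts, skipping scripts the part does not provide.

    events: list of (role, dependency_or_None) tuples.
    existing_scripts: set of script paths present in the part.
    """
    paths = [changed_script(role, dep) for role, dep in events]
    return [p for p in paths if p in existing_scripts]
```

Because every script is optional, a part only has to ship the changed.py scripts for the changes it actually cares about.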

Summary

So as you can see — a lot goes on behind the curtain!   Although it might appear as if it is all by magic — really, the magic is from a team of experts providing an integrated solution, orchestrated by IBM PureApplication System, and simplified so that those  users without the same detailed expertise can easily use the solution.  Best of all — you can rest assured that the integration and optimization has been codified by the real experts who know it the best — either from IBM or one of our IBM Business Partners. And if you have a need to deliver your own expert integrated solution for a completely custom, home-grown solution, you can have your experts create or extend plug-ins and pattern types using this open, extensible system to provide your own pattern to share across your organization.   Let the magic begin!

Joe Bohn

About Joe Bohn

Joe has 26 years of experience in software development. He has served as the technical evangelist for IBM Workload Deployer since May 2011. He has addressed various audiences on IBM Workload Deployer, from direct customer interactions to webinars and conferences. He has been blogging on IWD using DeveloperWorks for the past few months. Follow Joe on Twitter @jabber63

2 Responses to Looking behind the curtain: How virtual applications really work

  1. deep says:

    thanks for the useful article.i just wanted to ask that does workload agent works as a scheduler for installing "parts" and what are the maestro utilities which help in downloading

  2. Addison Goering says:

    Thank you for a nicely written and detailed discussion on virtual applications. This entry goes WAY beyond the marketing hype and pulls back the covers. There are a lot of folks out here who want to see this level of technical detail. Keep up the great work!
