Virtual Machine Management in the Cloud – aspects of automating the process

VM Management

Managing VMs in the cloud is usually manual, intrusive, and time-consuming. It includes patching and maintaining the existing VM estate, procuring and deploying new VMs, and assessing costs and workloads.

A goal for many firms with large VM estates is proper VM management through automation. This includes VM approval workflows, automated provisioning systems, change control, and ensuring that the security model is implemented. The following are some key considerations in automating VM management.

 

New VMs – an automated approach

VMs are typically provisioned through a console or a CLI. In some more advanced architectures, an ITSM portal integrates pattern and application deployment with automated environment provisioning (e.g., the ITSM calls Jenkins jobs, passing the parameters to be set up in the cloud dev account based on patterns defined within the ITSM).

In this scenario, however, the cloud service portal could be programmed to automatically generate a change control request, with a completed status, upon every successful automated provisioning event. Any exceptions or errors in the automated provisioning process are handled through alerts, and the “completed” change ticket is generated when the VM is online.
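The ticketing behaviour described above can be sketched roughly as follows. This is only an illustration in Python: ProvisioningEvent and ChangeLog are hypothetical names, and in-memory lists stand in for whatever ITSM and alerting APIs are actually in use.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProvisioningEvent:
    """Hypothetical result reported by the automated provisioning job."""
    vm_name: str
    succeeded: bool
    vm_online: bool
    error: Optional[str] = None


@dataclass
class ChangeLog:
    """In-memory stand-in for the ITSM change database and the alert channel."""
    tickets: List[dict] = field(default_factory=list)
    alerts: List[str] = field(default_factory=list)

    def record(self, event: ProvisioningEvent) -> None:
        if not event.succeeded:
            # Failures raise an alert instead of a change ticket.
            self.alerts.append(f"{event.vm_name}: {event.error or 'unknown error'}")
            return
        if event.vm_online:
            # The change is logged as already completed once the VM is online.
            self.tickets.append({"vm": event.vm_name, "status": "completed"})


log = ChangeLog()
log.record(ProvisioningEvent("web-01", succeeded=True, vm_online=True))
log.record(ProvisioningEvent("db-01", succeeded=False, vm_online=False,
                             error="quota exceeded"))
```

In a real deployment the two lists would presumably be replaced by calls to the ITSM and monitoring systems; the point of the sketch is only that successful events produce a closed ticket and failed ones produce an alert.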

The above is not easy to accomplish.

Usually, provisioning each VM is a fairly manual process, carried out through a CLI, a PowerShell script, a Bash script, or a console, and each VM must conform to preapproved and security-certified OS images, applications, and patch levels.

 

Changes to servers and hosts

Requested VM or host changes follow the normal change control procedures already in place. Routine maintenance, updates, security patches, and new software revisions also follow existing change control procedures. A typical exception is the creation of VMs within a Dev-Test account, since these VMs sit behind a firewall that keeps noncertified development applications isolated from production networks. VMs in such an account do not require change control, so developers can do their job without unnecessary delays.
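One way to express this rule in an automated workflow is a small policy check. The sketch below is illustrative only: the Account enum and the action names are hypothetical, and it assumes that VM creation in the Dev-Test account is the only exemption.

```python
from enum import Enum


class Account(Enum):
    DEV_TEST = "dev-test"
    PRODUCTION = "production"


def requires_change_control(account: Account, action: str) -> bool:
    """Return True when a request must go through change control.

    Assumption: only VM creation inside the Dev-Test account is exempt;
    every other request follows the existing procedure.
    """
    if account is Account.DEV_TEST and action == "create_vm":
        return False
    return True
```

A provisioning pipeline could call this check before deciding whether to open a change request or proceed directly.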

 

Updates to the Common Operating Environment (COE)

Cloud services should automatically deploy templates or build-images of standard configurations. These COE templates will contain all updates, patches, and security certifications. These COE images can be automatically deployed within the cloud environment without going through the typical manual accreditation process for each server.

Each updated COE must go through the manual security approval process again; once approved, it can be ordered and deployed using the cloud’s automated systems.

Key Take-Away

Creating and managing too many common operating environments or VM templates can quickly become costly and unmanageable. Transitioning to the cloud should be accompanied by better discipline and standardization for COEs and templates.

 

Adding a server (VM) to a network and domain

As each VM is automatically provisioned, it will automatically be added to the network domain. For this to be automated, the specific steps required, as well as the legacy change and security control policies involved, need to be adjusted; in the past, joining the domain typically required manual security approval.

 

User and administrator permissions to new servers and hosts

As new machines are automatically added to the network, permission to log on to the new OS will be granted to the cloud management system, usually by means of a service account. The specific steps will need to be automated, and the existing security processes adjusted, to accommodate this.

 

Network configuration requests

Every VM-based server has a preconfigured network configuration. On an individual machine (physical or VM), the standard OS and applications that are installed need to initiate outbound traffic within the production network, and possibly to the Internet.

All network configuration, load balancing, or firewall change requests follow existing procedures. When possible, the cloud management self-service control panel will allow customers to configure some of this by themselves, although advanced network changes will need to go through normal change control and possibly security approval.

In some VM templates or COEs, there might be multiple servers deployed as part of a COE. For example, a complex COE might include one or more database servers, middleware application servers, and possibly frontend web servers; this collection of VMs is called a platform. In these situations, the VMs have already been configured (as part of the overall platform package) to communicate with one another via the virtual networking built into the VM hypervisor. In the given example, only the frontend web servers would have a production network address, whereas all other servers are essentially “hidden” within the VM network enclave.
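The platform example above can be modelled as a simple data structure. This is a minimal sketch, assuming hypothetical names (PlatformVM, exposed_vms) and example addresses; it only illustrates that the frontend web server is the one VM carrying a production address.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PlatformVM:
    name: str
    role: str                            # "web", "middleware" or "database"
    internal_ip: str                     # address on the hypervisor's virtual network
    production_ip: Optional[str] = None  # set only for VMs exposed to production


def exposed_vms(platform: List[PlatformVM]) -> List[str]:
    """Names of the VMs that are reachable from the production network."""
    return [vm.name for vm in platform if vm.production_ip is not None]


# Example platform: one frontend, one middleware server, one database server.
platform = [
    PlatformVM("web-01", "web", "10.0.0.10", production_ip="192.0.2.10"),
    PlatformVM("app-01", "middleware", "10.0.0.20"),
    PlatformVM("db-01", "database", "10.0.0.30"),
]
```

Here exposed_vms(platform) lists only web-01; the middleware and database servers stay inside the VM network enclave.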

 

VM configuration changes

Customers may have the ability to upgrade or downgrade their VM CPU, memory, or disk space within the cloud management portal. Changing this configuration requires a reboot of the customer’s VMs, but no loss of data.

If a customer requests a manual change through a support ticket, the cloud provider will make this change using the cloud management software so that billing and new VM configurations are automatically updated. Do not make changes to the backend hypervisor directly, or the cloud management system will have no knowledge of that change.
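To illustrate why changes should go through the management layer, the sketch below routes a resize through a hypothetical CloudManager class so that the configuration record, the billing record, and the required reboot are all handled in one place. The class and its fields are assumptions made for this example, not the API of any particular product.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VMConfig:
    cpus: int
    memory_gb: int
    disk_gb: int


@dataclass
class CloudManager:
    """Hypothetical management layer that owns configuration and billing records."""
    configs: Dict[str, VMConfig] = field(default_factory=dict)
    billing_events: List[str] = field(default_factory=list)
    reboots: List[str] = field(default_factory=list)

    def resize(self, vm: str, new: VMConfig) -> None:
        # Record the new configuration and the billing change centrally.
        self.configs[vm] = new
        self.billing_events.append(
            f"{vm}: {new.cpus} vCPU / {new.memory_gb} GB RAM / {new.disk_gb} GB disk"
        )
        # Queue the reboot that the configuration change requires.
        self.reboots.append(vm)


manager = CloudManager(configs={"web-01": VMConfig(2, 4, 50)})
manager.resize("web-01", VMConfig(4, 8, 50))
```

A change made directly on the hypervisor would bypass resize entirely, leaving configs and billing_events out of date.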

 

Key Take-Away

Manually changing the VM configurations is not the appropriate process; billing and configuration management will not be aware of the new settings, and the downstream asset and change control databases will not be updated. Never make a manual change to a configuration that the cloud management system cannot track.

 

Release management

All VM templates, COEs, and software should be fully tested in an offline lab or staging network. They will then be quality checked and security approved before any changes to the production cloud service are scheduled. Due to the level of automation and the precertification of VM compliance, security, software, updates, and so on, release management will be an ongoing effort with increased impact. If new releases go into production with errors or inadequate planning and testing, automation and the cloud will multiply the impact compared to traditional IT.
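The release gate described above amounts to requiring every check to pass before a production change is scheduled. A minimal sketch, with a hypothetical ReleaseCandidate record standing in for whatever the release tooling actually tracks:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseCandidate:
    name: str
    tested_in_staging: bool
    quality_checked: bool
    security_approved: bool


def ready_for_production(rc: ReleaseCandidate) -> bool:
    """A release may be scheduled only when all three gates have passed."""
    return rc.tested_in_staging and rc.quality_checked and rc.security_approved


approved = ReleaseCandidate("coe-linux-v2", True, True, True)
untested = ReleaseCandidate("coe-linux-v3", False, True, True)
```

With these examples, ready_for_production(approved) is True and ready_for_production(untested) is False.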

 
