Cloud Storage & Compute: Key Points About the Virtualization of Storage and Compute

There are two important points about storage and cloud computing:

1) The relationship between storage and cloud compute is clear: you cannot have cloud computing without storage. Disk-based storage (hard drives) is currently the primary medium, but advances in solid-state drives (SSDs) and memory-based storage will ultimately bring about the end of disk platters within storage systems. Technologies such as memory resistors (memristors, for short) could potentially replace all existing storage devices. Storage technology matters in terms of both capacity and performance.

2) Using virtualized storage, including logical unit numbers (LUNs) on a storage area network (SAN), will greatly increase your VM flexibility and portability. You will not be tied to a physical server and its direct-attached storage, but will instead draw on a pool of virtual, detached storage units. You will be able to move VMs anywhere within your server farm using a virtualized storage array, and you will be able to rapidly increase your available storage.

Storage virtualization is typically implemented with a SAN or other hardware and software devices that present massive pools of storage through a unified storage management interface. When you configure and start a VM, the required amount of storage is allocated from the pool of available LUNs on the SAN. In most cases, storage is mapped to VMs over a SAN, network fabric, or switch.
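To make the allocation step concrete, here is a minimal sketch of a shared pool handing out LUNs to VMs. The StoragePool and Lun classes, their fields, and the VM name are hypothetical names invented for illustration; a real SAN management interface will look different.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Lun:
    """A logical unit carved out of the shared pool (hypothetical model)."""
    lun_id: int
    size_gb: int
    mapped_vm: Optional[str] = None


@dataclass
class StoragePool:
    """Toy stand-in for a unified storage management interface."""
    capacity_gb: int
    luns: List[Lun] = field(default_factory=list)

    def free_gb(self) -> int:
        # Free space is total capacity minus everything already carved out.
        return self.capacity_gb - sum(lun.size_gb for lun in self.luns)

    def allocate(self, vm_name: str, size_gb: int) -> Lun:
        """Carve a new LUN from free capacity and map it to the VM."""
        if size_gb > self.free_gb():
            raise ValueError("not enough free capacity in the pool")
        lun = Lun(lun_id=len(self.luns), size_gb=size_gb, mapped_vm=vm_name)
        self.luns.append(lun)
        return lun


pool = StoragePool(capacity_gb=1000)
lun = pool.allocate("web-01", 200)
print(lun.lun_id, lun.mapped_vm, pool.free_gb())  # 0 web-01 800
```

The point of the sketch is that the VM only ever sees a LUN identifier, not the physical disks behind it.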

A common approach is to use local storage installed in each physical server when configuring hypervisors and VMs. Some modern clouds and storage systems still recommend numerous scaled-out storage nodes with direct-attached storage rather than a SAN. This is not, however, best practice, especially if you want to be cloud-enabled, because you will lose some significant capabilities.

You need to virtualize your storage.  For example, if you have a VM mapped to a storage LUN (not an internal hard drive), you can then relocate the VM to another physical server anywhere in your server farm, or even another data-center, and still have it map back to the correct storage LUN. If the storage on the local physical server were used, the only way to have the VM “move” to another physical server would be to replicate all of the data to the other physical server.

In a true cloud computing environment, you don’t know which physical server the VM will be moved to, and therefore you don’t know where to pre-copy the data. This virtualization of storage is a key technology used in cloud computing and most modern data-centers, because it combines the flexibility of VMs with the flexibility of virtual storage mapping.
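The difference between the two cases can be sketched in a few lines. This is a toy model under stated assumptions: the VM class, the "lun:7" and "local" storage labels, and the migrate function are all hypothetical, and real hypervisors handle migration in far more detail.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    host: str
    storage: str   # assumed convention: "lun:<id>" for SAN-backed, "local" for direct-attached
    data_gb: int


def migrate(vm: VM, target_host: str) -> int:
    """Move a VM to target_host and return how many GB had to be copied."""
    copied_gb = 0
    if vm.storage == "local":
        # Direct-attached disks stay behind with the old server,
        # so all of the VM's data must be replicated to the target.
        copied_gb = vm.data_gb
    # For SAN-backed VMs the LUN mapping follows the VM; no data moves.
    vm.host = target_host
    return copied_gb


san_vm = VM("app-01", host="server-a", storage="lun:7", data_gb=500)
local_vm = VM("app-02", host="server-a", storage="local", data_gb=500)

print(migrate(san_vm, "server-b"))    # 0
print(migrate(local_vm, "server-b"))  # 500
```

With the LUN-backed VM, the target server does not need to be known in advance because nothing has to be pre-copied.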

Storage needs, whether structured or unstructured, increase massively each year. There are two possible ways to handle this data: store less data for less time, or keep adding storage. Many organizations are finally starting to realize that they cannot keep up with the amount of new data being created, and are thus setting limits on what should be stored and, more importantly, evaluating whether they really need to keep aging data forever. For those organizations that cannot delete older data, perhaps for compliance or legal reasons, storage technologies such as compression and de-duplication come into play.
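As a rough illustration of how de-duplication and compression reduce stored volume, the sketch below splits data into fixed-size blocks, keeps each unique block only once (identified by its SHA-256 hash), and compresses it with zlib. The 4 KB block size and the function names are assumptions made for the example; production systems often use variable-size chunking and other refinements.

```python
import hashlib
import zlib
from typing import Dict, List

BLOCK_SIZE = 4096  # assumed fixed block size for this sketch


def dedup_store(data: bytes, store: Dict[str, bytes]) -> List[str]:
    """Store each unique block once, compressed; return the list of block hashes."""
    recipe = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            # Only previously unseen blocks consume new space.
            store[digest] = zlib.compress(block)
        recipe.append(digest)
    return recipe


def restore(recipe: List[str], store: Dict[str, bytes]) -> bytes:
    """Rebuild the original data from its list of block hashes."""
    return b"".join(zlib.decompress(store[digest]) for digest in recipe)


store: Dict[str, bytes] = {}
payload = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
recipe = dedup_store(payload, store)

print(len(recipe), len(store))             # 4 2
print(restore(recipe, store) == payload)   # True
```

Four logical blocks are referenced, but only two unique blocks are physically kept, and each of those is further shrunk by compression.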

As for the importance of storage as it relates to cloud computing, the back-end storage system for a cloud service requires unique characteristics that might not normally be required of traditional server farms:

  • Ability to provide multiple types of cloud storage (e.g., object, block, application-specific), regardless of actual physical storage hardware
  • Ability to quickly replicate data or synchronize across data-centers.
  • Ability to take a snapshot of data while the system is still operational. Snapshots are used for backup or restoration to a point in time.
  • Ability to de-duplicate data across entire enterprise/storage
  • Ability to thin-provision storage
  • Ability to back up data offline from production networks, and back up huge data online without system
  • Ability to expand storage volumes on the fly while the service is still operational
  • Ability to maintain storage performance levels, even as data changes and load increases. The system must also be able to automatically groom data across multiple disk types and technologies in order to maintain maximum performance levels.
  • Ability to recover now-unused storage blocks when VMs are shut down (auto-reclamation).
  • Ability to virtualize all storage systems, new and legacy, so that all storage is represented as a single storage
  • Ability to provide multiple tiers of storage to give customers a variety of performance levels, each at a price per gigabyte or terabyte, depending on need.
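
Three of the capabilities above (thin provisioning, expanding volumes on the fly, and auto-reclamation) can be illustrated with a small sketch. The ThinVolume class and its method names are hypothetical and exist only for this example; the idea shown is that a volume promises a logical size up front but consumes physical blocks only when data is actually written.

```python
from typing import Dict


class ThinVolume:
    """Toy thin-provisioned volume: space is consumed only by written blocks."""

    def __init__(self, logical_blocks: int) -> None:
        self.logical_blocks = logical_blocks   # size promised to the VM
        self.blocks: Dict[int, bytes] = {}     # physically allocated blocks only

    def write(self, index: int, data: bytes) -> None:
        if not 0 <= index < self.logical_blocks:
            raise IndexError("write beyond the logical size of the volume")
        self.blocks[index] = data

    def physical_blocks(self) -> int:
        return len(self.blocks)

    def expand(self, extra_blocks: int) -> None:
        # Growing the logical size costs nothing until the new blocks are written.
        self.logical_blocks += extra_blocks

    def reclaim(self) -> int:
        """Release all physical blocks, e.g. when the VM is shut down for good."""
        freed = len(self.blocks)
        self.blocks.clear()
        return freed


vol = ThinVolume(logical_blocks=1000)
vol.write(0, b"boot")
vol.write(7, b"data")
print(vol.logical_blocks, vol.physical_blocks())  # 1000 2

vol.expand(500)
print(vol.logical_blocks, vol.physical_blocks())  # 1500 2

print(vol.reclaim(), vol.physical_blocks())       # 2 0
```

The gap between the logical and physical figures is what lets a provider promise more storage than is physically installed, provided actual usage is monitored.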

The figure below depicts how server virtualization, combined with key cloud attributes, forms the foundation for cloud computing. Important virtualization and cloud characteristics (e.g., automation, scalability, and self-service) can easily be part of any data-center modernization project, even if cloud computing is not the goal.

Figure – Virtualization + cloud attributes = cloud computing

 
