Scaling, from an IT resource perspective, is the ability of an IT resource to handle increased or decreased usage demands.
The following are types of scaling:
Horizontal Scaling – scaling out and scaling in
Vertical Scaling – scaling up and scaling down
The next two sections briefly describe each.
Horizontal Scaling
Allocating or releasing IT resources of the same type is referred to as horizontal scaling (Figure 1.1). Horizontally allocating resources is referred to as scaling out, and horizontally releasing resources is referred to as scaling in. Horizontal scaling is a common form of scaling within cloud environments. For example, virtual machines (VMs) are created as the demand for server resources rises. If that demand drops, the VMs are decommissioned or turned off.
Figure 1.1
An IT resource (Virtual Server A) is scaled out by adding more of the same IT resources (Virtual Servers B and C).
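The scale-out and scale-in decision described above can be sketched in a few lines. This is a minimal illustration, not a real autoscaler: the function names, the per-server capacity figure, and the minimum and maximum server counts are all assumptions made for the example.

```python
import math


def desired_server_count(total_load: float, capacity_per_server: float,
                         min_servers: int = 1, max_servers: int = 10) -> int:
    """Return how many identical servers are needed to carry total_load.

    The limits min_servers and max_servers are hypothetical policy values.
    """
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    needed = math.ceil(total_load / capacity_per_server)
    # Clamp the result to the allowed range of server counts.
    return max(min_servers, min(max_servers, needed))


def scaling_action(current: int, desired: int) -> str:
    """Describe the horizontal scaling step from current to desired servers."""
    if desired > current:
        return f"scale out: add {desired - current} server(s)"
    if desired < current:
        return f"scale in: release {current - desired} server(s)"
    return "no change"


# One server running, load rises to 250 units, each server handles 100 units.
target = desired_server_count(total_load=250, capacity_per_server=100)
print(scaling_action(current=1, desired=target))
```

With these assumed numbers the target is three servers, which mirrors Figure 1.1: Virtual Server A is joined by Virtual Servers B and C.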
Vertical Scaling
When an existing IT resource is replaced by another with higher or lower capacity, vertical scaling is considered to have occurred. Specifically, replacing an IT resource with another that has a higher capacity is referred to as scaling up, and replacing an IT resource with another that has a lower capacity is referred to as scaling down. This includes increasing or decreasing the allocation of CPU, RAM, storage, and network resources.
Vertical scaling is less common in cloud environments due to the downtime required while the replacement is taking place.
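Vertical scaling can be sketched as choosing a replacement from a catalog of server sizes. The catalog below, its size names, and the helper functions are hypothetical and exist only to illustrate the idea, including the point that demand can exceed the largest size available.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSize:
    name: str
    cpus: int
    ram_gb: int


# Hypothetical catalog, ordered from lowest to highest capacity.
SIZES = [
    ServerSize("small", 2, 4),
    ServerSize("medium", 4, 16),
    ServerSize("large", 16, 64),
]


def pick_size(required_cpus: int, required_ram_gb: int) -> ServerSize:
    """Return the smallest size that satisfies the stated requirements."""
    for size in SIZES:
        if size.cpus >= required_cpus and size.ram_gb >= required_ram_gb:
            return size
    # Vertical scaling stops at the largest available hardware.
    raise ValueError("demand exceeds the largest available size")


def describe_change(current: ServerSize, target: ServerSize) -> str:
    """Name the vertical scaling step from current to target."""
    old, new = SIZES.index(current), SIZES.index(target)
    if new > old:
        return f"scale up: {current.name} -> {target.name}"
    if new < old:
        return f"scale down: {current.name} -> {target.name}"
    return "no change"


current = SIZES[0]
target = pick_size(required_cpus=3, required_ram_gb=8)
print(describe_change(current, target))
```

In this sketch the existing resource is replaced rather than supplemented, so the server count stays at one while its capacity changes.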
The table below provides a brief comparison of horizontal and vertical scaling.
| Horizontal Scaling | Vertical Scaling |
| --- | --- |
| less expensive (through commodity hardware components) | more expensive (specialized servers) |
| IT resources instantly available | IT resources normally instantly available |
| resource replication and automated scaling | additional setup is normally needed |
| additional IT resources needed | no additional IT resources needed |
| not limited by hardware capacity | limited by maximum hardware capacity |
Table 1.2 A comparison of horizontal and vertical scaling.
Cloud Services
Cloud services can be provided and made redundant through scaling mechanisms (e.g., elastic load balancers or auto-scalers).
A cloud service is defined as any IT resource that is made remotely accessible via a cloud. Unlike other IT domains that fall under the service technology umbrella, such as service-oriented architecture, the term “service” within the context of cloud computing is quite broad. A cloud service can exist as a simple Web-based software program with a technical interface invoked via the use of a messaging protocol, or as a remote access point for administrative tools or larger environments and other IT resources.
In the figure below, the inner circle represents a cloud service that is a simple Web-based software program. When the cloud service is instead a remote access point, a different IT resource symbol may be used, depending on the nature of the access that the cloud service provides.
Figure 1.3
A cloud service with a published technical interface is being accessed by a consumer outside of the cloud (left). A cloud service that exists as a virtual server is also being accessed from outside of the cloud’s boundary (right). The cloud service on the left is likely being invoked by a consumer program that was designed to access the cloud service’s published technical interface. The cloud service on the right may be accessed by a human user who has remotely logged on to the virtual server.
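The left-hand case, a consumer program invoking a published technical interface, can be illustrated with a small sketch. It assumes HTTP as the messaging protocol and uses only the Python standard library. The `/status` endpoint, the JSON payload, and the names `StatusHandler`, `call_service`, and `demo` are hypothetical, and the "cloud" here is just a local server on the loopback address.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class StatusHandler(BaseHTTPRequestHandler):
    """Hypothetical cloud service: answers GET /status with a JSON message."""

    def do_GET(self):
        if self.path == "/status":
            body = json.dumps({"service": "demo", "state": "running"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the example quiet


def call_service(host: str, port: int) -> dict:
    """Consumer program: invoke the published interface over HTTP."""
    conn = HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/status")
        response = conn.getresponse()
        return json.loads(response.read().decode("utf-8"))
    finally:
        conn.close()


def demo() -> dict:
    """Start the service on a free local port, call it once, then stop it."""
    server = HTTPServer(("127.0.0.1", 0), StatusHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        return call_service(host, port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The consumer only needs to know the interface (the path and the response format), not how or where the service is hosted, which is the sense in which the interface is "published".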
The driving motivation behind cloud computing is to provide IT resources as cloud services that encapsulate other IT resources, while offering functions for clients to use remotely. The acronym XaaS refers to offering ‘anything’ as a service. Networks, software programs, server instances, agents, or any other IT-related resource can be defined, to some extent, as a ‘service’ within a cloud, with some services offered as privately accessible resources and others made publicly available.