Let’s assume that the migration of both the data and the application has gone well, everything works, and the users or clients are happy. It is now advisable to invest time and resources in determining how to leverage additional benefits of the cloud. Questions that you can ask at this stage are:
- How can I automate processes so it is easier to maintain and manage my applications in the cloud?
- Now that I have migrated existing applications, what else can I do to leverage the elasticity and scalability benefits that the cloud promises? What do I need to do differently to implement elasticity in my applications?
- How can I take advantage of some of the other advanced AWS features and services?
- What do I need to do specifically in my cloud application so that it can restore itself to its original state in the event of a failure (hardware or software)?
There are over 70 AWS API services, and there is no point in going through them all. High availability (HA) and security are both prime concerns and prime benefits when moving to AWS, so this section discusses some techniques we can leverage in those areas. The HA and reliability services answer the key questions given above.
Automate Elasticity
Elasticity is a fundamental property of the cloud. Implementing elasticity might require refactoring and decomposing your application into components so that it is more scalable. The more you can automate elasticity in your application, the easier it will be to scale your application horizontally, and the greater the benefit of running it in the cloud.
After we have moved the application to AWS and ensured that it works, there are three ways to automate elasticity at the stack level. Each lets us quickly start any number of application instances when they are needed and terminate them when they are not, while maintaining the application upgrade process. Choose the approach that best fits your software development lifecycle.
- Maintain AMI Inventory
This approach is the easiest and fastest to set up: maintain an inventory of AMIs covering all the different configurations. Be aware that AMI versioning (much the same as software code versioning) is mandatory. AMIs are stored in Amazon S3.
- Maintain a Golden AMI and fetch binaries on boot
A base AMI (“Golden Image”) is used across all application types throughout the organization while the rest of the stack is fetched and configured during boot time. This golden image is pre-certified for a particular environment (OS, platform and security credentials).
- Maintain a ‘Just-Enough-OS’ (JeOS) AMI and a library of recipes or install scripts
This approach is probably the easiest to maintain, especially when you have a huge variety of application stacks to deploy. In this approach, you leverage the programmable infrastructure and maintain a library of install scripts that are executed on demand.
Figure: Three ways to automate elasticity while maintaining the upgrade process
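As a rough illustration of the second approach (a golden AMI that fetches binaries on boot), the sketch below builds the boot-time script that an instance could run to pull one versioned build and start it. The bucket name, the artifact key layout, the install path and the `start.sh` entry point are all hypothetical, and the sketch assumes the AWS CLI is already baked into the golden image.

```python
import re

# Only allow characters that are safe to paste into a shell command.
_SAFE = re.compile(r"[A-Za-z0-9._-]+")


def build_user_data(app_name, version, bucket="example-artifacts"):
    """Return a boot script that downloads and starts one versioned build.

    The bucket name and key layout are assumptions made for this sketch.
    """
    for value in (app_name, version, bucket):
        if not _SAFE.fullmatch(value):
            raise ValueError(f"unsafe value for a shell script: {value!r}")

    # Hypothetical key layout: <app>/<version>/<app>.tar.gz
    artifact = f"s3://{bucket}/{app_name}/{version}/{app_name}.tar.gz"
    lines = [
        "#!/bin/bash",
        "set -euo pipefail",
        f"aws s3 cp {artifact} /tmp/{app_name}.tar.gz",
        f"mkdir -p /opt/{app_name}",
        f"tar -xzf /tmp/{app_name}.tar.gz -C /opt/{app_name}",
        f"/opt/{app_name}/bin/start.sh",
    ]
    return "\n".join(lines) + "\n"


script = build_user_data("webshop", "1.4.2")
print(script)
```

Because the version is a parameter rather than part of the image, upgrading the application means launching new instances with a different version string, while the golden AMI itself stays unchanged.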
Auto Scaling Service
There are two forms of scaling. Vertical scaling changes the size of an individual instance (its compute capacity and throughput). Horizontal scaling adds resources to, or removes them from, the existing pool of compute, network and storage capacity in response to an increase or a decrease in demand. Auto Scaling enables you to set conditions for scaling your Amazon EC2 usage up or down. When one of the conditions is met, Auto Scaling automatically applies the action you’ve defined. Bear in mind that the capacity Auto Scaling adds is chargeable, so scaling policies should be tied to the actual resource utilization of a specific group of instances or servers.
To use Auto Scaling, examine each cluster of similar instances in your Amazon EC2 fleet, see whether you can create an Auto Scaling group for it, and identify the criteria for scaling automatically (CPU utilization, network I/O, etc.).
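To make the idea of scaling criteria concrete, here is a minimal sketch of a threshold rule on average CPU utilization. It only models the decision in plain Python and does not call any AWS API; the thresholds (70% and 30%) and the group bounds (2 to 10 instances) are illustrative assumptions, not recommended values.

```python
from statistics import mean


def scaling_adjustment(cpu_samples, scale_out_above=70.0, scale_in_below=30.0):
    """Return +1 (add an instance), -1 (remove one) or 0 (no change)."""
    if not cpu_samples:
        return 0  # no data: leave the group alone
    average = mean(cpu_samples)
    if average > scale_out_above:
        return 1
    if average < scale_in_below:
        return -1
    return 0


def next_capacity(current, adjustment, minimum=2, maximum=10):
    """Apply the adjustment but keep the group within its size bounds."""
    return max(minimum, min(maximum, current + adjustment))


# Recent CPU utilization (percent) across a hypothetical group of four instances.
samples = [82.0, 91.5, 77.0, 88.5]
capacity = next_capacity(4, scaling_adjustment(samples))
print(capacity)
```

Keeping a gap between the scale-out and scale-in thresholds helps avoid a group that repeatedly adds and removes the same instance when load hovers near a single cut-off.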
At a minimum, you can create an Auto Scaling group and set a condition that your Auto Scaling group will always contain a fixed number of instances. Auto Scaling evaluates the health of each Amazon EC2 instance in your Auto Scaling group and automatically replaces unhealthy Amazon EC2 instances to keep the size of your Auto Scaling group constant.
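The fixed-size behaviour described above can be pictured as a small reconciliation loop. The sketch below is a simulation only: the service performs this work for you, and the instance records, IDs and `launch_instance` helper are hypothetical stand-ins.

```python
from itertools import count


def reconcile(instances, desired, launch):
    """Drop unhealthy instances and launch replacements up to `desired`."""
    healthy = [inst for inst in instances if inst["healthy"]]
    while len(healthy) < desired:
        healthy.append(launch())
    return healthy


_ids = count(1)


def launch_instance():
    # Stand-in for starting a real instance from the group's AMI.
    return {"id": f"i-new{next(_ids)}", "healthy": True}


fleet = [
    {"id": "i-a", "healthy": True},
    {"id": "i-b", "healthy": False},  # failed its health check
    {"id": "i-c", "healthy": True},
]
fleet = reconcile(fleet, desired=3, launch=launch_instance)
print([inst["id"] for inst in fleet])  # ['i-a', 'i-c', 'i-new1']
```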
Harden Security
The cloud does not relieve you of your responsibility for securing your applications. At every stage of your migration process, you should implement the appropriate security best practices. Some are listed here:
- Safeguard your AWS credentials
  - Rotate your AWS access credentials regularly, and rotate them immediately if you suspect a breach
  - Leverage multi-factor authentication
- Restrict users to AWS resources
  - Create different users and groups with different access privileges (policies) using AWS Identity and Access Management (IAM) features to restrict and allow access to specific AWS resources
  - Continuously revisit and monitor IAM user policies
  - Leverage the power of security groups in Amazon EC2
- Protect your data by encrypting it at rest (AES) and in transit (SSL)
- Automate security policies
- Adopt a recovery strategy
  - Create periodic Amazon EBS snapshots and Amazon RDS backups
  - Test your backups occasionally, before you need them
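The advice above on restricting users to specific AWS resources can be sketched as an IAM policy document generated in code, which also makes policies easier to review and automate. The example below builds a read-only policy for a single Amazon S3 bucket; the bucket name is hypothetical, and the policy is an illustration of least privilege rather than a recommended baseline.

```python
import json


def read_only_bucket_policy(bucket):
    """Return an IAM policy document granting read-only access to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Reading applies to the objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


policy = read_only_bucket_policy("example-sales-demo-assets")
print(json.dumps(policy, indent=2))
```

A document like this would then be attached to an IAM group rather than to individual users, so that access follows the role and not the person.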
Automate the In-cloud Software Development Lifecycle and Upgrade Process
Moving to AWS will, by default, change the skill sets and culture of the employees and of the firm at large. For example, with a scriptable infrastructure, you can completely automate your software development and deployment lifecycle. Gone are the days of the patch chain; of weekend version deployments that go horribly wrong because development and testing environments do not match the production environment; and of late or missed deadlines. With AWS we can now manage a compressed lifecycle of development, build, testing, staging and production by creating reusable configuration tools, managing specific security groups and launching specific AMIs for each environment. This is termed DevOps (which has merits and demerits outside the scope of this whitepaper).
Automating your upgrade process in the cloud is highly recommended at this stage so that you can quickly advance to newer versions of the applications and also roll back to older versions when necessary.
Post-it Note: With the cloud, you don’t have to install new versions of software on old machines; instead, you throw away old instances and launch fresh, pre-configured instances. If an upgrade fails, you simply throw it away and switch to new hardware at no additional cost.
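The "throw away and re-launch" upgrade described in the note can be sketched as follows. This is a simulation under stated assumptions: `fake_launch` and `fake_health` are hypothetical stand-ins for launching instances from an AMI and for a real health check, and the AMI names are invented.

```python
def upgrade(current_ami, new_ami, launch, is_healthy, fleet_size=2):
    """Launch a fresh fleet from `new_ami`; keep it only if every instance is healthy.

    Returns the AMI that should serve traffic and the new instances (if any).
    """
    candidates = [launch(new_ami) for _ in range(fleet_size)]
    if all(is_healthy(instance) for instance in candidates):
        return new_ami, candidates  # the caller can now terminate the old fleet
    return current_ami, []  # discard the candidates; the old fleet keeps serving


def fake_launch(ami):
    return {"ami": ami}


def fake_health(instance):
    # Pretend that version 3 is a broken build.
    return instance["ami"] != "app-v3"


active, fleet = upgrade("app-v1", "app-v2", fake_launch, fake_health)
print(active, len(fleet))

active_after_bad, discarded = upgrade("app-v2", "app-v3", fake_launch, fake_health)
print(active_after_bad, len(discarded))
```

Rolling back is the same operation with the arguments reversed, which is why keeping older AMI versions available matters.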
Create a Dashboard of your Elastic Datacenter to Manage AWS Resources
The management team must have visibility into the ways in which AWS resources are being consumed. The AWS Management Console provides a view of your cloud datacenter. It also provides you with basic management and monitoring capabilities (by way of Amazon CloudWatch) for your cloud resources.
Using the web service APIs, it’s fairly straightforward to create a web client and build custom control panels to suit your needs. For example, if you have created a pre-sales demo application environment in the cloud for your sales staff so that they can quickly launch a preconfigured application in the cloud, you may want to create a dashboard that displays and monitors the activity of each sales person and each customer. You can also manage and limit access permissions based on the role of the sales person and revoke access if the employee leaves the company.
There are several publicly available libraries that can help you get started with creating a dashboard that suits your specific requirements.
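As a small illustration of the dashboard idea, the sketch below aggregates instance records per sales person. The records are hard-coded, hypothetical data; in practice they would come from the web service APIs, and the `owner` field is an assumption (for example, a value you attach to each instance yourself).

```python
from collections import defaultdict


def summarize_by_owner(instances):
    """Count instances per state and total the running hours for each owner."""
    summary = defaultdict(lambda: {"hours": 0.0})
    for inst in instances:
        row = summary[inst["owner"]]
        row[inst["state"]] = row.get(inst["state"], 0) + 1
        row["hours"] += inst["hours"]
    return dict(summary)


# Hypothetical records for a pre-sales demo environment.
records = [
    {"owner": "alice", "state": "running", "hours": 3.5},
    {"owner": "alice", "state": "running", "hours": 1.0},
    {"owner": "bob", "state": "stopped", "hours": 12.0},
]
dashboard = summarize_by_owner(records)
for owner, row in sorted(dashboard.items()):
    print(owner, row)
```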
Create a Business Continuity Plan and Achieve High Availability (Leverage Multiple Availability Zones)
Many companies fall short in disaster recovery planning because the process is not fully automatic and because it is cost prohibitive to maintain a separate datacenter for disaster recovery. The use of virtualization (the ability to bundle AMIs) and data snapshots makes disaster recovery in the cloud much less expensive and simpler than traditional disaster recovery solutions. You can completely automate the process of launching cloud resources, which can bring up an entire cloud environment within minutes. When it comes to failing over to the cloud, recovering from a system failure caused by employee error is the same as recovering from an earthquake. Hence it is highly recommended that you have a business continuity plan in place and set your Recovery Time Objective (RTO) and Recovery Point Objective (RPO).
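The relationship between backup frequency and these two objectives is simple arithmetic, sketched below. In the worst case a failure strikes just before the next snapshot, so the data lost equals one full snapshot interval; the recovery time is the sum of the recovery steps. The step names and all durations are hypothetical.

```python
from datetime import timedelta


def meets_rpo(snapshot_interval, rpo):
    """Worst-case data loss is one full snapshot interval."""
    return snapshot_interval <= rpo


def estimated_recovery_time(steps):
    """Sum the durations of the sequential recovery steps."""
    return sum(steps.values(), timedelta())


recovery_steps = {
    "launch instances from AMI": timedelta(minutes=5),
    "restore volumes from snapshot": timedelta(minutes=20),
    "switch traffic": timedelta(minutes=5),
}
rto = timedelta(hours=1)
rpo = timedelta(hours=4)

print(meets_rpo(timedelta(hours=6), rpo))             # False
print(meets_rpo(timedelta(hours=1), rpo))             # True
print(estimated_recovery_time(recovery_steps) <= rto)  # True
```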
Your business continuity plan should include:
- data replication strategy (source, destination, frequency) of databases (Amazon EBS)
- data backup and retention strategy (Amazon S3 and Amazon RDS)
- creating AMIs with the latest patches and code updates (Amazon EC2)
- recovery plan to fail back to the corporate data center from the cloud post-disaster
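A retention strategy from the plan above can be expressed as a small rule that selects which backups have aged out. This is a minimal sketch on plain dates; the 30-day retention window and the snapshot dates are assumptions, and actually deleting anything is left to the caller.

```python
from datetime import date, timedelta


def snapshots_to_delete(snapshot_dates, today, retention_days=30):
    """Return the snapshot dates that are older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    return sorted(d for d in snapshot_dates if d < cutoff)


taken = [date(2024, 1, 1), date(2024, 2, 1), date(2024, 2, 25), date(2024, 3, 1)]
expired = snapshots_to_delete(taken, today=date(2024, 3, 5))
print(expired)
```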
A benefit of implementing a business continuity strategy in the cloud is that it automatically gives you higher availability across different geographic regions and Availability Zones without any major modifications to your deployment and data replication strategies. You can create a much higher availability environment by cloning the entire architecture and replicating it in a different Availability Zone, or by simply using Multi-AZ deployments (in the case of Amazon RDS).
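One simple way to picture spreading an architecture over several Availability Zones is an even split of the instance count, so that losing one zone removes only a fraction of capacity. The sketch below does only the arithmetic; the zone names are examples and no AWS API is called.

```python
def spread_across_zones(instance_count, zones):
    """Split `instance_count` as evenly as possible over `zones`."""
    if not zones:
        raise ValueError("at least one Availability Zone is required")
    base, extra = divmod(instance_count, len(zones))
    # The first `extra` zones each take one additional instance.
    return {
        zone: base + (1 if index < extra else 0)
        for index, zone in enumerate(zones)
    }


placement = spread_across_zones(5, ["us-east-1a", "us-east-1b", "us-east-1c"])
print(placement)  # {'us-east-1a': 2, 'us-east-1b': 2, 'us-east-1c': 1}
```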