Cloud Architecture Design – Best Practices

Cloud Design Goals – AWS as an example.  A system that is truly cloud enabled should have the following attributes:

  • Multi Tiered [usually 3 tiers]
  • Multi tenancy
  • Availability
  • Reliability
  • Fault Tolerance
  • Logging and data streaming
  • Scaling
  • Security at each level of the IT stack
  • Loosely coupled

Objective:

The goal of any great design is a Loosely Coupled Architecture. The individual parts that make up the infrastructure have no knowledge of how the other parts work. They communicate through defined, published service interfaces or through a Message Bus such as a workflow or Queuing System. Ideally, you can replace a system that provides a service with another system that provides a similar service.
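As a rough illustration of loose coupling, the sketch below uses Python's in-process queue.Queue as a stand-in for a real Queuing System. The message format (a dict with an "order_id" key) and the function names are assumptions made up for this example. The producer and the consumer share only the queue and the message shape, so either side could be swapped out without the other noticing.

```python
import queue
import threading


def producer(bus: queue.Queue, order_ids):
    """Publish one message per order; knows nothing about who consumes them."""
    for order_id in order_ids:
        bus.put({"order_id": order_id})  # hypothetical message format
    bus.put(None)  # sentinel: no more messages


def consumer(bus: queue.Queue, handled: list):
    """Consume messages until the sentinel arrives; knows nothing about the producer."""
    while True:
        message = bus.get()
        if message is None:
            break
        handled.append(message["order_id"])


def run_pipeline(order_ids):
    """Wire a producer and a consumer together through the queue only."""
    bus = queue.Queue()
    handled = []
    worker = threading.Thread(target=consumer, args=(bus, handled))
    worker.start()
    producer(bus, order_ids)
    worker.join()
    return handled
```

In a real deployment the queue would be an external service, but the shape of the contract stays the same.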

 

3 Tiers:

A Multi-Tier Architecture consists of application tiers that are physically separated, sometimes along the logical layers of an application such as the presentation, logic, and database layers. Multiple physical tiers allow more flexibility in scaling based on resource utilization at a particular tier, and in upgrading a tier without impacting any other tier.

 

Other aspects of a proper Cloud design include the following attributes.

 

Stateless:

A Stateless System is a system that does not store any state. The output of this system depends solely on the inputs into it. Protocols like UDP are stateless, meaning you can push packets that stand on their own and don’t require the results of a previous packet in order to succeed.
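The difference can be sketched in a few lines. The names add_tax and RunningTotal are invented for this illustration. The stateless function returns the same output for the same input every time, while the stateful object's answer depends on what was called before.

```python
def add_tax(amount_cents: int, tax_percent: int) -> int:
    """Stateless: the result depends solely on the arguments."""
    return amount_cents * (100 + tax_percent) // 100


class RunningTotal:
    """Stateful: each call's result depends on every earlier call."""

    def __init__(self):
        self.total = 0

    def add(self, amount_cents: int) -> int:
        self.total += amount_cents
        return self.total
```

Because add_tax keeps nothing between calls, any server in a fleet can answer any request, which is what makes stateless tiers easy to scale out.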

 

Content Delivery/Caching:

Content Delivery Networks, usually referred to as CDNs, replicate your content to servers all around the world with the goal of improved performance and availability based on the end user’s location. AWS offers a CDN Service called CloudFront with “edge locations” currently in multiple locations on five continents.

 

Network Perimeter:

A Network Perimeter is a boundary between two or more portions of a network. It can refer to the boundary between your VPC and your network. It could refer to the boundary between what you manage versus what AWS manages. The important part is to understand where these boundaries are located and who has responsibility over each segment.

 

Synchronous & Asynchronous Processes:

Synchronous Processing refers to processes that wait for a response after making a request. A Synchronous Process will block itself from other activities until either the response is received or a predefined timeout occurs.

 

An Asynchronous Process does the opposite of the Synchronous Process. It will make the request and immediately begin processing other requests. When a response is finally made available the process will handle it. This is how long-running activities are handled. AWS offers services such as SQS and SNS that can help in the overall implementation of Async processing.
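A minimal sketch of the two styles, using only the standard library. The job names and sleep durations are placeholders that stand in for slow remote calls. The synchronous version blocks on each request in turn; the asynchronous version starts all requests and handles each response when it becomes available.

```python
import asyncio
import time


def blocking_request(name: str, seconds: float) -> str:
    """Synchronous: the caller can do nothing else until this returns."""
    time.sleep(seconds)
    return f"{name} done"


async def non_blocking_request(name: str, seconds: float) -> str:
    """Asynchronous: while this waits, the event loop runs other requests."""
    await asyncio.sleep(seconds)
    return f"{name} done"


def run_synchronously(jobs):
    # Total time is roughly the sum of all waits.
    return [blocking_request(name, seconds) for name, seconds in jobs]


def run_asynchronously(jobs):
    async def main():
        # Total time is roughly the longest single wait.
        return await asyncio.gather(
            *(non_blocking_request(name, seconds) for name, seconds in jobs)
        )

    return asyncio.run(main())
```

Both functions return the same results in the same order; only the waiting behaviour differs.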

 

Fault Tolerance:

Fault Tolerance is what enables the system to continue running despite a failure. This can be a failure of one or more components of the system, or of a third-party service. In the realm of AWS this could mean operating a system in multiple Availability Zones. If an AZ outage occurs, the system continues operating in the other AZs.

 

The goal of Fault Tolerance is to be completely transparent to the users with no loss of data or functionality. High Availability means having little to no downtime of your systems. The gold standard is 99.999 percent, or the five-nines, which means less than five and a half minutes of downtime per year. Not every system has to be built to the gold standard. The availability goals depend on the purpose of the system as well as the operating budget. AWS makes High Availability easy and fairly inexpensive to implement, especially when compared to a traditional environment.
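The downtime figure follows from simple arithmetic, sketched below. The function name is made up for this example, and a year is taken as 365.25 days, which is an assumption; using 365 days changes the result only slightly.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes per year a system may be down and still meet the target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)
```

With these assumptions, 99.999 percent allows about 5.26 minutes per year, 99.99 percent about 52.6 minutes, and 99.9 percent about 526 minutes (close to nine hours).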

 

Self Healing:

Self Healing Systems are capable of recovering gracefully from faults without the need of manual intervention. You can think of Self Healing at both the infrastructure and application levels. Simple Queue Service can be used for reliable delivery of messages between application components. AWS offers tools such as Auto Scaling Groups that can be used to ensure a system is always running. Self Healing is vital to meeting High Availability and Fault Tolerance goals.
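The idea of keeping a fleet at its desired size can be sketched as a small reconciliation function. This is only a conceptual illustration, not the Auto Scaling API: heal, is_healthy, and launch are hypothetical names, and the health check and launch behaviour are supplied by the caller.

```python
def heal(instances, desired_count, is_healthy, launch):
    """Drop unhealthy instances and launch replacements until the fleet is full.

    instances     -- current fleet members (any identifiers)
    desired_count -- how many healthy members the fleet should have
    is_healthy    -- callable returning True for a healthy instance
    launch        -- callable returning a newly started instance
    """
    fleet = [instance for instance in instances if is_healthy(instance)]
    while len(fleet) < desired_count:
        fleet.append(launch())
    return fleet
```

Running such a check on a schedule, with no human in the loop, is what turns a monitored system into a self-healing one.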

 

NAT:

Network Address Translation, NAT for short, is a method of placing all systems on a network behind a single IP address. Each system on the network has its own private IP address. Externally, traffic originating from any of those systems appears to come from the same IP address. This is how a network that is assigned one IP address by an Internet Service Provider can have multiple systems connected to Internet resources without each needing to be assigned its own public IP address.
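A toy translation table makes the mechanism concrete. The class below is a simplified sketch of port-based NAT under assumed names (NatGateway, translate_outbound, translate_inbound); real devices also track protocol, remote endpoint, and timeouts.

```python
import itertools


class NatGateway:
    """Maps many private (ip, port) pairs onto ports of one public IP."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)
        self._outbound = {}  # (private_ip, private_port) -> public_port
        self._inbound = {}   # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip: str, private_port: int):
        """Return the (public_ip, public_port) the outside world will see."""
        key = (private_ip, private_port)
        if key not in self._outbound:
            port = next(self._ports)
            self._outbound[key] = port
            self._inbound[port] = key
        return (self.public_ip, self._outbound[key])

    def translate_inbound(self, public_port: int):
        """Return the private (ip, port) for a reply, or None if unknown."""
        return self._inbound.get(public_port)
```

Two private hosts end up sharing the same public address and differ only in the public port assigned to them.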

Routing Tables are a collection of rules that specify how Internet Protocol traffic should be directed toward its destination endpoint. A common route in a Routing Table will direct all traffic headed outside of your network through a router. This is how a system can reach web sites. Another route might direct all traffic in a certain range to another network over a Virtual Private Network connection. AWS lets you manage your own Routing Tables for your VPC.
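The two example routes above can be sketched with Python's ipaddress module. The CIDR ranges and target names here are invented for illustration. The lookup picks the most specific matching route (the longest prefix), which is the usual selection rule in IP routing.

```python
import ipaddress

# Hypothetical routes: (destination CIDR, target)
ROUTES = [
    ("10.0.0.0/16", "local"),
    ("192.168.0.0/16", "vpn-connection"),
    ("0.0.0.0/0", "internet-router"),  # default route: everything else
]


def next_hop(destination: str, routes=ROUTES):
    """Return the target of the most specific route matching the destination."""
    address = ipaddress.ip_address(destination)
    matches = []
    for cidr, target in routes:
        network = ipaddress.ip_network(cidr)
        if address in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None
    # Longest prefix wins, so /16 routes beat the /0 default route.
    return max(matches)[1]
```

Traffic to 192.168.3.4 would go over the VPN route, while traffic to a public web site falls through to the default route.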

 

NACLs:

An Access Control List, commonly referred to as an ACL, defines permissions that are attached to an object. In the world of AWS you can attach network ACLs to Subnets, which will grant or deny protocols to and from various endpoints. ACLs can also be attached to S3 Buckets to control access to the objects they contain. Understanding ACLs is crucial to properly securing your environment.

Firewalls are systems, either software or hardware, that control the incoming and outgoing network traffic. You manage a set of rules to permit or block traffic based on endpoints and protocols. AWS implements this via Security Groups that can be attached to one or more EC2 Instances, to Elastic Load Balancers and more.
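Rule-based allow and deny decisions can be sketched as follows. The rule set is invented, and the evaluation order (ascending rule number, first match wins, deny when nothing matches) is stated here as an assumption modelled on how network ACLs are commonly described, not taken from the text above.

```python
import ipaddress

# Hypothetical rules: (rule number, protocol, port, source CIDR, action)
RULES = [
    (100, "tcp", 443, "0.0.0.0/0", "allow"),
    (200, "tcp", 22, "203.0.113.0/24", "allow"),
    (300, "tcp", 22, "0.0.0.0/0", "deny"),
]


def evaluate(protocol: str, port: int, source_ip: str, rules=RULES) -> str:
    """Return "allow" or "deny" for one inbound packet."""
    source = ipaddress.ip_address(source_ip)
    # Assumption: lowest rule number is checked first and the first match decides.
    for _, rule_protocol, rule_port, cidr, action in sorted(rules):
        if (
            rule_protocol == protocol
            and rule_port == port
            and source in ipaddress.ip_network(cidr)
        ):
            return action
    return "deny"  # nothing matched: deny by default
```

With these rules, HTTPS is open to everyone, while SSH is allowed only from the 203.0.113.0/24 range.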

 

Security Groups:

Security Groups are part of the first line of defense in securing your environment. A Load Balancer works to distribute traffic across a number of servers. It can be a physical or virtual resource. Traffic is directed to registered servers based on algorithms that typically seek an even load or a round-robin style distribution. A client may be directed to a different server on each request.
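Round-robin distribution is simple enough to sketch directly. The class and server names below are placeholders; a real Load Balancer would also run health checks and remove failed servers from rotation.

```python
import itertools


class RoundRobinBalancer:
    """Hands each incoming request to the next registered server in turn."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server must be registered")
        self._cycle = itertools.cycle(servers)

    def route(self, request):
        """Return the server that should handle this request."""
        return next(self._cycle)
```

With three registered servers, the fourth request lands on the first server again, so the same client can hit a different server on every call.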

 

Security, Credentials:

Access Credentials consist of information that is used for the purpose of authentication to systems and authorization of actions that can be performed within a system. Username Password Credentials are the most common type of Access Credentials used today. Another form is Certificate Credentials. AWS uses a combination of credentials ranging from the standard Username Password to Multi-Factor Authentication to Access Keys.

Public Key Encryption is a method of encrypting communications and ensuring that the messages they contain originated from the proper source and were not tampered with on the way to the destination. There are different techniques used in encrypting with Public Keys, each requiring the proper technology support from both the source of the message and the destination.

When two computers wish to communicate using an Asymmetric Key Encryption Scheme, they each share their Public Key with the other. The first computer encrypts its message using the Public Key of the second computer. The second computer then decrypts the message with its own Private Key. In fact, the only way to decrypt the message is with that Private Key. It is important to protect a Private Key and never share it; otherwise communication can be “spoofed” or intercepted by others.
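The encrypt-with-public, decrypt-with-private relationship can be shown with textbook RSA on deliberately tiny numbers. This is a toy for illustration only: the primes are far too small to be secure, there is no padding, and production code should rely on a vetted cryptography library rather than anything like this.

```python
# Toy parameters only: real keys use primes hundreds of digits long.
P, Q = 61, 53
N = P * Q                  # public modulus, shared by both keys
PHI = (P - 1) * (Q - 1)    # used only while generating the key pair
E = 17                     # public exponent, chosen coprime with PHI
D = pow(E, -1, PHI)        # private exponent (modular inverse, Python 3.8+)

PUBLIC_KEY = (E, N)        # safe to share with anyone
PRIVATE_KEY = (D, N)       # must never leave its owner


def encrypt(message: int, public_key=PUBLIC_KEY) -> int:
    """Anyone holding the public key can encrypt (message must be < N)."""
    exponent, modulus = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, private_key=PRIVATE_KEY) -> int:
    """Only the holder of the private key can recover the message."""
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)
```

A sender would call encrypt with the recipient's public key, and only the recipient's decrypt call recovers the original number.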

 

DNS:

DNS stands for Domain Name System; it is a naming system for accessing resources. It can be used to locate systems in both public and private networks by translating easy-to-remember names into numerical IP addresses. DNS is an essential tool in any infrastructure. Route 53 is Amazon’s DNS Service.

 

Data Consistency:

Eventual Consistency is exactly what it sounds like. It is a Consistency Model which states that data will eventually be consistent across a distributed service. Eventual Consistency is a major factor in ensuring distributed computing works properly. The Simple Storage Service was long the classic example of this model: you could update an object in an S3 Bucket, but a follow-up call to access the same object could show the previous version prior to the update. (Since December 2020, S3 itself provides strong read-after-write consistency within a region, but replication across regions and many other distributed services still behave this way.) Eventual Consistency guarantees the update will be made, but the timing depends on other factors outside of your control. You should know what this means for the design of your systems; otherwise it could cause confusion and frustration among your users.
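The stale-read behaviour can be simulated in a few lines. The class below is a made-up model with one primary copy and one replica, where replication happens only when replicate() is called; it is not how any particular AWS service is implemented.

```python
class EventuallyConsistentStore:
    """Writes land on the primary at once and reach the replica later."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self._pending = []  # writes not yet copied to the replica

    def put(self, key, value):
        self.primary[key] = value
        self._pending.append((key, value))

    def read_from_replica(self, key):
        """May return an older value until replication has caught up."""
        return self.replica.get(key)

    def replicate(self):
        """Apply all pending writes to the replica, in order."""
        for key, value in self._pending:
            self.replica[key] = value
        self._pending.clear()
```

A read issued between put() and replicate() returns the previous version, which is the situation a design has to tolerate, for example by showing users their own writes from a local copy.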

 

SQL / Relational Storage:

Relational Databases are one of the most common types of data storage used in applications today. A Relational Database consists of one or more tables each representing an entity. A table has columns that are considered properties available to describe the entity. Rows in the table represent instances of that entity type. Tables can be linked to each other, for example, an Order Table would link to an Order Items Table. AWS offers a service called RDS. It is a managed Relational Database for different database vendors.
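The Order and Order Items relationship can be tried out with Python's built-in sqlite3 module and an in-memory database. The table layout, column names, and sample rows are assumptions chosen for this sketch; a managed service such as RDS would host a different engine, but the relational ideas are the same.

```python
import sqlite3


def build_orders_db() -> sqlite3.Connection:
    """Create two linked tables in memory and insert a little sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (
            id       INTEGER PRIMARY KEY,
            customer TEXT NOT NULL
        );
        CREATE TABLE order_items (
            id       INTEGER PRIMARY KEY,
            order_id INTEGER NOT NULL REFERENCES orders(id),
            product  TEXT NOT NULL,
            quantity INTEGER NOT NULL
        );
        INSERT INTO orders (id, customer) VALUES (1, 'Alice');
        INSERT INTO order_items (order_id, product, quantity)
        VALUES (1, 'Widget', 2), (1, 'Gadget', 1);
        """
    )
    return conn


def items_for_order(conn: sqlite3.Connection, order_id: int):
    """Join the two tables to list every item belonging to one order."""
    cursor = conn.execute(
        """
        SELECT o.customer, i.product, i.quantity
        FROM orders AS o
        JOIN order_items AS i ON i.order_id = o.id
        WHERE o.id = ?
        ORDER BY i.id
        """,
        (order_id,),
    )
    return cursor.fetchall()
```

Each row in order_items points back to its order through order_id, which is the link between the two tables.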

 

NOSQL / Non Relational Storage:

Non-Relational Databases can be easily explained as data sources that are not relational. Common within this group are NoSQL Databases, which include document databases, Key-Value databases, graph databases, and more. DynamoDB, an AWS offering, is a Key-Value Database.

RESTful Web Services are HTTP- and HTTPS-based application programming interfaces that interact with other applications through a standard HTTP method such as GET, POST, PUT or DELETE. The client makes a request to a URI with any applicable input parameters. The server processes the request and returns a response that is consumed by the client. RESTful Web Services have gained popularity because they are considered simpler than their alternatives.
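The anatomy of a RESTful request (method, URI, optional body) can be sketched with the standard library's urllib.request. The base URL and the /orders/42 resource are hypothetical, and the helper only builds the request object; nothing is sent over the network.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical service


def build_request(method: str, path: str, payload=None) -> urllib.request.Request:
    """Assemble an HTTP request for a resource URI without sending it."""
    data = None
    headers = {}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


# The same URI, acted on by different HTTP methods:
read_order = build_request("GET", "/orders/42")
update_order = build_request("PUT", "/orders/42", {"status": "shipped"})
delete_order = build_request("DELETE", "/orders/42")
```

Passing one of these objects to urllib.request.urlopen would perform the call against a real endpoint.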

 

Scripting:

A common data format exchanged in these services is JSON. JSON stands for JavaScript Object Notation. It is a human-readable, open-standard data format that is easily generated and parsed by nearly all modern programming languages, not just JavaScript. If you are not already familiar with this format, you should become so. You should be able to read it, understand what it means, and manipulate and write it in JSON syntax.
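Reading, changing, and writing JSON takes only a few lines with Python's json module. The document below (a service name, a replica count, and some tags) is invented for the example.

```python
import json

SERVICE_JSON = '{"service": "orders", "replicas": 2, "tags": ["web", "prod"]}'


def scale(document_text: str, replicas: int) -> str:
    """Parse a JSON document, change one field, and serialize it again."""
    document = json.loads(document_text)   # JSON text -> Python dict
    document["replicas"] = replicas        # manipulate it like any dict
    return json.dumps(document, indent=2, sort_keys=True)  # dict -> JSON text
```

Objects become dictionaries, arrays become lists, and strings, numbers, true/false and null map onto their Python equivalents.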

Security Policies are just one of the many AWS artifacts that are written in JSON format. Without understanding the format, you run the risk of leaving systems exposed.
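To make the risk concrete, the sketch below parses a small policy document and flags statements that allow every action or every resource. The policy follows the familiar Version/Statement/Effect/Action/Resource layout, but the bucket name and the helper function are hypothetical, and the check is a simplification rather than a complete policy audit.

```python
import json

POLICY_JSON = """
{
  "Version": "2012-10-17",
  "Statement": [
    {"Effect": "Allow", "Action": "s3:GetObject",
     "Resource": "arn:aws:s3:::example-bucket/*"},
    {"Effect": "Allow", "Action": "*", "Resource": "*"}
  ]
}
"""


def as_list(value):
    """Policy fields may hold one string or a list of strings."""
    return value if isinstance(value, list) else [value]


def overly_broad_statements(policy_text: str):
    """Return the indexes of Allow statements that use a bare "*" wildcard."""
    policy = json.loads(policy_text)
    flagged = []
    for index, statement in enumerate(as_list(policy["Statement"])):
        if statement.get("Effect") != "Allow":
            continue
        actions = as_list(statement.get("Action", []))
        resources = as_list(statement.get("Resource", []))
        if "*" in actions or "*" in resources:
            flagged.append(index)
    return flagged
```

Here the second statement is flagged, since it grants every action on every resource.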