What is Hybrid Cloud?

The IT industry likes to make up and appropriate terms. We are, however, not good at agreeing on what they actually mean. This leads to heated discussions and crazy assumptions, all because we never defined the term and are talking about different things.

So with that in mind, I am going to coin my own definition of a term, so that when I talk about it you will at least know what I mean. Well, hopefully.

What is Hybrid Cloud?

When I talk about Hybrid Cloud I am really talking about using a combination of On Premises Infrastructure (e.g. located in my Datacenter, or in a Co-Location facility where I rent space) and Public Cloud (e.g. Hyperscalers like AWS or Azure) at the same time.

This doesn’t need to mean the same application/workload, nor anything else particularly clever. It is just about making the most of the platforms you have available to you.

So how do you then decide what workload goes where?

This is the real challenge. A lot of organisations decided that it would be easier to go all-in, i.e. pick up all their On Premises workloads and ‘Lift and Shift’ them to Public Cloud. When you do this without re-factoring your applications (re-writing how they work and interact with the infrastructure) to take advantage of the environment they are running in, it tends to cost you a fortune and doesn’t give you any of the benefits.

So really it comes down to a few key considerations:

- Technical Benefits
  - Scale - both up and down
  - Geographical Benefits - Locating infrastructure closer to users
  - Specific Technical Capability - Consuming a service rather than building it myself
- Cost Advantage
- Regulatory Requirements

If you already have the workload running On Premises on Infrastructure that is paid for, in a Datacenter that you have already costed, and there is no cost or technical advantage to move, then you probably shouldn’t move it. If however, you need to get rid of your Datacenter or replace your infrastructure, then you should factor that into your Cost Advantage category, and look for ways to leverage the technical benefits - which will likely get you a Cost Advantage by doing things in a ‘Cloud Native’ way.
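The considerations above can be sketched as a small decision helper. This is only an illustration under stated assumptions: the `Workload` fields, the `suggest_placement` function and the idea of comparing two monthly cost figures are all hypothetical names and simplifications, not a formal method.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a workload; every field is an assumption."""
    name: str
    regulator_requires_on_prem: bool  # Regulatory Requirements
    needs_elastic_scale: bool         # Technical Benefit: scale up and down
    needs_global_reach: bool          # Technical Benefit: closer to users
    needs_managed_service: bool       # Technical Benefit: consume, don't build
    on_prem_monthly_cost: float       # include datacenter/hardware replacement if due
    cloud_monthly_cost: float         # estimate for a cloud-designed version


def suggest_placement(w: Workload) -> str:
    """Return 'on-premises' or 'public-cloud' from the three considerations."""
    # Regulation is treated as a hard constraint, so it is checked first.
    if w.regulator_requires_on_prem:
        return "on-premises"
    technical_benefit = (
        w.needs_elastic_scale or w.needs_global_reach or w.needs_managed_service
    )
    cost_advantage = w.cloud_monthly_cost < w.on_prem_monthly_cost
    # With no technical or cost advantage there is no reason to move.
    if technical_benefit or cost_advantage:
        return "public-cloud"
    return "on-premises"
```

In practice the cost figures are the hard part: the on-premises number should include the datacenter or hardware replacement when that is due, as described above.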

Should I always design for cloud?

When you are moving a workload to cloud, you should always design the workload for that environment. This is because a number of things are quite different between Public Cloud and On Premises environments. The biggest one, and one that is frequently overlooked, is the availability of the underlying infrastructure.

When someone builds an On Premises infrastructure, traditionally High Availability is part of the underlying infrastructure design and is abstracted away from the apps being deployed.

As an application owner I will probably deploy something like a 3-tier app. A 3-tier application typically consists of 3 layers:

Presentation Tier - Typically some kind of Web Front End that users interact with.
Application/Logic Tier - This is the main code of the application and typically handles the processing and business logic.
Data Tier - This is typically a Database that contains all the data.

Using a VMware deployment as an example, it might look something like:

[Figure: 3 Tier On-Premises Diagram]

In this diagram the bottom two layers are effectively physical hardware, the lowest being some kind of shared storage which contains an Operating System (OS) disk (the configuration of the compute and the code to run it). The Data Tier also has a database disk, which contains the data. The top layer is logical, in the sense that it shows the machines in operation, each running on top of a physical host in the vSphere cluster.

In this scenario, if there is a physical failure of one of the hosts, the virtual machine (VM) will restart on another one of the physical hosts and continue operating.
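As a rough illustration of that behaviour, the sketch below models a cluster as a mapping of VM name to host and moves the VMs of a failed host onto the survivors. It assumes a simple "least loaded host" rule and made-up host names; it is not how vSphere HA actually selects a host, only a way to show that the application itself does nothing during the restart.

```python
from __future__ import annotations


def restart_failed_vms(
    placement: dict[str, str], failed_host: str, healthy_hosts: list[str]
) -> dict[str, str]:
    """Return a new VM -> host mapping after failed_host is lost."""
    if not healthy_hosts:
        raise RuntimeError("no healthy hosts left in the cluster")
    new_placement = dict(placement)
    # Count VMs per surviving host so restarts go to the least loaded one.
    load = {host: 0 for host in healthy_hosts}
    for host in placement.values():
        if host in load:
            load[host] += 1
    for vm, host in sorted(placement.items()):
        if host == failed_host:
            target = min(healthy_hosts, key=lambda h: load[h])
            # The VM's disks live on shared storage, so only its host changes.
            new_placement[vm] = target
            load[target] += 1
    return new_placement


# Hypothetical three-host cluster running the 3-tier app.
before = {"web": "esx1", "app": "esx2", "db": "esx1"}
after = restart_failed_vms(before, "esx1", ["esx2", "esx3"])
```

The point of the sketch is the division of labour: the infrastructure decides where each VM restarts, and the three tiers are unaware that anything happened beyond a reboot.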

If you use some kind of storage replication of the bottom layer, and have another set of infrastructure at another site, then you can even restart the VM at the other site - giving you a DR capability to protect against site failure.

If I was to deploy the same 3-tier application in AWS, it would look more like:

[Figure: 3 Tier AWS Diagram]

In this diagram, we don’t even look at physical equipment. We have a region, and in that region we have availability zones (AZs). An AZ is similar to a datacenter in the physical world, so by using multiple AZs we gain something similar to our cross-datacenter DR from above - but here we treat it more as an availability measure.

The top layer is the Web/Presentation Tier. Unlike the VMware On-Premises example, the disk isn’t considered the identity of the system. Instead we have an image, of which we can create multiple copies in the form of EC2 instances (the AWS equivalent of the VMs above) in response to load or availability measurements. This means we can scale both up and down in response to load, and automatically replace instances that have failed, which we detect with health checks. We call this an auto scaling group and it spreads across two or more availability zones. In front of it we have an elastic load balancer which distributes the load between the instances.
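The idea behind an auto scaling group can be sketched as a reconcile loop: drop instances that fail their check, then launch or terminate copies of the image until the desired count is met, keeping the zones balanced. This is a toy model, not the AWS API; `Instance`, `reconcile`, the zone names and the `new-N` ids are all hypothetical, and every instance is assumed to sit in one of the listed zones.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from itertools import count

_new_ids = count(1)  # source of ids for instances launched by this sketch


@dataclass(frozen=True)
class Instance:
    instance_id: str
    az: str
    healthy: bool = True


def reconcile(instances: list[Instance], desired: int, zones: list[str]) -> list[Instance]:
    """Return the instance list after one pass of the toy auto scaling logic."""
    # 1. Drop instances that failed their health check.
    running = [i for i in instances if i.healthy]
    per_az = Counter({zone: 0 for zone in zones})
    for inst in running:
        per_az[inst.az] += 1
    # 2. Scale out: launch a copy of the image in the emptiest zone.
    while len(running) < desired:
        az = min(zones, key=lambda z: per_az[z])
        running.append(Instance(f"new-{next(_new_ids)}", az))
        per_az[az] += 1
    # 3. Scale in: terminate an instance from the fullest zone.
    while len(running) > desired:
        az = max(zones, key=lambda z: per_az[z])
        victim = next(i for i in running if i.az == az)
        running.remove(victim)
        per_az[az] -= 1
    return running


zones = ["az-a", "az-b"]
fleet = [Instance("i-1", "az-a"), Instance("i-2", "az-b", healthy=False)]
fleet = reconcile(fleet, desired=2, zones=zones)  # replaces the failed instance
```

Because every instance is just a copy of the same image, replacing a failed one and adding capacity under load are the same operation, which is what makes this tier disposable in a way the on-premises VMs were not.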

The middle layer operates in the exact same way as the top layer, but the image used to spawn the instances is the application server image.

The layer at the bottom is a little different: we have fixed instances here, as we need to run a database. Because the data needs persistence, we can’t just spin instances up and down at will. So we have one instance in each AZ, and they run active/passive, meaning only one instance is serving data while the other waits to take over. Both have an up-to-date copy of all the data.

And that is only the design if I don’t want to change too much or totally refactor the application.
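A minimal sketch of the active/passive idea follows. It assumes a hypothetical `DbNode` record and a `fail_over` helper, and it takes for granted that the passive node already holds an up-to-date copy of the data; how that replication is achieved is outside the sketch.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DbNode:
    az: str
    role: str  # "active" or "passive"
    healthy: bool = True


def fail_over(nodes: list[DbNode]) -> DbNode:
    """Return the node that should serve data, promoting the standby if needed."""
    active = [n for n in nodes if n.role == "active"]
    if active and active[0].healthy:
        return active[0]  # nothing to do, the active node is fine
    standby = next((n for n in nodes if n.role == "passive" and n.healthy), None)
    if standby is None:
        raise RuntimeError("no healthy node available to serve data")
    for node in active:
        node.role = "passive"  # demote the failed node
    standby.role = "active"    # promote the standby in the other AZ
    return standby


# Hypothetical pair, one database instance per availability zone.
pair = [DbNode("az-a", "active"), DbNode("az-b", "passive")]
pair[0].healthy = False  # simulate the active instance (or its AZ) failing
serving = fail_over(pair)
```

Unlike the web and application tiers, nothing new is launched here: the standby already exists and is simply promoted, which is why these instances are fixed rather than disposable.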

If I don’t do the autoscaling, load balanced cross Availability Zone thing (which people often forget), then when one of my EC2 instances fails… which it will… or when an AZ has an outage, which happens often enough, then my App is no longer accessible.
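To put a rough number on why the cross-AZ part matters, here is a small calculation. It assumes zone failures are independent and that the app stays reachable as long as at least one zone is up; the 0.99 figure is purely illustrative and is not a quoted SLA.

```python
def combined_availability(per_zone: float, zones: int) -> float:
    """Availability of an app that survives while any one zone is up."""
    if not 0.0 <= per_zone <= 1.0:
        raise ValueError("per_zone must be between 0 and 1")
    if zones < 1:
        raise ValueError("need at least one zone")
    # The app is down only when every zone is down at the same time.
    return 1.0 - (1.0 - per_zone) ** zones


single = combined_availability(0.99, 1)    # one zone, illustrative figure
two_zones = combined_availability(0.99, 2)
```

Under those assumptions a single zone at 0.99 becomes roughly 0.9999 across two zones. Real outages are not perfectly independent, so treat this as an upper bound rather than a promise.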

Basically, you have to do the same kind of work to make your infrastructure robust and highly available that you would do in your On Premises environment - but it is easier to do if you are an infrastructure person. Unfortunately, this frequently has to be done by an app person who doesn’t always have the background or understanding of the techniques, since they are used to just deploying their apps and letting the Infrastructure team handle the rest.

Can I just move the stuff that makes sense?

If you get the networking right, you can start to move just the things that make sense. Using my 3-tier web app example above, I probably still want to do the Web and App Tiers as I did above, load balanced and autoscaling, since they are consumer facing (say my internet banking front end). But maybe I have to keep my Database on-premises since my regulator says systems of record cannot be moved to public cloud. Now with the right networking I get the best of both worlds: an internet facing, auto-scaling front end that is modernised, connecting back to a regulated, existing data back end. In this case you need to put the work into securing it all the way through, and also into ensuring resiliency all the way from cloud to your existing back end. To me this is the true value of Hybrid Cloud.

But what if I want to use more than one Public Cloud provider at the same time?

This is what I like to call a Hybrid Multi-Cloud Environment. With a properly connected infrastructure extended into public cloud, you can start to imagine an infrastructure where you use what you want, when you want. You have the capability to scale your front end in whichever Hyperscaler has the lower price at that point in time to handle transient workloads, or to leverage a new Machine Learning algorithm that has become available from one of the providers. However, there are challenges in doing that.

If you like what you have read above and are interested in more about Hybrid Multi-Cloud and how to actually try to make this work, then come check in every couple of weeks. I intend to go through more theory and also some practical articles on how to get this all working utilising more than one cloud provider.
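The "lower price at that point in time" idea reduces to a very small piece of logic, sketched below. The provider names and prices are placeholders, `pick_provider` is a hypothetical helper, and the sketch deliberately ignores the challenges mentioned above: it only compares an hourly instance price.

```python
from __future__ import annotations


def pick_provider(
    hourly_price: dict[str, float], instances: int, hours: float
) -> tuple[str, float]:
    """Return (provider, estimated cost) for the cheapest provider right now."""
    if not hourly_price:
        raise ValueError("need at least one provider price")
    # On a tie, the provider listed first wins.
    provider = min(hourly_price, key=lambda p: hourly_price[p])
    return provider, hourly_price[provider] * instances * hours


# Placeholder prices per instance-hour; real prices change over time.
prices = {"provider-a": 0.10, "provider-b": 0.08}
choice, cost = pick_provider(prices, instances=10, hours=2)
```

The selection is the easy part. Making the front end able to run in either provider, with connectivity back to the same back end, is where the real work lies.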