As the skies cleared and the storm subsided, Sydney licked its wounds, assessed the damage and moved on with its life.
Damage wasn’t limited to physical infrastructure. Many technology systems suffered outages, including significant parts of the card processing network. AWS, the premier public Cloud provider, suffered a major power loss at one of its three availability zones in the Sydney region, the first major incident since the region came online in late 2012. You can put your mortgage on the fact that this incident is getting attention at the highest levels within AWS. The truth is, though, that if you had implemented a solution in AWS that was multi-availability zone or multi-region, your customers would have been unaffected.
It’s a timely reminder to all of us that disaster recovery and the durability of your environments and your data need to be front of mind in everything that you do. This is true in the traditional world of on-premises data centres and co-location, and it is just as true in the new normal of the public Cloud. The difference now is that organisations such as AWS provide the services and capability that your organisation needs to meet the availability requirements of your business. You don’t need to invent and pay for the solution anymore; you just need to design for failure using the suite of AWS services that are available.
It’s easy to throw out disaster recovery thinking when you decide to adopt a Cloud-first policy, but the truth is that one of the paradigms of the public Cloud IaaS world is that you need to design for failure. Here are five things to think about in relation to availability when adopting the public Cloud for your organisation.
- Design for failure – Design your solutions to take advantage of the capabilities of the public Cloud to protect against failure: multiple availability zones, auto scaling groups, multi-region database services and many others. AWS offers solutions that would traditionally have cost organisations hundreds of thousands, if not millions, of dollars, all available on a “pay for use” basis (a minimal multi-availability-zone sketch follows this list);
- De-couple everything – Simple is best. Don’t make your environment complex; simplify by separation and decoupling. Ensure components have minimal dependencies so that they can be restarted, re-created or scaled as required without the need to re-code or reconfigure (see the queue-based sketch after this list);
- Skeletons in the closet – Having a pessimistic view of what can go wrong or will go wrong is no bad thing. Just make sure you balance the “what if?” with the “can do” so that you can still deliver cost effective and timely solutions to your business;
- Data locality – The default position for many organisations is that data needs to remain “in country”. Is that really true for your organisation, or for all the data you hold? If you can loosen that bind, it opens up the opportunity to run your workloads in an alternate region in the event of a disaster;
- Test – Test, test and test again. You wouldn’t jump out of a plane without having a lot of faith in the parachute – you’ve got a parachute, right? Similarly, if you don’t test your failure plan, how do you know it will work? It should be possible to test critical services weekly, and in many cases daily (a simple failover drill is sketched after this list).
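To make the first point concrete, here is a minimal sketch, using Python and the boto3 SDK, of an auto scaling group spread across subnets in multiple availability zones in the Sydney region. The group name, launch configuration name and subnet IDs are placeholders for illustration, not a reference implementation:

```python
import boto3

autoscaling = boto3.client("autoscaling", region_name="ap-southeast-2")

# Placeholder names: "web-fleet" and "web-server" stand in for your own
# auto scaling group and launch configuration.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-fleet",
    LaunchConfigurationName="web-server",
    MinSize=2,
    MaxSize=6,
    DesiredCapacity=2,
    # One subnet per availability zone: losing a zone still leaves capacity
    # running in the others, and the group replaces any lost instances.
    VPCZoneIdentifier="subnet-aaaa1111,subnet-bbbb2222,subnet-cccc3333",
    HealthCheckType="EC2",
    HealthCheckGracePeriod=300,
)
```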
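For decoupling, a queue is one of the simplest patterns: producers put work onto a queue and workers consume it at their own pace, so either side can be restarted, replaced or scaled without touching the other. A sketch using Amazon SQS follows; the queue name and the handle_order function are hypothetical:

```python
import boto3

sqs = boto3.client("sqs", region_name="ap-southeast-2")


def handle_order(body):
    # Hypothetical processing step; in practice this does the real work.
    print("processing", body)


# Producer side: drop work onto a queue instead of calling the worker directly.
queue_url = sqs.create_queue(QueueName="orders")["QueueUrl"]
sqs.send_message(QueueUrl=queue_url, MessageBody='{"order_id": 42}')

# Worker side: pull messages when ready and delete them only after successful
# processing, so a crashed worker simply leaves the message to be retried.
response = sqs.receive_message(
    QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=10
)
for message in response.get("Messages", []):
    handle_order(message["Body"])
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
```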
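And for testing, the point is to rehearse failure deliberately rather than wait for a storm to do it for you. One simple drill, sketched below under the same placeholder group name, is to terminate an instance in an auto scaling group and confirm through your monitoring that the group replaces it without customers noticing:

```python
import boto3

autoscaling = boto3.client("autoscaling", region_name="ap-southeast-2")

# Find one running instance in the group (placeholder name "web-fleet").
group = autoscaling.describe_auto_scaling_groups(
    AutoScalingGroupNames=["web-fleet"]
)["AutoScalingGroups"][0]
victim = group["Instances"][0]["InstanceId"]

# Terminate it without reducing desired capacity, so the group must self-heal.
autoscaling.terminate_instance_in_auto_scaling_group(
    InstanceId=victim,
    ShouldDecrementDesiredCapacity=False,
)
print(f"Terminated {victim}; a replacement should appear shortly.")
```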
With the public Cloud, it has never been easier, cheaper or faster to get started in a way that is more secure than ever.
Getting started? If this is not something you’ve done 100 times, call RedBear for ideas, good advice, and a partner for whatever stage you are at in your journey to the Cloud.