If you are in Calgary, you have probably heard about the Shaw building fire, which took servers down and caused major outages for Alberta Registries, Alberta Health Services, the City of Calgary and other organizations whose servers happened to be in the building.
That tells us that anything can happen, and it’s only a matter of time until you are affected one way or another. You may have a super-cool and super-powerful server, but if the data center is cut off from the grid, you are down.
Similar issues came up a couple of weeks ago with the US blackouts, and again, some major companies were affected.
This brings us to disaster planning for your online assets. Here are a few important points to consider.
Website recovery after disaster
It doesn’t matter what size your website is, it’s easy to keep a full data backup. Make sure all your data (files, databases, etc.) is backed up automatically. Also make sure the backup is not stored on the same server or hard drive, but transferred to a different location. Off-site backup is worth every penny, and it’s not that expensive. Just remember: it must run periodically (weekly or daily, depending on the type of data), and it must be automatic, so don’t rely on an employee downloading your files. If you have your data, you will be able to get up and running pretty fast even if your server exploded. There will be downtime while you set up a new server and restore the website, but it’s usually just a matter of hours.
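To make the "automatic, off-site" part concrete, here is a minimal Python sketch of a backup job. It is only an illustration under some assumptions: the function name `create_backup` is made up for this example, the website is assumed to live in a single folder, and `offsite_dir` is assumed to be a location that physically lives somewhere else (for example, a mounted remote drive). Databases would need their own dump step, which is not shown.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def create_backup(source_dir, offsite_dir, now=None):
    """Pack source_dir into a timestamped .tar.gz inside offsite_dir.

    offsite_dir is assumed to point at storage outside this server.
    Returns the path of the archive that was written.
    """
    source = Path(source_dir).resolve()
    target_dir = Path(offsite_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    # A timestamp in the file name keeps older archives from being overwritten.
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y%m%d-%H%M%S")
    archive = target_dir / f"{source.name}-{stamp}.tar.gz"

    with tarfile.open(archive, "w:gz") as tar:
        # arcname stores paths relative to the site folder, not the full path.
        tar.add(source, arcname=source.name)
    return archive
```

A script like this only helps if a scheduler (cron or similar) runs it for you, daily or weekly, so that nobody has to remember to do it.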
When having your website back up in a few hours is not good enough for you and your users, make sure every piece of your system comes with redundancy.
Hardware fails, it’s only a question of time, so make sure everything is duplicated. Even if you rent a dedicated server and the hosting company has technicians on site, it still takes time to diagnose and fix equipment. So, if you are serious about uptime, you need at least two servers running and sharing the load. This is also an opportunity to locate the second server at the other end of the country to protect against electrical blackouts, tornadoes, fires and other disasters. The cost of this kind of setup is going to be quite noticeable, so it’s your call how critical it is to have your system up all the time.
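As a rough illustration of what "two servers" buys you, here is a small Python sketch that checks a list of servers in order and picks the first one that responds. Note that this is simple failover selection, not real load sharing, which is normally handled by a load balancer or DNS. The names `is_healthy` and `pick_server` and the example URLs are made up for this sketch.

```python
import urllib.error
import urllib.request


def is_healthy(url, timeout=3.0):
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error or malformed URL: treat as down.
        return False


def pick_server(servers, check=is_healthy):
    """Return the first server that passes the health check, or None."""
    for url in servers:
        if check(url):
            return url
    return None


# Hypothetical usage: primary in one city, secondary across the country.
# active = pick_server(["https://calgary.example.com/", "https://toronto.example.com/"])
```

The `check` parameter is there so the selection logic can be tried out with a fake health check, without touching the network.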
The good news is that with the growing popularity of cloud services and virtual servers, you have easier and less expensive ways to keep additional servers running around the world. Big cloud providers also make it easy to launch more servers on demand if you need them.
Another important thing I learned (the hard way) is that even if you have servers ready, data backed up and everything in order, it doesn’t mean there are no problems. There were situations when backups were missing some critical files because those files were stored somewhere else. In other cases backups were not executed on schedule (for a variety of reasons), so when a problem happened, the hard discovery was that the backups were actually old. Secondary servers fail to switch over on failure. Employees don’t know how to perform a recovery. The list goes on and on. So, run a fire drill: try to restore from a backup once in a while, or switch over to the alternate server. You may be surprised to discover that some key pieces are missing or unreliable. It’s a good idea to repeat the drill regularly, just to confirm everything is working and up to date.
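Part of such a fire drill can be automated. The Python sketch below checks two of the failures mentioned above: a backup that is older than expected, and a backup that is missing critical files. It assumes the backup is a tar archive, and the function name `check_backup`, the 24-hour limit and the file names in the usage line are made up for illustration. It does not replace an actual restore onto a spare server.

```python
import tarfile
import time
from pathlib import Path


def check_backup(archive_path, required_files, max_age_hours=24, now=None):
    """Return a list of problems found; an empty list means the checks passed."""
    archive = Path(archive_path)
    if not archive.exists():
        return [f"backup not found: {archive}"]

    problems = []

    # Stale backup: compare the file's modification time with the current time.
    current = time.time() if now is None else now
    age_hours = (current - archive.stat().st_mtime) / 3600
    if age_hours > max_age_hours:
        problems.append(f"backup is {age_hours:.0f} hours old (limit {max_age_hours})")

    # Incomplete backup: every critical file must be listed in the archive.
    try:
        with tarfile.open(archive) as tar:
            names = set(tar.getnames())
    except (tarfile.TarError, OSError, EOFError) as exc:
        return problems + [f"backup unreadable: {exc}"]

    for name in required_files:
        if name not in names:
            problems.append(f"missing from backup: {name}")
    return problems


# Hypothetical usage:
# print(check_backup("/mnt/offsite/site.tar.gz", ["site/index.html", "site/db.sql"]))
```

Returning a list of problems instead of stopping at the first one means a single run shows everything that needs attention.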
Lesson summary and action plan
Even if you have your website hosted in a state-of-the-art data center, something may go wrong. Take a moment to make sure you know where your data is, what you will do if the data center goes offline, how that will affect your business and what can be done to avoid extended downtime.