Many of you will have heard in the news recently about the catastrophic effect that human error can have both on your business and the service you provide your customers. Given how important it is to ensure that your business avoids unplanned downtime, it’s vital to protect against fundamental human error and other simple factors that could lead to outages such as power failures.
With our extensive experience of looking after other organisations’ business-critical IT, it goes without saying that this is something we think about a lot at Datum. So, with recent events in mind we thought we’d share our thoughts on best practices we have observed to eradicate these risks, and ensure that your business is never impacted by simple carelessness:
- Establish full resilience – it should be impossible for one act to bring your systems down, particularly when it’s a mistake. Put backup plans in place, including replicas of your setup and copies of your data, as well as layers of resiliency for things like power and network connectivity, so that one failure or error cannot grind things to a halt. Once you’ve done that, test them regularly to ensure that they work.
- Examine all possible single points of power/connectivity failure – as touched on above, you need to make sure that there are no single points of failure within your setup. If power supply or internet connection ‘A’ fails or is interrupted, ‘B’ should always be ready to kick in – ‘B’ being a backup version of essentially the same thing from a different source/provider.
- Check out the claims of potential suppliers – if they say their solution is resilient, don’t just take their word for it – check it out. The risk to the business is too great not to carry out due diligence. Review their accreditations and their track record, examine their maintenance schedules and talk to existing clients to confirm that they take a rigorous approach to uptime.
- Prioritise disaster recovery (DR) – it may sound obvious, but you should always have a rigorous disaster recovery plan in place. This plan should be regularly reviewed, tested and updated so that, should you ever need it, it will work. You need copies of everything vital to your operation in multiple locations, as well as layers of resiliency in each location, so that whatever the problem, you and your customers can access all important data while the issue is being rectified. You also need to know how, depending on the issue, you would make that data readily available to you and your systems.
- Review the age of your technology – as technology gets older, it becomes more likely to fail if it isn’t maintained and DR plans aren’t tested regularly. Too many people are still content to believe that you invest in technology once and then leave it alone – that should never be the case. If you do not maintain and update systems, it’s far more likely that an outage or security breach will be caused by a single operator error or another avoidable point of failure.
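For the more technically minded, the single-point-of-failure check described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the `Feed` class, the `single_points_of_failure` function and every provider name in the inventory are hypothetical, and a real audit would draw on your actual asset records rather than a hand-written list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feed:
    """One supply of a dependency, e.g. a power feed or an internet link."""
    name: str
    provider: str


def single_points_of_failure(inventory: dict[str, list[Feed]]) -> list[str]:
    """Return the dependencies that rely on fewer than two distinct providers."""
    flagged = []
    for dependency, feeds in inventory.items():
        # Two feeds from the same provider still share a point of failure,
        # so count distinct providers rather than feeds.
        providers = {feed.provider for feed in feeds}
        if len(providers) < 2:
            flagged.append(dependency)
    return sorted(flagged)


# Hypothetical inventory; every name here is made up for illustration.
inventory = {
    "power": [Feed("A", "grid-supplier-1"), Feed("B", "grid-supplier-2")],
    "internet": [Feed("A", "carrier-1"), Feed("B", "carrier-1")],
    "backups": [Feed("A", "site-1")],
}

print(single_points_of_failure(inventory))
```

With this made-up inventory the check flags `backups` (only one copy) and `internet` (two links, but both from the same carrier), while `power` passes because ‘A’ and ‘B’ come from different suppliers.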
For many businesses, the solution to addressing all of this in a cost-effective and easy way is to outsource the management and storage of their data. In fact, when we conducted research into this area with IDG recently, businesses confirmed that DR and business continuity are the second biggest reasons they choose to move their IT to a colocation provider.
You can understand their reasoning: it relieves them of a considerable burden, placing their trust, and the responsibility for these areas, in a provider who is (or should be!) an expert in their field and can ensure that every angle is covered. It’s also a lot more cost-effective.
If this is something that you need to be thinking about, download the IDG report and find out why colocation could be the answer.