Disasters Don’t Always Happen In One Hit – Are You Prepared For Escalation?
When we think of this day in 2001, everyone’s mind naturally turns to the devastation felt across the world. It was a day of irreplaceable human loss, and on its 12th anniversary I’m reminded of the effect the disaster had in wider terms. When I remember the days of trauma that followed, I also remember our struggles to support the business and keep it running. Our biggest challenge was diesel – but what does that have to do with IT?
In an emergency the data centre will run on backup power from diesel generators, and the fuel will last about a day. Most business people will not even be aware of this kind of fail-over; in the worst case the data centre will shut down. Planning for orderly shut down (see our previous blog) is an essential part of BCP in the real world. In practice, sudden catastrophic failure is unusual; more often there is a long, protracted disaster in the making.
Take 9/11: possibly one of the hardest periods to manage in recent history for many CIOs. Many organisations, like the one I was managing, found themselves executing real emergency plans. Not only did they have to keep their critical systems online and their business operations running (probably with people dialling in over their home connections), but they also had to keep the systems alive in the data centres to do it. Most banks and institutions have their data centres well away from their offices, so losing the office site should (in theory) not be an issue for the core systems, but on 9/11 the power cuts went on for days and road closures prevented access for people and teams. The biggest issue we faced was running on diesel… in fact we think we were within twenty minutes of losing all power from the emergency generators after four days of running only on backup power. To extend the life of the generators we had carried out just this kind of orderly shut down of non-essential systems, in a hurry, to conserve fuel and reduce load, stretching a day’s diesel to nearly four days. Even having two data centres did not help – both were in the same state in this widespread emergency.
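To make the fuel arithmetic concrete, here is a minimal sketch of how shedding load stretches generator runtime. It assumes a deliberately simple model in which fuel burn scales linearly with load, plus an optional fixed idle burn; real generators have their own consumption curves, so check the manufacturer’s figures. The function names and all the numbers (tank size, burn rate) are hypothetical and chosen only for illustration, not taken from our data centres.

```python
def runtime_hours(fuel_litres, full_load_burn_lph, load_fraction, idle_fraction=0.0):
    """Estimate hours of generator runtime for a given load.

    Assumed model (a simplification): hourly burn is a fixed idle share of
    the full-load burn plus a share that scales linearly with load.
    """
    if not 0.0 <= load_fraction <= 1.0:
        raise ValueError("load_fraction must be between 0 and 1")
    if not 0.0 <= idle_fraction < 1.0:
        raise ValueError("idle_fraction must be at least 0 and below 1")
    burn_lph = full_load_burn_lph * (idle_fraction + (1.0 - idle_fraction) * load_fraction)
    if burn_lph <= 0:
        raise ValueError("burn rate must be positive")
    return fuel_litres / burn_lph


def load_fraction_for_target(fuel_litres, full_load_burn_lph, target_hours, idle_fraction=0.0):
    """Return the load fraction needed to make the fuel last target_hours.

    A result below 0 means the target cannot be met even with no load.
    """
    affordable_share = fuel_litres / (target_hours * full_load_burn_lph)
    return (affordable_share - idle_fraction) / (1.0 - idle_fraction)


if __name__ == "__main__":
    # Hypothetical tank: 2400 litres, burning 100 litres/hour at full load.
    print(runtime_hours(2400, 100, 1.0))              # full load
    print(runtime_hours(2400, 100, 0.25))             # after shedding load
    print(load_fraction_for_target(2400, 100, 96))    # load needed for 4 days
```

With these illustrative figures a tank that lasts one day at full load lasts four days at a quarter of the load. The point of running the numbers in advance is to know which systems must be shut down, and how early, to reach a given number of days.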
The moral of the story is that BCP needs to be taken seriously. A disaster is not usually the loss of one site; it can be anything from a few systems or servers, or a few racks, to a full disaster lasting days, like 9/11. Ask serious questions about plans for all scenarios. In practice organisations are good at small-scale failures like the loss of one or two servers, as these happen periodically. They also plan for the big one, the loss of a whole site, but reality is often more complex and falls somewhere between the extremes.