Don’t panic
By Eliot Beer
Sun 08 Jun 2008 04:00 AM

Many enterprises still shy away from business continuity and disaster recovery planning, despite its vital importance to the business. ACN asks industry experts for tips on the best ways to start planning or revising a business continuity plan.

Business continuity is always a tricky issue for enterprises to face, requiring as it does a long, hard look at the innermost workings of an organisation, and a painstaking analysis of where everything could go wrong.

However, this analysis is now more vital than ever - in an on-demand world, few, if any, businesses are able to survive significant downtime or data loss.

"The biggest problem is not getting information - the biggest problem is getting started, and having enough manpower available for that starting point.

This is where psychology kicks in - we still see a lot of CIOs and also business leaders who just don't want to know about it, who feel they can get away with not knowing about business continuity issues - which obviously doesn't work, but you'll still find a lot of that way of thinking: don't tell me about it, because I can't do anything about it anyway," says Dr Hannes Lubich, head of BT's security practice in EMEA.

"The easiest way of starting is getting through that layer and saying yes, we have to know about it - it's easier and better for us to face it, even if there's some unwanted truth in this, even if we see something that we don't like and wouldn't tell our shareholders. But you still have to know what it is," he adds.

Lubich's thoughts are echoed by Omar Dajani, regional manager for systems engineering at Symantec: "From my perspective, talking to customers, the main factor is the perception that it's highly complex.

Generally, we want to resolve the simpler issues first - so if a server's not performing very well, let's put our resources behind making that server perform well; we want to increase market share, so let's introduce some new infrastructure.

There's a feeling that it's so complex that somehow it just gets delayed - things are running fine now, so why the sense of urgency?

"In reality it's not that complex - it doesn't need to be driven by high-end technology.

Sometimes I sit down with IT managers and say: ‘Let's just sit down with a blank sheet of paper and have a couple of hours' workshop. Let's just take a look at your applications and data, and prioritise them,'" explains Dajani.

When it comes to the differences between the Middle East and the rest of the world in approaches to business continuity and disaster recovery, views differ on how marked these are in reality.

BT's Lubich sees the same problems in the region as in the rest of his EMEA territory; Dajani, on the other hand, while acknowledging a similar end result, sees very different underlying causes.

"In the West, there's a lot of old infrastructure, and they're trying to maximise the benefit before the move to new hardware.

There's an awareness - there are more C-level people that sit on committees, there's more processes and procedures - there's more of an awareness and more of a sense of urgency in the West, but the delay's based on a lack of investment in new infrastructure.

Whereas in the Middle East, we do have the modern infrastructure, but there's a lack of awareness - we're a few years behind in our knowledge of what disaster recovery is. So we're catching up, but I don't think we're catching up fast enough," Dajani states.

This is echoed - albeit with a more positive sentiment - by Thomas Luquet, business development manager for NEC's EMEA Enterprise Computing division: "What we can see in the Middle East, from customers who have a strong interest in BC/DR, is that companies here are very open to the latest new technologies.

There's often no existing DR infrastructure, and it's very different from other countries and regions - customers want the very latest and best technologies to go to disaster recovery."

Testing times

Almost one in two tests of disaster recovery plans fail, according to Symantec research, suggesting that many plans out there will not be effective if used for real - which 48% of plans are, again according to the vendor's research.

The reasons for failure are broken down as follows:

• Technology does not do what it is supposed to: 22%

• People do not do what they are supposed to: 19%

• Disaster recovery processes turn out to be inappropriate: 18%

In more positive news for the Middle East, the region is ahead of the curve when it comes to frequency of testing, with companies questioned giving their plans a run-through every 4.4 months, compared to every 8.1 months in the USA and every 10.2 months in Germany.

The Middle East had mixed results for disaster recovery times - the average estimate to recover skeleton operations after a fire was 1.3 days, one of the longer timescales (US average was 0.3 days).

To re-establish complete operations, Middle Eastern companies estimated they would need 5.9 days - dramatically less than the 30.6 days for the US and 51.6 days for the UK. This is most likely because the Middle Eastern respondents are smaller or less complex organisations than their counterparts elsewhere.

What do you want?

It may sound simple, but the first critical step in planning BC or DR procedures is to decide what is needed in the event of a disaster: "You ask what is the application that brings you revenue; what is the application that, in the event of a total system failure, you want to bring up first; where is your data located; how much data do you have," as Dajani puts it.
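
Dajani's questions map naturally onto a simple worksheet. The sketch below is purely illustrative - the application names and figures are invented - and ranks applications the way his questions suggest: revenue-earners first, then by how quickly each must be brought back:

```python
# Illustrative only: a minimal application-priority worksheet of the kind
# Dajani describes. All names, data volumes and downtime limits are
# hypothetical placeholders, not figures from the article.
applications = [
    # (name, earns revenue?, data volume in GB, max tolerable downtime in hours)
    ("core-billing",    True,  500,  2),
    ("customer-portal", True,  120,  4),
    ("intranet-wiki",   False,  30, 72),
]

# Bring-up order after a total system failure: revenue-generating
# applications first, then the shortest tolerable downtime.
recovery_order = sorted(applications, key=lambda app: (not app[1], app[3]))

for rank, (name, revenue, size_gb, max_down_h) in enumerate(recovery_order, 1):
    print(f"{rank}. {name}: revenue={revenue}, data={size_gb} GB, "
          f"restore within {max_down_h}h")
```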

There is a significant difference at this point between regulated industries such as those in the financial or health sectors, and other less tightly-controlled sectors.

As BT's Lubich points out - from his Swiss, and therefore heavily regulated, base - in a bank, the first step is to speak to the Chief Risk Officer or equivalent executive, and follow the steps.

"However, in a non-regulated industry it's different, because you usually don't have those roles," says Lubich.

"Then you will have to find the business stakeholder and do scenario planning with them: tell them to close their eyes and imagine the IT room isn't there, and ask: ‘How much of your business can you sustain, and for how long? How much do you need by when, and how much are you willing to pay for it?'

But then it becomes more difficult, because you're missing that pressuring element of compliance, and being able to prove that you're resilient in your IT operations and business operations."

What can you do?

Another major issue an executive charged with delivering a workable BC or DR plan needs to be aware of is their organisation's ability to deliver on that plan - something which is easier said than done.

"The problem that we're all facing is that everybody built their BC planning for DR, but they can't know what they've invested in is 100% reliable. There is nothing 100% reliable, at least going up after 99%.

But they'll only be aware of any issues when the disaster occurs - and then they'll see that they're not totally covered for business continuity. They may be providing DR - but it's much harder to provide BC," says Tony Achkar, general manager of Sybase Middle East.

Being realistic also extends to planning for the disasters that might strike - or, as BT's Lubich puts it, not planning: "A major mistake that I see many companies making is very extensive scenario planning.

You can spend man-years on 20 or 30 scenarios, none of which will happen to you in that form - it's the next one, that you didn't think of, that will happen next.

You can also look at the amount of time and effort that goes into that kind of planning, and start asking if it still makes sense, or can we reuse those resources to do proper bottom-up planning if we can't do it top-down."

Instead of detailed plans for lots of scenarios, Lubich recommends planning for eight or nine broader scenarios, based on past "near misses" the company has had - however uncomfortable it may be to revisit them. He also recommends testing out these scenarios regularly, but adding an unexpected twist each time.
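
In that spirit, the drill rotation itself can stay simple. The sketch below is a toy illustration - the scenario and twist lists are invented examples, not Lubich's own - of picking the next test at random so the twist stays unexpected:

```python
import random

# Hypothetical lists, in the spirit of Lubich's advice: a handful of broad
# scenarios drawn from past near misses, each drill run with a fresh twist.
scenarios = ["site power loss", "ransomware on the file servers", "WAN outage",
             "key supplier failure", "flooded server room"]
twists = ["backup operator unreachable", "DR site at half capacity",
          "phone system down as well"]

# Pick the next drill at random so teams cannot rehearse to the script.
drill, twist = random.choice(scenarios), random.choice(twists)
print(f"Next BC drill: {drill} - with a twist: {twist}")
```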

Banking the data

Habib Bank AG Zurich (HBZ) has implemented a new disaster recovery system from Sybase, which mirrors the bank's data from across its global branches to a disaster recovery site in Switzerland.

"Any business, especially banks, want to ensure 100% availability of their data in their systems. Habib Bank wanted to replicate data to Switzerland and other territories as well, to provide complete business continuity - this is for the database application, not the whole bank," explains Sybase's Achkar.

"Habib Bank decided to go for an early-adopter product - Mirror Activator - which includes replication capability as well, and provides near-real-time availability of the systems in the disaster recovery site, and reduces bandwidth - incredibly important for any business. So only the changes in the data are transmitted within the disk mirroring solution," he adds.

"In the event of failure, our data is very important - in the event of failure, all our data has to be mirrored in a remote site. Zero loss - everything has to be there. Sybase Mirror Activator was a perfect choice because it easily satisfies our disaster recovery and back-up needs.

It meets all our requirements regarding enterprise data replication and guarantees that data at the disaster recovery site is real-time, synchronous, integrated and always available," says Haja Alavudeen, executive vice president and head of IT at HBZ.
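
The article does not describe Mirror Activator's internals, but Achkar's point about transmitting only the changes is the essence of any delta-based replication scheme. The sketch below is a generic, hypothetical illustration of block-level change detection, not Sybase's actual mechanism:

```python
import hashlib

BLOCK_SIZE = 4096  # bytes per block; an arbitrary choice for this sketch

def block_hashes(volume: bytes) -> list:
    """Hash each fixed-size block of a volume."""
    return [hashlib.sha256(volume[i:i + BLOCK_SIZE]).digest()
            for i in range(0, len(volume), BLOCK_SIZE)]

def changed_blocks(primary: bytes, dr_hashes: list) -> dict:
    """Return only the blocks that no longer match the hashes last
    confirmed at the disaster recovery site."""
    deltas = {}
    for idx, digest in enumerate(block_hashes(primary)):
        if idx >= len(dr_hashes) or digest != dr_hashes[idx]:
            deltas[idx] = primary[idx * BLOCK_SIZE:(idx + 1) * BLOCK_SIZE]
    return deltas

# Usage: ship `deltas` across the WAN instead of the whole volume.
primary_volume = bytearray(BLOCK_SIZE * 8)       # an 8-block toy volume
dr_hashes = block_hashes(bytes(primary_volume))  # DR site starts in sync
primary_volume[5000] = 0xFF                      # one small write...
deltas = changed_blocks(bytes(primary_volume), dr_hashes)
print(f"{len(deltas)} of 8 blocks need to cross the link")  # -> 1 of 8
```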

Who can help?

Especially if BC planning has been left in the lap of the IT department, getting business stakeholders involved is critical to success - but paradoxically difficult to do. While, as with other security issues, executives may be tempted to use scare tactics, these may not be as effective as they were in the past.

"Scare tactics were the most commonly-used tactics five or six years ago, and we've come to the point where most people are scared enough, so it doesn't really help to scare them more - they just drop the topic altogether.

The ones who want to know, they know - and the others, they can't be scared by anything, obviously. So scaring people doesn't work for security, it doesn't work for BC," says Lubich.

"Now it's basically either showing them the value of compliance, or it's a financially-driven discussion, where you say: ‘You can invest in these countermeasures, or we buy proper insurance, or we'll just cover it from our own expenses if it happens, because we decided to do so.'

The most important thing is to find somebody in the company who takes responsibility for the topic, and brings it up to decision level," he adds.
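
That "financially-driven discussion" boils down to comparing expected annual costs. A toy worked example - every figure below is invented purely for illustration:

```python
# Invented figures, for illustration only: the three options Lubich names,
# compared on expected annual cost.
p_disaster = 0.02                 # assumed chance of a serious outage per year
loss_if_unprotected = 5_000_000   # assumed cost of an unmitigated disaster

countermeasures = 150_000         # annual cost of resilience investment
insurance_premium = 80_000        # annual premium, assuming full cover
self_cover = p_disaster * loss_if_unprotected  # expected cost of absorbing it

for option, cost in [("invest in countermeasures", countermeasures),
                     ("buy proper insurance", insurance_premium),
                     ("cover it from our own expenses", self_cover)]:
    print(f"{option}: expected annual cost ~{cost:,.0f}")
```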

The risk of getting high-level business decision-makers involved is that they may come back and decide they do not want to implement or upgrade a BC plan. In this case, says Lubich, "the only protection that you still need is that you want this in writing, part of the minutes of the board meeting".

Don't panic

On one point everyone agrees: the biggest mistake a business can make with BC planning is not to do it at all. The head-in-the-sand approach carries no benefits - even an analysis that reveals the company is going to suffer badly in the event of a disaster at least allows business decision-makers to factor that into their plans.

At the same time, it is important not to get bogged down with - and concerned by - the large amount of technical jargon around the subject. Although BC plans can be - and grow to be - extraordinarily complex, this doesn't have to be the case for a plan to be effective.

"When people hear disaster recovery they think of a secondary site, replication, terms like recovery point and recovery time objective - but when you start having these discussions with them, it's not that complex; it's more straight-forward than many think.

So my philosophy is to start; at least begin the process," concludes Dajani.
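
For readers meeting the jargon for the first time: a recovery point objective (RPO) caps how much data a business can afford to lose, and a recovery time objective (RTO) caps how long restoration may take. The sketch below uses invented figures to show how simple the underlying arithmetic is:

```python
# Invented figures, for illustration only: checking a backup regime
# against the recovery point/time objectives Dajani refers to.
rpo_hours = 4                 # tolerate at most 4 hours of lost data
rto_hours = 8                 # service must be restored within 8 hours

backup_interval_hours = 6     # backups run every 6 hours
estimated_restore_hours = 5   # time to rebuild from the latest backup

# Worst case, failure strikes just before the next backup runs, so the
# backup interval is the maximum possible data loss.
print("RPO", "met" if backup_interval_hours <= rpo_hours else
      f"missed - up to {backup_interval_hours}h of data could be lost")
print("RTO", "met" if estimated_restore_hours <= rto_hours else "missed")
```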

White House whitewash

Recent disclosures about the Office of the President of the United States of America reveal that even the world's most powerful and (in theory) well-resourced people are not immune to complete and total failure when it comes to data archiving.

A memo from the Congressional Committee on Oversight and Government Reform reveals a catalogue of errors, starting from 2002, when the Bush administration decided to ditch the older Lotus Notes e-mail system and move to Microsoft Exchange, in the process also retiring the Automatic Records Management System - and moving to an "ad-hoc" manual archiving system, called "journaling".

It seems this was only meant to be a temporary fix while a new archiving system was decided on and deployed, according to the memo: "Carlos Solari, who was the Chief Information Officer for the White House at the time, described the journaling process as a ‘temporary' solution, and as a ‘short-term situation' that was not considered by the White House as a ‘good long-term situation'."

Unfortunately, the replacement system - known as ECRMS (electronic communications records management system) - was never actually implemented, despite consultant Booz Allen Hamilton and integrator Unisys being paid to design and create the solution, which was "ready to go live" on 21 August 2006.

The reason given was that the system would take too long to process the backlog of e-mails. The previous CIO - Solari - was reported to be "puzzled" by the decision not to implement ECRMS.

To date the White House is missing large portions of its e-mail records, and the Committee has had to subpoena material to gain access.
