Monday, July 4, 2011

Natural Disaster Recovery Planning from Christchurch Earthquakes

"It was just a little one - only a four". That was the greeting I received as I arrived at Christchurch airport. As I was flying in, an aftershock hit the city - one of hundreds since the quake that shut down New Zealand's second largest and oldest city. When we spoke with locals, there was a stoic pragmatism. The locals aren't giving up on their city but know that road ahead will be difficult. With 1000 buildings likely to be demolished and access to what's left standing limited there are some significant challenges.

But what can a CIO learn from this? We spoke to a number of CIOs and other executives about the recovery after the 22 February quake.

The business run by Motim Technologies CEO Andrew Plimmer was heavily affected. Its main office was in downtown Christchurch, so the first obstacle to any sort of recovery was getting access to the building. "We had 60 minutes to access our building. That's all we had. We had to submit a list of what we were going to get. We were escorted and could only send in two people". Motim's headquarters had been fully refurbished and earthquake strengthened. Despite this, desktop computers were thrown across the office and the contents of bookcases that had been fastened to the wall were strewn across the floor.

To get back to work, Plimmer quickly fitted out rooms at his farm property as an office. Most of his staff work from home but meet at the improvised office a couple of times per week. The loss of the central business district in Christchurch has led to a shortage of office space. However, where companies need office space, the community spirit is stepping up with businesses sharing space.

In the aftermath of the February earthquake, Plimmer was able to contact all of his key customers to assure them that Motim, while affected, had not lost any client data. "All of our systems were operational. We operate systems in the cloud so we didn't lose any data or miss any steps in our projects". Although Motim lost their fleet of desktop PCs they were able to buy replacements and continue working without hurting their clients.

It's important to realise that the 22 February 2011 earthquake that hit Christchurch was an aftershock of a much larger quake. Just five months before, a magnitude 7.1 earthquake hit the city. Unlike the February 2011 aftershock, which occurred in the middle of a weekday, the 4 September 2010 earthquake hit the city at 4.35AM on a Saturday - a time when the CBD was quiet. Although there was substantial damage, the depth of the quake, 10km, meant that the impact on the city was not as great as that of the 22 February 2011 event. However, for some, it served as a wake-up call.

Lee Sinclair, the Client Manager for Technology at the Canterbury Development Council, described the September 2010 quake. "In a lot of ways it was a trial run. It was a Saturday morning. It was disrupting enough but not total demolition. There were some businesses that had to move three or four times after each quake and some of the other aftershocks". The initial quake and the ensuing aftershocks had savvy operators thinking about the changed risk profile of their businesses. CIOs, aware that the playing field had changed, looked to offsite solutions to protect vital data and systems.

The Christchurch City Council had begun the task of migrating systems and data to a new data centre operated by Computer Concepts Limited. CCL was started by Darryl Swann in the mid 1990s. Swann started the business as a contract Unix administrator "feeding and watering" systems. Over time, the business grew and now employs over 140 staff.

CCL opened a new data centre in Christchurch, away from the CBD, in June 2010. "The Christchurch City Council is one of the big customers that started off in the data centre. The data transfers for that migration, for their last application - we'd done a migration over many months - into the new data centre and new environment finished at 2.00AM on the day of the September 2010 earthquake". That was only about two and a half hours before the magnitude 7.1 quake struck at 4.35AM.

What that meant was that the council, while significantly impacted by the February 2011 event, did not lose data or access to systems. The CCL data centre receives power from two separate electricity substations, is highly secured and can operate indefinitely off its own generators as long as they can get access to fuel. Swann told us that they were even planning to secure their own, independent water supply for cooling. Given that Christchurch has now been found to be on the "Rim of Fire" fault-line, access to reliable infrastructure has taken on increased importance.

Other help

New Zealand Trade and Enterprise, the government agency that supports New Zealand companies as they seek to trade overseas, set up a support fund after the February event. Initially seeded with $2M, the fund let businesses travel to offshore clients and reassure them that business wasn't completely shut down. This communication with customers was vital in ensuring that affected businesses would not lose customers during the recovery.

In addition, Microsoft New Zealand was able to leverage the support of experts within Microsoft who have been involved in disaster recovery programs all over the world. Microsoft has a dedicated team focussed on providing support on the ground. There's support for license key recovery so software can be installed on replacement systems, priority support for software installation and server configuration and other services that assist with getting businesses up and running.

The Lessons

For the CIO, there are many lessons to learn.

Firstly, managing risk in your business is critical. When the first quake hit Christchurch, the risk profile for all businesses in Christchurch and the Canterbury region changed. Until then, it was not known that Christchurch was situated in a seismically volatile area. Christchurch City Council reacted by moving systems and data to a less volatile area.

Most of the CIOs we speak to apply robust risk management frameworks, such as AS/NZS ISO 31000:2009, when managing projects and specific activities. However, outside of projects, day-to-day practices such as regularly monitoring and managing operational risks tend to fall by the wayside. Most CIOs run regular meetings with their direct reports. It's time to get risk management back on to the agenda.

A rigorous backup and recovery process is vital. As well as writing, I'm also the manager of a mid-sized business with about 1000 end-point devices. I have a backup system that takes data offsite regularly and maintains local backups as well. However, should a disaster strike - my highest risk is site security related - it's possible that there will be some data loss, depending on the timing of the event.

Until recently, real-time, offsite data replication was out of the reach of most organisations. The proliferation of cloud data storage services and SaaS makes it possible and affordable to suffer zero data loss in the event of a total site loss.
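To make the "depending on the timing of an event" point concrete, here is a minimal sketch of the arithmetic. It is an illustration only: the nightly 1.00AM offsite backup and the midday incident are hypothetical values chosen for the example, and the function names are my own rather than part of any backup product.

```python
from datetime import datetime, timedelta


def data_loss_window(last_offsite_backup: datetime, incident: datetime) -> timedelta:
    """Work created since the last offsite backup is what a total site loss would cost."""
    if incident < last_offsite_backup:
        raise ValueError("incident cannot precede the last offsite backup")
    return incident - last_offsite_backup


def worst_case_loss(backup_interval: timedelta) -> timedelta:
    """With periodic backups, the worst case is an incident just before the next run."""
    return backup_interval


# Hypothetical schedule: offsite backup runs nightly at 1.00AM.
last_backup = datetime(2011, 2, 22, 1, 0)
# Hypothetical incident in the middle of the working day.
incident = datetime(2011, 2, 22, 12, 0)

lost = data_loss_window(last_backup, incident)
print(lost)                                 # 11:00:00
print(worst_case_loss(timedelta(days=1)))   # 1 day, 0:00:00
```

With a nightly schedule the exposure is anywhere up to a full day of work. Continuous replication to an offsite or cloud service shrinks that interval towards zero, which is the difference described above.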

If your business is hit by a significant event you are not always on your own. The earthquake in Christchurch mobilised many levels of aid, both organised and ad hoc. Businesses shared offices, and government agencies provided emergency access to funds and equipment to maintain operations.

Communication is vital. The CIO of today is not a backroom technician. The technology we manage is what keeps our businesses up and running. That means making sure that the business - your internal customers - understand what is going on. I spend a lot of time with my team discussing how we communicate with our stakeholders. Establishing a communications plan as part of your business continuity and disaster recovery planning is critical. Knowing who will be delivering what messages before an incident occurs means that nothing will be missed in the heat of the moment.

Finally - practise. Having established a business continuity and disaster recovery plan, it's important to practise the plan. Set aside time at least once a year, devise a reasonable disaster scenario and practise your recovery processes, communication protocols and incident management strategies. As well as validating the appropriateness of your processes, you'll have had a dress rehearsal so that, if disaster strikes, you and your team will have had practical experience with the plan.


This was first published in June 2011


