Saving the day

Symantec's Anthony Harrison reveals five strategies for enterprises to achieve cost-effective business continuity and disaster recovery.

By Anthony Harrison | Published January 31, 2009

New technologies are available to help enterprises better achieve business continuity in the chaos and devastation that natural and man-made disasters leave in their wake.

A disaster recovery (DR) strategy can also be an extension of the local high availability (HA) solution the organisation already has in place, and can address causes of downtime, such as user error, that most IT managers rarely think about when devising their HA/DR plan.

With clustering software, administrators can manage both physical and virtual system resources from a single graphical user interface.

Automated solutions for configuration management, clustering, provisioning, and server virtualisation are available now, making secondary datacentres a cost-effective option that can operate in a streamlined, economical manner.

In addition to automating failover and recovery functions, these same tools can help administrators meet stringent system availability requirements by minimising the downtime their systems require.

The following five strategies can help enterprise IT organisations implement robust high availability and disaster recovery strategies with the potential to maximise system availability for day-to-day operations.

Strategy One: Solve problems faster

Traditionally, one of the key challenges in executing timely disaster recovery was a delay in alerting IT staff to an outage, and subsequent problem diagnosis.

The notification and reporting capabilities of advanced clustering technology can pinpoint when an outage occurs and immediately notify administrators of the problem. The clustering technology then takes immediate action, starting up applications at the secondary datacentre and connecting users to it.
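
To make the idea concrete, the Python sketch below shows roughly how such an outage monitor behaves: it probes the application, tolerates brief blips, and alerts the on-call administrator once a real failure is seen. The host names, e-mail addresses and thresholds are placeholders for illustration, not settings taken from any particular clustering product.

import smtplib
import socket
import time
from email.message import EmailMessage

# Hypothetical settings -- replace with values for your own environment.
APP_HOST, APP_PORT = "app-primary.example.com", 8080
ADMIN_EMAIL = "oncall@example.com"
SMTP_HOST = "mail.example.com"
PROBE_INTERVAL = 10          # seconds between health probes
FAILURES_BEFORE_ALERT = 3    # tolerate brief blips before alerting

def service_is_up(host, port, timeout=5):
    """Return True if a TCP connection to the service succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def notify_admin(subject, body):
    """Send a plain e-mail alert to the on-call administrator."""
    msg = EmailMessage()
    msg["From"] = "cluster-monitor@example.com"
    msg["To"] = ADMIN_EMAIL
    msg["Subject"] = subject
    msg.set_content(body)
    with smtplib.SMTP(SMTP_HOST) as smtp:
        smtp.send_message(msg)

def monitor():
    """Probe the application and raise an alert after repeated failures."""
    consecutive_failures = 0
    while True:
        if service_is_up(APP_HOST, APP_PORT):
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures == FAILURES_BEFORE_ALERT:
                notify_admin(
                    "Outage detected on %s" % APP_HOST,
                    "The application has failed %d consecutive health probes; "
                    "automated recovery should now be starting."
                    % consecutive_failures,
                )
        time.sleep(PROBE_INTERVAL)

if __name__ == "__main__":
    monitor()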

Administrators can then use configuration management tools to diagnose the cause of the downtime, such as identifying a change that may have been made by another administrator.

The tools can display the nature and time of the change, speeding problem identification and resolution. When the change is reversed, the normal operating environment can then be restored. With configuration management tools, datacentre administrators can be confident that their systems can prevent similar outages in the future.
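The following sketch, again illustrative rather than any vendor's API, shows the kind of question a configuration management tool answers during diagnosis: given the time of the outage, list the changes recorded shortly beforehand, most recent first. The change records here are hard-coded examples; a real tool would supply them from its own database.

from datetime import datetime, timedelta

# Illustrative change records; a real configuration management tool would
# supply these from its own repository or API.
CHANGE_LOG = [
    {"time": datetime(2009, 1, 30, 2, 15), "admin": "jsmith",
     "item": "/etc/listener.conf", "action": "edited"},
    {"time": datetime(2009, 1, 30, 2, 40), "admin": "akhan",
     "item": "db-patch-11.2", "action": "installed"},
]

def changes_before_outage(outage_time, window_hours=24):
    """Return changes recorded within the window before the outage,
    most recent first -- the usual suspects when diagnosing downtime."""
    window_start = outage_time - timedelta(hours=window_hours)
    recent = [c for c in CHANGE_LOG if window_start <= c["time"] <= outage_time]
    return sorted(recent, key=lambda c: c["time"], reverse=True)

if __name__ == "__main__":
    outage = datetime(2009, 1, 30, 3, 5)
    for change in changes_before_outage(outage):
        print("%s  %-8s %-10s %s" % (change["time"], change["admin"],
                                     change["action"], change["item"]))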

Strategy Two: Automate recovery processes

For many organisations, system recovery is a manual process. It often requires time-consuming troubleshooting to identify and solve the problem, and then administrators must rebuild the infrastructure step-by-step, including restarting servers, installing software, mounting data, starting up and configuring the software, and reconnecting users to the secondary site. Pressure builds on administrators as time, revenue, and customer loyalty slip away, and the potential for human error rises.

An automated approach, such as high availability clustering, eliminates vast amounts of downtime compared to a traditional manual recovery process. If a system fails in the primary datacentre, the software can restart the application automatically on another server.

The administrator may be notified by a text message or an email, and has visibility into problems at all times, but the series of activities required to maintain business continuity is handled by the software with limited action required by IT employees.

If a disaster threatens to cripple an entire datacentre, an automated approach can eliminate human error and reduce downtime by triggering failover of the critical applications to the secondary site. The failover solution should determine which replicated data the application needs to continue operations. Then a single click starts an automated procedure that restarts the application and connects users to the secondary site.
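
A simplified version of that single-click procedure might look like the sketch below. The helper functions are placeholders for whatever replication, clustering and DNS or load-balancer tooling is in use; the point is the order of operations: choose the consistent copy of the data, restart the application at the secondary site, then reconnect users.

# A simplified failover runbook expressed as code. The helpers are
# placeholders and do not refer to a specific vendor product.

def latest_consistent_replica(app_name):
    """Ask the replication layer which copy of the data is complete and
    consistent enough for the application to restart from."""
    # Placeholder: a real implementation would query the replication software.
    return "/replicas/%s/latest" % app_name

def start_application(app_name, site, data_path):
    """Bring the application up at the named site against the chosen data."""
    print("Starting %s at %s using %s" % (app_name, site, data_path))

def redirect_users(app_name, site):
    """Repoint users at the secondary site, e.g. by updating DNS or a
    load-balancer entry."""
    print("Redirecting users of %s to %s" % (app_name, site))

def fail_over(app_name, secondary_site="dr-site"):
    """The single-click procedure: pick the data, restart, reconnect users."""
    data = latest_consistent_replica(app_name)
    start_application(app_name, secondary_site, data)
    redirect_users(app_name, secondary_site)

if __name__ == "__main__":
    fail_over("order-entry")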

Automated failover also addresses a common weakness in many disaster recovery plans - the assumption that key employees will be available to physically enter the datacentre and manually restart applications. If the employees are unavailable, business continuity suffers. Automation helps reduce this potential point of system failure.

Strategy Three: Test your DR plan

Recent studies have shown that few companies test their DR plans on a regular basis, and as a result, most companies have little faith that their DR plans will work when needed. Companies have been reluctant to conduct DR testing because it often involves bringing down production systems, mobilising a large segment of the workforce and taking them away from more urgent projects, and forcing employees to work at inconvenient hours such as nights or weekends.

With automated failover capabilities, IT organisations can test recovery procedures using a copy of the production data - without interrupting production, corrupting the data, or risking problems upon restarting a production application.

This capability means that tests can be run during business hours instead of over the weekend, reducing staff overtime. As an added benefit, automated tests run during peak production periods can closely approximate the conditions that would occur during a true failover.
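
The outline below sketches such a fire drill, assuming the replication layer can hand out a writable point-in-time copy of the data. Every helper function is a stand-in for site-specific tooling; what matters is that the test instance runs against the copy, so production is never interrupted.

# A sketch of a non-disruptive DR fire drill: the test runs against a
# point-in-time copy of the production data on standby hardware, so the
# production application is never touched.

import datetime

def snapshot_replica(app_name):
    """Create a writable point-in-time copy of the replicated data."""
    return "/snapshots/%s-%s" % (app_name, datetime.date.today())

def start_test_instance(app_name, data_path):
    print("Starting test copy of %s against %s" % (app_name, data_path))

def smoke_test(app_name):
    """Run basic checks -- can we log in, read and write test records?"""
    print("Running smoke tests against test copy of %s" % app_name)
    return True

def tear_down(app_name, data_path):
    print("Stopping test copy of %s and discarding %s" % (app_name, data_path))

def fire_drill(app_name):
    data = snapshot_replica(app_name)
    start_test_instance(app_name, data)
    passed = smoke_test(app_name)
    tear_down(app_name, data)
    print("DR test for %s: %s" % (app_name, "PASSED" if passed else "FAILED"))

if __name__ == "__main__":
    fire_drill("order-entry")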

Configuration management tools can also give IT managers more confidence that their DR plans will work by ensuring that servers at DR sites are consistent with those in production. Server builds change over time as patches are implemented or as application dependencies change, and this drift can prevent clustered servers from working properly, as stand-by servers may not have received the latest patch or configuration updates.

The latest configuration management tools can run consistency checks that will alert administrators that servers have drifted from the standard build. Action can then be taken to make the appropriate changes and ensure that HA/DR technology will work when called upon.
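
A consistency check of this kind can be as simple as comparing each stand-by server's patch and package inventory against the production standard build and reporting the differences, as in the sketch below. The inventories are hard-coded here for illustration; a configuration management tool would collect them from the live servers.

# Minimal drift check: compare each stand-by server's inventory against the
# production "standard build" and report anything missing or unexpected.

STANDARD_BUILD = {"kernel-2.6.18-92", "openssl-0.9.8e-7", "app-agent-5.1"}

STANDBY_SERVERS = {
    "dr-node-1": {"kernel-2.6.18-92", "openssl-0.9.8e-7", "app-agent-5.1"},
    "dr-node-2": {"kernel-2.6.18-92", "openssl-0.9.8b-3", "app-agent-5.0"},
}

def report_drift(standard, servers):
    """Print the packages each server is missing or has at the wrong level."""
    for name, installed in sorted(servers.items()):
        missing = standard - installed
        extra = installed - standard
        if not missing and not extra:
            print("%s: consistent with standard build" % name)
        else:
            print("%s: DRIFT detected" % name)
            for pkg in sorted(missing):
                print("    missing: %s" % pkg)
            for pkg in sorted(extra):
                print("    unexpected: %s" % pkg)

if __name__ == "__main__":
    report_drift(STANDARD_BUILD, STANDBY_SERVERS)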

Strategy Four: Extract value from secondary sites

For most enterprise IT organisations, secondary sites are viewed strictly as cost centres, sitting idle much of the time. New advances in server provisioning software allow more value to be extracted from secondary sites, enabling them to be used for test development, quality assurance, or even less critical applications.

If the primary datacentre goes down, administrators can use provisioning software to automatically reprovision server resources to match the production environment.
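
The sketch below illustrates that re-purposing step: secondary servers that normally run test and QA workloads have those workloads stopped and the production roles applied instead. The server names, role names and helper functions are hypothetical, standing in for whatever provisioning software is in place.

# Re-purposing secondary-site servers during a disaster: stop the
# non-critical workloads and apply the production roles instead.

SECONDARY_SERVERS = {
    "dr-node-1": "qa-web",
    "dr-node-2": "test-database",
    "dr-node-3": "idle",
}

PRODUCTION_ROLES = ["prod-web", "prod-database", "prod-app"]

def stop_workload(server, role):
    print("Stopping %s on %s" % (role, server))

def apply_role(server, role):
    print("Provisioning %s as %s" % (server, role))

def reprovision_for_disaster(servers, roles):
    """Replace the non-critical workloads with the production configuration."""
    for (server, current_role), new_role in zip(sorted(servers.items()), roles):
        if current_role != "idle":
            stop_workload(server, current_role)
        apply_role(server, new_role)

if __name__ == "__main__":
    reprovision_for_disaster(SECONDARY_SERVERS, PRODUCTION_ROLES)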

Advanced clustering software also removes the costly traditional requirement that applications fail over to hardware identical to that running the production applications. The most sophisticated clustering software permits failover between different storage and server hardware, whether within a datacentre or at remote sites.

With the flexibility to dynamically reconfigure and reallocate resources, the secondary site becomes a resource that can be used for multiple purposes the majority of the time, but can be quickly reverted to its backup designation when needed. This underscores the value a secondary datacentre can deliver.

Strategy Five: Achieve high availability and concurrent disaster recovery in virtual environments

Server virtualisation has become mainstream technology in today's server-centric datacentre. Server virtualisation employs virtual machine technology that allows multiple operating systems to be run on a single server, each functioning independently of the others with its own operating system.

Restarting virtual servers at secondary sites has traditionally been a manual process, requiring personnel who may not be available during an actual disaster. New clustering software allows companies to deploy server virtualisation technology and receive the same automated disaster recovery benefits they can expect in their physical server environments.

Furthermore, new high availability and disaster recovery tools are available that reduce the complexity of protecting and managing both physical and virtual server environments. With clustering software, administrators can fail over applications from physical servers to virtual servers, and manage physical and virtual resources from a single graphical user interface.
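
The sketch below captures the idea behind that single interface: physical servers and virtual machines expose the same start and stop operations, so one failover routine can move an application between them regardless of type. The classes are illustrative only, not a vendor API.

# Physical and virtual nodes behind one interface, so the same failover
# routine works for both.

class Node:
    def __init__(self, name):
        self.name = name

    def start(self, app):
        raise NotImplementedError

    def stop(self, app):
        raise NotImplementedError

class PhysicalServer(Node):
    def start(self, app):
        print("Starting %s on physical server %s" % (app, self.name))

    def stop(self, app):
        print("Stopping %s on physical server %s" % (app, self.name))

class VirtualMachine(Node):
    def start(self, app):
        print("Starting %s inside virtual machine %s" % (app, self.name))

    def stop(self, app):
        print("Stopping %s inside virtual machine %s" % (app, self.name))

def fail_over(app, source, target):
    """Move an application between any two nodes, physical or virtual."""
    source.stop(app)
    target.start(app)

if __name__ == "__main__":
    fail_over("payroll", PhysicalServer("prod-01"), VirtualMachine("dr-vm-07"))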

The result is that, through effective management of physical and virtual servers, hardware costs can be significantly reduced.

Anthony Harrison is the manager for systems engineering at software specialists Symantec.
