BITS AND BYTES
RAMIN KHORASANI, MD, MPH
Business Continuity and Disaster Recovery: PACS as a Case Example

The importance of information technology (IT) in our daily clinical activities continues to grow. Many IT solutions are now mission critical in our practices. An important element of successful IT implementation is planning for the unavoidable downtime, whether planned (eg, for service maintenance) or unplanned (eg, because of virus infection). The failover solution can simply be reverting to the old analog processes. Business continuity (BC) and disaster recovery (DR) [1-3] broadly involve failover to a redundant IT solution to avoid the disruption of workflow and to prevent data loss. This is not always as simple as buying two of everything when you buy a product, as I illustrate with specific examples in this column. Using picture archiving and communication system (PACS) core components (eg, database, archive) as case examples, I describe the concepts as well as general functionality required for BC and DR.

From a regulatory perspective, the Health Insurance Portability and Accountability Act requires that organizations have DR plans for all medical data. If the primary copy of medical data is destroyed or otherwise unavailable, a secure and retrievable second copy (the DR copy) stored in a separate location would need to be available.

Disasters could affect any component of a PACS. Events could disable an individual computer (eg, a workstation in a reading room or a critical server in a computer room), a rack holding many computers, or an entire location (eg, a room or building). Planning for BC and DR should consider every possible failure scenario. Generally, the
level of redundancy needed depends on the perceived critical nature of the IT system in discussion and an organization’s tolerance for system failure (fault tolerance). The lower the fault tolerance, the more resources needed to create a higher degree of redundancy. For example, if a PACS operates at a small imaging center and the organization does not depend on its infrastructure for remote reading or image distribution, reverting to film production may be a realistic failover strategy. In contrast, if a PACS provides access to images to an entire health care enterprise (as it could and indeed should), reverting to film production and distribution can be a painful and disruptive experience, and thus a BC and DR approach would be preferred.

To minimize the possibility of PACS failure in the second example, one must perform a detailed analysis of the PACS and its core components to identify every possible point of failure. For the purposes of this column, I use my organization’s BC and DR approach to address 2 potential PACS core component failure scenarios: PACS database failure and the loss of the short-term archive.

CREATE TWO IDENTICAL SERVERS

A PACS database is an excellent example of a single point of failure in many or most PACS implementations. To mitigate the risks of database failure, we have created 2 identical PACS database servers (replicated) and placed them in 2 separate data centers several miles apart. Every transaction in our
PACS is replicated (identically copied) in near real time in both databases. If for whatever reason one database fails, the other kicks in within seconds or minutes. Operations can thus be restored rapidly, allowing BC.

LOSS OF SHORT-TERM ARCHIVE

An important goal of BC and DR is to prevent data loss. The integrity of medical data in the digital age is critical. When we had a film-based practice, a flood or fire or other natural disaster could destroy the only copies of film we stored. In the digital age, it is possible to keep multiple copies of the same data at multiple locations. At our organization, in addition to the copies of images we keep in short-term PACS storage, we keep 2 other copies at 2 separate data centers in long-term storage (we also mirror or duplicate data at each data center, resulting in 4 image copies in the long-term archive). These redundant data centers are connected with appropriate network bandwidth and security protocols to enable the effective and secure replication of data. The design, implementation, and support of data centers require substantial planning and resources. Such data centers are thus best used as shared resources in a health care delivery system to meet the needs of various mission-critical IT solutions.

The final consideration in this example is the timeliness for reconstitution of the short-term PACS archive by populating it with data from the long-term archive. Expectations and needs
© 2008 American College of Radiology 0091-2182/08/$34.00 ● DOI 10.1016/j.jacr.2007.11.002
must be defined so that an appropriate IT architecture and workflow process can be designed to meet DR performance needs. Thus, DR copies of data are necessary but not sufficient for BC and DR. For example, unless hardware is available for new short-term storage (if it was damaged beyond repair), it will not be possible to recreate a functional PACS.

There are some key points to remember about BC and DR:
1. It is possible to create an IT environment with BC and DR to markedly reduce the possibility of reverting to analog processes. Cost-benefit analysis will determine which critical systems will benefit from a BC and DR approach.
2. Business continuity and DR require substantial detailed analysis to identify important single points of failure and strategies to address them. Skilled IT resources will be needed.
3. Creating redundant data centers capable of supporting BC and DR is a big undertaking. Engaging one’s health care organization in this activity is critical. The same resources will be needed to address BC and DR in other parts of the enterprise.
4. Not all vendors have useful strategies for BC and DR. This should be part of product assessment before purchasing decisions are made, when possible.
5. No matter how sophisticated an approach to BC and DR for a particular system, an analog last-resort process should be in place in case everything else fails. A major virus infection could copy itself across a BC and DR solution, just as medical data can.
6. Create a regular routine to practice BC and DR [4] and the last-resort analog process. The confidence gained in knowing that a failover solution works will pay dividends in the mayhem of major system downtime.
7. Business continuity and DR solutions will cost more than you might think. You hope to never need it, but you will, and only then will you truly appreciate the investment you made!

REFERENCES
1. Smith EM. Storage options for the healthcare enterprise. Radiol Manage 2003;25:26-30.
2. Thornton B, Hinkle R. Disaster recovery planning: preventive medicine for information technology. Patient Acc 1998;21:2-3.
3. Cross MA. Planning for disaster recovery. Health Data Manage 1997;5:106-13.
4. Avrin DE, Andriole KP, Yin L, Gould R, Arenson RL. Simulation of disaster recovery of a picture archiving and communications system using off-site hierarchal storage management. J Digit Imaging 2000;13(suppl):168-70.
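
As a rough illustration of the database failover described under CREATE TWO IDENTICAL SERVERS, the following Python sketch picks the first replicated database site that passes a health check and signals when none is available, which is the cue to fall back to the last-resort analog process. The `DatabaseSite` class, the site names, and the `is_healthy` probes are hypothetical and exist only for this example; the column does not describe how its failover is implemented, and in practice this logic usually lives in the database vendor's replication and clustering features.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class DatabaseSite:
    """One replicated PACS database server (hypothetical, for illustration)."""
    name: str
    is_healthy: Callable[[], bool]  # health probe; stubbed with lambdas below


def choose_active_site(sites: Sequence[DatabaseSite]) -> Optional[DatabaseSite]:
    """Return the first site whose health probe passes, in priority order.

    Returns None when every replica is down, which is the signal to revert
    to the last-resort analog process.
    """
    for site in sites:
        try:
            if site.is_healthy():
                return site
        except Exception:
            # A probe that raises is treated the same as a failed probe.
            continue
    return None


# Simulated outage at the first data center; the second replica takes over.
primary = DatabaseSite("data-center-a", is_healthy=lambda: False)
secondary = DatabaseSite("data-center-b", is_healthy=lambda: True)

active = choose_active_site([primary, secondary])
print(active.name if active else "all replicas down: use analog process")
```

Keeping the "no healthy replica" case explicit (a `None` return rather than an exception) mirrors the point that an analog process should remain available when every redundant system fails.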
Ramin Khorasani, MD, MPH, Department of Radiology and Center for Evidence-Based Imaging, Brigham and Women’s Hospital, Harvard Medical School, 75 Francis Street, Boston, MA 02115; e-mail: [email protected].