FEATURE
SSD vs HDD – data recovery and destruction

Robert Winter, Kroll Ontrack
A great deal has been written recently about Solid State Drives (SSDs) and their role in enterprise storage, including several comparisons of solid state drives and mechanical drives in RAID arrays for enterprise applications. While most articles address the key areas of comparison – cost, performance, capacity, power, cooling and reliability – they often neglect the critical areas of data recovery, data destruction and asset disposal.

Let's tackle data recovery first. To understand how a choice of storage can affect the recoverability of data in the event something happens to that storage (and any backups), it pays to take a closer look at how the data from a RAID array is written to physical media.

With solid state disks, data passes through the RAID controller to the individual SSDs that make up the array. As the data reaches the individual drives, it is passed to another specialised controller, called a wear-levelling controller. The wear-levelling controller then determines to which NAND chip, and to which block inside that chip, the data is electronically written. The location of the data on the NAND chips changes constantly to help protect the NAND chips from wearing out.
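The wear-levelling layer is opaque to the host, but its core idea can be sketched. The following hypothetical Python fragment (a toy model, not any vendor's firmware) tracks an erase count per block and steers each write to the least-worn block; note how the same logical block keeps landing in different physical locations:

```python
# Hypothetical sketch of wear-levelling block selection. Real SSD firmware
# is far more complex; this only illustrates that writes are steered toward
# the least-worn erase blocks, so the same logical block keeps moving.

class WearLeveller:
    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks   # one counter per erase block
        self.mapping = {}                      # logical block -> physical block

    def write(self, logical_block, data):
        # Choose the physical block with the fewest erases so far.
        physical = min(range(len(self.erase_counts)),
                       key=self.erase_counts.__getitem__)
        self.erase_counts[physical] += 1       # block is erased before reuse
        self.mapping[logical_block] = physical
        return physical

wl = WearLeveller(num_blocks=4)
print([wl.write(10, b"x") for _ in range(8)])  # e.g. [0, 1, 2, 3, 0, 1, 2, 3]
```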
Mechanical writing

With mechanical disks, the data is passed from the RAID controller to the individual disks. The data is then magnetically written by the read/write head to the platters in the drives as bits. It is important to note that the data is written in a very specific pattern on the platters: specific bits of data are stored in consistent locations. As an example, when block 10 is written to the platter, barring any defects, the block stays in the same location on the disk platters. The data can then be read from the platter by going back to the same location on the platter and reading the magnetic orientation of the bit stored there. When changes are made to the
data, the orientation of the bit may change, but its location on the platter does not change.

In short, the major differences between the media types are:
• Electronic vs magnetic writes.
• Static vs dynamic storage locations.

Most established data recovery specialists have had years to perfect their data recovery techniques for mechanical drives and some have very sophisticated methods for dealing with RAID controllers. Parts can be replaced and media damage can be overcome to get access to the raw data (basically creating an image of the data on the physical disk). Once the raw data is recovered, software can be used to virtualise the RAID controller: the data recovery specialist can virtually reassemble the array, then the logical volume can be rebuilt, errors can be corrected and the data can be recovered. Another difference to note is that individual disk failures on mechanical drives are often predictable, so data loss can be prevented or minimised.

SSD is a newer technology and very few data recovery specialists have the ability to handle both the RAID and SSD layers in order to put the data together in the event of a failure. In some cases, parts can be replaced to overcome failures. To overcome media damage, however, the NAND chips often need to be removed and imaged independently. The raw data bits then need to be reassembled into a usable format, which is much more challenging than simply imaging the disk by overcoming physical/electrical issues or media corruption, as you would
find in mechanical drives. Once that is complete, the RAID is reassembled, the logical volume is rebuilt, any damage is repaired and the data is recovered. With SSDs, moreover, individual disk failures are often unpredictable, so special care needs to be taken to prevent data loss.
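Part of what makes the virtual reassembly described above tractable is that common RAID levels are arithmetically simple. As a hedged illustration (a toy model, not any recovery lab's actual tooling), RAID 5 keeps an XOR parity strip, so one missing member can be regenerated from the survivors:

```python
# Toy model of why a lost RAID 5 member can be rebuilt: the parity strip is
# the XOR of the data strips, so XOR-ing the survivors regenerates the loss.
# A conceptual sketch, not a recovery tool.

def xor_strips(strips):
    out = bytearray(len(strips[0]))
    for strip in strips:
        for i, byte in enumerate(strip):
            out[i] ^= byte
    return bytes(out)

d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_strips([d0, d1, d2])

# Suppose the drive holding d1 fails: rebuild it from the survivors plus parity.
rebuilt = xor_strips([d0, d2, parity])
assert rebuilt == d1
print("rebuilt strip:", rebuilt)
```

On an SSD array, this step only becomes possible after the far harder job of reconstructing each member's contents from its NAND chips.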
Data destruction

Beyond data recovery, a further issue to consider when introducing SSDs is data destruction, which is also more complex in the SSD environment. A good data destruction policy recognises that different types of media require different disposal policies. To understand the differences between HDD and SSD data destruction, we need to look at how data is written to the media, then examine the types of data destruction available and how effective each method is for each type of media.

Data is stored magnetically on traditional hard disk drives (HDDs). As the read/write heads pass over the magnetic substrate, bits of data are magnetically aligned and oriented in such a way that they can be interpreted as 0s and 1s (binary data). These bits are grouped into bytes, which are in turn grouped into what is traditionally referred to as a sector (usually 512 bytes of data).

In SSDs, data is written electronically, not magnetically. The data is stored in pages that vary in size from SSD to SSD. These pages are grouped into erasure blocks, which are then zoned together based on the physical address in the flash chip. Data is not written to the pages sequentially – rather
the data is striped across the erasure blocks and is managed by the wear-levelling controller. When the data stored on the disk is modified, the wear-levelling controller moves the entire block to a new location and schedules the original block for erasure. In short, the user has no control over where the data is written, and updates to files will more often than not end up in new locations on the media.
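A simplified model of this behaviour, assuming a page-mapped flash translation layer (real controllers differ in the details), looks like this:

```python
# Simplified, hypothetical flash translation layer: every update of a
# logical page lands in a fresh physical page; the old copy is only marked
# stale and scheduled for erasure, never overwritten in place.

class SimpleFTL:
    def __init__(self):
        self.next_free = 0
        self.map = {}          # logical page -> current physical page
        self.stale = []        # superseded pages awaiting garbage collection

    def write(self, logical_page, data):
        if logical_page in self.map:
            self.stale.append(self.map[logical_page])  # old data stays on flash
        self.map[logical_page] = self.next_free
        self.next_free += 1

ftl = SimpleFTL()
ftl.write(5, b"version 1")
ftl.write(5, b"version 2")      # update lands in a new physical page
print(ftl.map[5], ftl.stale)    # -> 1 [0]
```

Note that the superseded copy remains physically present until garbage collection erases it, which is one reason host-side overwriting cannot be trusted to destroy data on an SSD.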
Erasure methods

With this basic understanding of the way data is written, we can now look at the different erasure methods and their impact on both HDDs and SSDs. Data destruction can be categorised into three methods:
• Software-based data erasure.
• Hardware-based degaussing.
• Physical media destruction.

Software-based erasure has been around for a long time and has become more accepted as a method for data destruction as more and more data erasure standards are created and adopted. Developed for HDDs, this method traditionally writes a pattern of data to each sector of the disk in a sequential manner, overwriting the original data and making it unrecoverable while leaving the HDD functional. This makes software erasure a viable solution for HDDs that you want to reuse; a short sketch of the approach appears at the end of this section.

For media that stores data the way an SSD does, this is not a good method of data destruction. The erasure software cannot control the specific region the data is written to, as this is controlled by the wear-levelling controller. Arguments have been made that using the TRIM command (used to inform an SSD which blocks of data are no longer in use), or other commands built into the SSD, will ensure that a secure erasure can be performed, but research has shown that these methods are not always successful in removing the data from the drive. So while software erasure is a good solution for HDDs, it does not yet seem to be the right solution for SSDs.

Hardware-based degaussing has gained traction in recent years as an
alternative to software erasure. The prices of degaussers have dropped and the physical units have become better at destroying media. A degausser works by sending a magnetic pulse through the media. For HDDs, this is a very quick solution that reorients the bits on the disk, destroying the user data and in most cases rendering the HDD inoperable. For SSDs, it is not an effective solution, as the data is not written magnetically but stored electronically.

The best way to destroy data on both HDDs and SSDs is physical media destruction, which typically involves shredding the media. As long as the process shreds the SSD into pieces small enough that no single chip can escape damage, this is the ultimate data destruction method. Care should be taken, however, to make sure that no loose chips end up untouched in the shredded mass: if a chip is not damaged by the shredding process, it would be possible to recover data from it.

It is important to keep in mind how data is written to different types of media when developing your data destruction and asset disposal plans. Not all erasure and destruction services work with all of the different types of media.
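To make the sequential-overwrite method described above concrete, here is a minimal sketch, assuming a scratch file standing in for a raw block device; it illustrates the technique and is not a certified erasure tool:

```python
# Sketch of HDD-style software erasure: write a fixed single-byte pattern
# over every sector in sequence. On an HDD this replaces the data in place;
# on an SSD the wear-levelling controller may redirect the writes elsewhere,
# which is why this technique is not trusted for SSDs. A regular file stands
# in for a raw block device (e.g. /dev/sdX) here.

SECTOR = 512

def overwrite_sequential(path, pattern=b"\x00", passes=1):
    with open(path, "r+b") as dev:
        dev.seek(0, 2)                     # find device/file size
        size = dev.tell()
        for _ in range(passes):
            dev.seek(0)
            for offset in range(0, size, SECTOR):
                dev.write(pattern * min(SECTOR, size - offset))
            dev.flush()

# Demonstration against a scratch file:
with open("fake_disk.img", "wb") as f:
    f.write(b"sensitive data " * 100)
overwrite_sequential("fake_disk.img", pattern=b"\xff", passes=3)
```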
Planning for a data disaster

There are a couple of areas that can make or break an organisation in the event of a data disaster, for both HDDs and SSDs. Studies have suggested that data loss costs companies more than $18bn per year, and that 50% of companies that suffer an outage lasting 10 days or more will go out of business within five years, with 70% of them closing within the first 12 months. Understanding storage and taking steps to minimise the impact of a data disaster can greatly improve an organisation's chances of surviving.

Clearly, a good disaster recovery plan is essential in an emergency. A robust plan includes documentation of all of the assets in the datacentres and how systems are configured (including storage,
servers and networking), physical maps of the datacentre, contacts and the plan for recovery, including contact trees, equipment lists, vendor contacts and priority lists.

A natural disaster may make it impossible to access a copy of the plan on a server, so be mindful of keeping only one copy, especially if it is electronic. Multiple copies, both electronic and physical, are recommended. There are free templates on the web for creating a plan, and there are companies that will do an on-site assessment, map all of your resources and help you create a disaster recovery plan tailored to your specific needs.

Every good disaster recovery plan includes a data recovery company that can assist as needed. It is a good idea to work with the recovery company ahead of time and execute all necessary agreements before a disaster occurs, so time isn't wasted when data needs to be recovered.
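Returning to the documentation itself: as one hypothetical illustration (the field names here are invented for this sketch, not a standard), even a simple machine-readable inventory kept alongside the printed plan makes assets, contacts and restore priorities explicit:

```python
# Hypothetical skeleton of the asset and contact documentation a disaster
# recovery plan might carry. Field names and values are illustrative only.

dr_plan = {
    "assets": [
        {"name": "san-01", "role": "primary storage",
         "config": "RAID 6, 12 x 4TB", "priority": 1},
        {"name": "web-01", "role": "front-end server",
         "config": "2 x SSD, RAID 1", "priority": 2},
    ],
    "contacts": {
        "storage_vendor": "vendor support line",
        "data_recovery_provider": "agreed provider, contract pre-signed",
        "escalation_tree": ["ops on-call", "IT manager", "CIO"],
    },
    "plan_copies": ["printed binder (off-site)", "encrypted USB", "cloud share"],
}

# Restore order falls out of the priority field.
for asset in sorted(dr_plan["assets"], key=lambda a: a["priority"]):
    print(f"restore {asset['name']} ({asset['role']})")
```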
Understanding your storage

It is important to understand your storage and how the data is laid out before a disaster hits. Find a storage vendor that will sit down and walk through how its storage works, including what type of RAID and file system is being used. As an example, RAID 0 will be faster than RAID 5 but does not include any redundancy, and while RAID 6 allows for a two-drive failure, performance may suffer (see the capacity sketch at the end of this section). It is important to get the right blend of performance and protection in the storage you select.

One of the current trends in storage is the dynamic allocation of volumes, where disks are put together in a common pool and then carved into volumes as needed. While this makes the allocation of storage convenient for storage administrators, it can make data recovery very complicated, and in some cases impossible, if the maps that allocate the volumes are damaged or overwritten.

Another area to consider is whether storage uses traditional, solid state or some hybrid combination of SSDs and HDDs. Storage vendors should treat SSD
and HDD differently to maximise the longevity of the storage, minimise the impact of failures (NAND flash fails more frequently) and maximise performance. It pays to investigate all of the options before choosing a storage vendor.

It's also important to take care when choosing the operating system and file system that are right for both your storage and your data. Corrupted volumes are often the result of a poor match between the file system selected and the data being written to the disk. Make sure to ask the storage vendor which operating systems have been tested and which are preferred on a particular storage platform. Be wary of proprietary solutions that lack documentation or real-world testing. Many vendors allow you to test a system before purchase, so dedicate time and resources to this test phase to ensure you select the proper storage for your environment.
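The RAID trade-offs mentioned earlier reduce to simple arithmetic. As a rough sketch, assuming n identical drives and ignoring controller overhead and hot spares:

```python
# Back-of-envelope usable capacity and failure tolerance for the RAID levels
# mentioned above, assuming n identical drives. Standard formulas only; real
# arrays add hot spares, controller overhead and so on.

def raid_summary(level, n_drives, drive_tb):
    if level == 0:
        return n_drives * drive_tb, 0        # striping: no redundancy
    if level == 5:
        return (n_drives - 1) * drive_tb, 1  # one parity drive's worth
    if level == 6:
        return (n_drives - 2) * drive_tb, 2  # two parity drives' worth
    raise ValueError("only RAID 0/5/6 sketched here")

for level in (0, 5, 6):
    usable, tolerated = raid_summary(level, n_drives=8, drive_tb=4)
    print(f"RAID {level}: {usable} TB usable, survives {tolerated} failure(s)")
```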
Backup plan

Backups are often one of the most overlooked areas in a disaster recovery plan. We often hear from potential data recovery customers that they back up their data. When we ask if they have tested the restore of their backups, the answer is often "no", or a look of confusion.

One of the best programmes seen recently for testing backups is a company that included restore as part of the compensation plan for its employees. The system admins had to do a bare-metal restore of all of the critical systems in a lab environment on a quarterly basis. Failure to fully restore the system was cause for a verbal warning, a second failure resulted in a written warning, and a third failure in a calendar year was grounds for termination. Successful restores were rewarded with a quarterly bonus, and if all four quarters in the year were successful, an additional annual bonus was given.

A good way to avoid a complete data disaster is: first, make sure the backup solution is actually backing up the data that needs to be protected; and second, make sure that the data that was backed up can be restored (a verification sketch follows at the end of this section). Finally, backups should be stored off-site to guard against natural disaster, fire, flood and so on. Different backup vendors approach the backup and restore tasks differently, so make sure to try out a backup solution before you buy, to confirm that it meets your individual needs.

Datacentre organisation and design is such a huge area (including power, cooling, physical layout, density, security and so on), and so much has been written about it by specialists, that it is not possible to cover it fully here. However, it is worth noting that poor design and documentation can lead to much larger issues in the event of a disaster. A datacentre audit will help identify strengths and weaknesses in the current setup and catch little problems before they become much bigger challenges.
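Returning to the point about testing restores: the check lends itself to automation. As a minimal sketch (the paths here are illustrative), one can hash the protected tree and compare it against a completed test restore:

```python
# Sketch of automated restore verification: hash every file in the protected
# tree and confirm the test restore reproduces identical hashes. Paths are
# illustrative; a real test would also cover permissions, databases, etc.

import hashlib
from pathlib import Path

def tree_hashes(root):
    root = Path(root)
    return {str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
            for p in sorted(root.rglob("*")) if p.is_file()}

def verify_restore(source_root, restored_root):
    src, rst = tree_hashes(source_root), tree_hashes(restored_root)
    missing = sorted(src.keys() - rst.keys())
    corrupt = sorted(f for f in src.keys() & rst.keys() if src[f] != rst[f])
    return missing, corrupt

missing, corrupt = verify_restore("/data/critical", "/mnt/test-restore")
print("missing:", missing or "none")
print("corrupt:", corrupt or "none")
```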
Conclusion

When evaluating solid state against traditional hard disks, it's important to take into consideration the possibility of data loss and the need for data recovery. It could mean the difference between meeting the requirements of enterprise applications and facing disaster.
About the author

As chief engineer, Robert Winter is responsible for all operations within the area of disaster recovery in the Kroll Ontrack labs, based at the UK headquarters in Epsom. Winter started his career in the aerospace industry as a design engineer, later moving on to become a risk analysis specialist for aircraft. He joined Kroll Ontrack in 1995 as a lab supervisor, specialising in tape recovery and computer forensics before becoming manager of engineering for data recovery. He is in charge of the day-to-day operations within the labs while also looking after the data recovery jobs that arrive in the Madrid and Paris offices. Kroll Ontrack (www.krollontrack.co.uk) provides data recovery, data backup, data destruction, electronic discovery and document review products and services.
The holistic approach to security

Peter Bassill, Hedgehog Security
When evaluating the risk of a data or system security breach, it's essential to address the consequences in a comprehensive way. One of the most significant threats is the devaluation of the business itself and the impact on the company's reputation. The data breach or technical compromise is actually only the mechanism through which the threat is realised.

No-one is exempt from the risk posed by data loss. There have been numerous local authorities that have suffered very publicly over the past couple of years.
The UK Government recently confessed that UK utilities have suffered multiple attacks this year. In the private sector Sony, AT&T and email marketing provider
Epsilon have all been the victims of data breaches. And experience shows that SMEs, small businesses and even micro businesses can all be victims as well. Furthermore, there are also many, many incidents of misreporting. For instance, Amazon was said to have lost data as the result of some form of