Disk can be integrated into a data protection infrastructure in two basic ways: disk-as-tape and disk-as-disk.
Disk as tape
Disk-as-tape approaches make disk devices appear to be tape devices to the outside world using a virtual software layer. Backup software can use these "virtual tape" targets just as if they were real tape devices, but they have all the performance and reliability characteristics of disk.
The biggest advantage of virtual tape libraries (VTLs) from vendors such as Data Domain, FalconStor Software, Hewlett-Packard (HP) Co., IBM Corp., Quantum Corp. and Sun Microsystems Inc. is that they integrate easily into tape-based data protection infrastructures with minimal disruption to existing processes. Because the disk appears as tape, it does not need to be provisioned like disk, saving significant administrative overhead. All of the backup software's catalog and tape management functionality can still be used to manage the backups as they move through the data protection hierarchy. The data can still be offloaded to physical tape for disaster recovery purposes if desired.
If at least the most recent backup is always retained on disk, you get the benefits of disk for faster, more reliable restores. VTLs also make it easier to share tape devices among multiple servers and applications. You don't need special tape device sharing support in the backup software: you can partition the VTL into multiple smaller VTLs, assign a certain number of virtual cartridges to each one and associate each with a different backup server. Many VTLs also support replication, providing a fast, secure way to move backup data between sites without the risk or time lags of physical transport.
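To make the partitioning idea concrete, here is a minimal Python sketch of carving one pool of virtual cartridges into smaller virtual libraries, one per backup server. It is only an illustration of the bookkeeping involved, not any vendor's interface: the names `VirtualLibrary` and `partition_vtl`, the barcode format and the server names are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualLibrary:
    """One partition of a VTL, presented to a single backup server (hypothetical model)."""
    name: str
    backup_server: str
    cartridges: list = field(default_factory=list)


def partition_vtl(total_cartridges, assignments):
    """Split a pool of virtual cartridges among backup servers.

    assignments maps a backup server name to the number of cartridges it gets.
    Raises ValueError if the servers ask for more cartridges than the pool holds.
    """
    requested = sum(assignments.values())
    if requested > total_cartridges:
        raise ValueError(
            f"requested {requested} cartridges, only {total_cartridges} available"
        )
    libraries = []
    next_id = 0
    for server, count in assignments.items():
        # Each virtual library receives its own non-overlapping barcode range.
        barcodes = [f"VT{n:04d}" for n in range(next_id, next_id + count)]
        libraries.append(VirtualLibrary(f"vtl-{server}", server, barcodes))
        next_id += count
    return libraries


# Hypothetical example: a 100-cartridge VTL shared by two backup servers.
libraries = partition_vtl(100, {"backup-a": 60, "backup-b": 30})
for lib in libraries:
    print(lib.name, lib.backup_server, len(lib.cartridges))
```

Because each backup server sees only its own virtual library and cartridges, no server has to coordinate device access with another, which is the point the partitioning approach relies on.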
Unfortunately, VTLs can add significant costs that vary depending on how your backup software vendor licenses VTLs and prices its products. Typically, features such as storage capacity optimization and replication will add to the cost of the product.
However, storage capacity optimization may reduce the overall cost of VTL-based storage, depending on the data reduction ratios achievable with each vendor's technology. Also, while replication increases costs by requiring a second VTL at the remote site and network connections between sites, it eliminates physical tape transport costs. It also provides a secure transport method and significantly improves recovery point objectives (RPOs) and recovery time objectives (RTOs) in the event of a disaster.
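The trade-off between the added cost of capacity optimization and the data reduction ratio it achieves comes down to simple arithmetic. The Python sketch below works through it under stated assumptions: the dollar figures and the 10:1 ratio are hypothetical placeholders, not vendor pricing, and the function names are invented for this example.

```python
def effective_cost_per_tb(raw_cost_per_tb, optimization_cost_per_tb, reduction_ratio):
    """Cost per TB of backup data stored, after data reduction.

    reduction_ratio is backup data written divided by physical capacity
    consumed, so 10 means 10:1. A ratio of 1 means no reduction at all.
    """
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return (raw_cost_per_tb + optimization_cost_per_tb) / reduction_ratio


def break_even_ratio(raw_cost_per_tb, optimization_cost_per_tb):
    """Reduction ratio at which the optimization feature pays for itself."""
    return (raw_cost_per_tb + optimization_cost_per_tb) / raw_cost_per_tb


# Hypothetical figures: $5,000 per raw TB, plus $2,000 per TB for the feature.
baseline = effective_cost_per_tb(5000, 0, 1)
optimized = effective_cost_per_tb(5000, 2000, 10)
print(baseline, optimized, break_even_ratio(5000, 2000))
```

With these made-up numbers the feature pays for itself at any ratio above 1.4:1. Plug in your own quotes and the ratio you measure on your own data, since achievable reduction varies widely by data type and vendor.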
If your backup process includes copying tapes and ejecting them for offsite physical transport, then the use of a VTL may force some modifications. VTL tapes obviously can't be ejected, but you can solve the transport problem through the use of replication. You can also choose to copy your virtual tapes to physical tapes, and then transport them offsite. Adding a VTL may also introduce an additional vendor and associated support coordination issues, although you can minimize this issue by buying your VTL solution from your existing disk or tape vendor.
Disk as disk
Disk-as-disk approaches, such as disk staging and disk-based backup repositories, present their own set of advantages and disadvantages. Disk staging, often referred to as disk-to-disk-to-tape (D2D2T), assumes that while backups occur directly to disk, they will be migrated to tape soon after they have finished.
The backup software views these disk-based backup "landing pads" as disk, so they must be provisioned as disk. However, if the disk staging area is small, the provisioning overhead can be minimal.
Certain backup products, such as IBM's Tivoli Storage Manager (TSM), assume disk staging as the default. Disk-to-disk (D2D) targets assume that while backups occur directly to disk, they may or may not be migrated to tape.
Most other backup products, including commercial offerings from Atempo Inc., BakBone Software Inc., EMC Corp. and Symantec Corp., as well as open-source offerings from Zmanda Inc., implement a D2D approach.
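The practical difference between disk staging and a D2D target is what happens to a backup once it finishes. The following Python sketch illustrates that difference only; it does not reflect how TSM or any other product is implemented. The `Backup` class, the mode names and the `migrate_in_d2d` parameter are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Backup:
    name: str
    finished: bool
    on_tape: bool = False


def select_for_migration(backups, mode, migrate_in_d2d=frozenset()):
    """Return the names of backups that should be copied to tape now.

    mode "d2d2t": disk is a landing pad, so every finished backup migrates.
    mode "d2d":   migration is optional, so only backups explicitly named
                  in migrate_in_d2d are copied to tape.
    """
    if mode not in ("d2d2t", "d2d"):
        raise ValueError(f"unknown mode: {mode}")
    # Only finished backups that are not already on tape are candidates.
    pending = [b for b in backups if b.finished and not b.on_tape]
    if mode == "d2d2t":
        return [b.name for b in pending]
    return [b.name for b in pending if b.name in migrate_in_d2d]


backups = [Backup("mon", True), Backup("tue", True), Backup("wed", False)]
print(select_for_migration(backups, "d2d2t"))
print(select_for_migration(backups, "d2d", {"mon"}))
print(select_for_migration(backups, "d2d"))
```

In staging mode the disk area can stay small because it drains continuously. In D2D mode the disk has to be sized for whatever you choose to keep there.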
Disk-as-disk targets can be connected to a backup server using DAS, a SAN or NAS. NAS approaches support sharing of disk-as-disk targets among multiple backup servers for better utilization.
On the plus side, when backing up to disk, you get all the performance and reliability advantages of disk for both backup and restore. You also have the option to employ any of the previously referenced disk-based technologies to further your own goals for lowering storage capacity costs, speeding backups and restores and distributing data to remote locations.
However, there are several potential downsides with disk-as-disk approaches. First, they impose on your secondary storage all of the provisioning issues you already deal with on primary storage. You now have to spend time associating volumes with file systems, disks with RAID groups and RAID groups with servers.
SAN-based targets can offer the higher performance of a Fibre Channel-based disk array, whereas NAS-based targets ease some of the provisioning issues by putting all of a backup server's disks into a large, shared file system managed by a NAS filer. Scale-out NAS backup solutions built on clustered file systems, from vendors such as ExaGrid Systems Inc. or Exanet Inc., can further ease provisioning issues by not requiring a volume for each backup server.
The way file systems were designed to operate means that disk-based file system backups will become more fragmented over time. This leads to degraded performance unless regular defragmentation is performed. Also, tape-centric backup software does not always know what to do when a file system on a disk-as-disk target fills up. It often aborts the backup and requires additional management oversight.
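One way to reduce aborted backups on a full target is to check free space before the job starts. The Python sketch below shows such a pre-check using the standard library's `shutil.disk_usage`. It is a minimal illustration, not a feature of any backup product: the function names, the 10% headroom default and the idea of running it as a pre-job script are assumptions.

```python
import shutil


def fits(free_bytes, total_bytes, expected_bytes, headroom=0.10):
    """Return True if the backup fits and leaves `headroom` of capacity free.

    headroom is the fraction of total capacity to keep free after the backup.
    """
    reserve = int(total_bytes * headroom)
    return free_bytes - expected_bytes >= reserve


def has_room_for_backup(target_path, expected_bytes, headroom=0.10):
    """Check the file system holding target_path before starting a backup."""
    usage = shutil.disk_usage(target_path)  # named tuple: total, used, free
    return fits(usage.free, usage.total, expected_bytes, headroom)


# Hypothetical pre-job check for a 50 GB backup to the current directory.
if has_room_for_backup(".", 50 * 1024**3):
    print("enough space, start the backup")
else:
    print("target too full, free space or redirect the job first")
```

The expected backup size has to come from somewhere, such as the size of the previous run. A check like this does not remove the need for oversight, but it turns a mid-job abort into a warning you can act on beforehand.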
Integrating disk into your data protection infrastructure offers clear advantages over tape alone. The easiest way to do this is through the use of VTL technology. If you don't want to incur VTL acquisition costs, other disk-based approaches that employ disk-as-disk targets can be used as well, but they require more administrative sophistication and will incur additional management overhead.
About this author: Eric Burgener is a senior analyst with The Taneja Group. His areas of focus include data protection, disaster recovery, storage capacity optimization and archiving.
This was first published in July 2008