Data deduplication is one of the hottest technologies in data backup and recovery. But before you implement a deduplication product, you should consider whether it provides global deduplication.
Global deduplication dedupes data across multiple sites, targets, vaults, etc., each of which has already been deduped locally. In other words, the contents of those locations are rolled up into one larger pool, which is then deduped as a whole. This implies that the ultimate deduplication target must be able to scale up and/or out in both capacity and performance.
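To make the difference concrete, here is a minimal sketch that counts how many chunks must be stored when each site dedupes only against itself versus when all sites are deduped as one pool. The fixed-size chunking, the SHA-256 fingerprints, the function names and the sample site data are all illustrative assumptions, not a description of how any particular product works.

```python
import hashlib


def chunk_fingerprints(data: bytes, chunk_size: int = 4) -> list[str]:
    """Split data into fixed-size chunks and fingerprint each chunk."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def stored_chunks_local(sites: dict[str, bytes], chunk_size: int = 4) -> int:
    """Each site dedupes only against itself; per-site totals are summed."""
    return sum(
        len(set(chunk_fingerprints(data, chunk_size)))
        for data in sites.values()
    )


def stored_chunks_global(sites: dict[str, bytes], chunk_size: int = 4) -> int:
    """All sites share one fingerprint pool, so cross-site duplicates collapse."""
    unique: set[str] = set()
    for data in sites.values():
        unique.update(chunk_fingerprints(data, chunk_size))
    return len(unique)


# Hypothetical backup data for three sites (made up for illustration).
sites = {
    "site_a": b"AAAABBBBCCCCAAAA",
    "site_b": b"BBBBCCCCDDDDBBBB",
    "site_c": b"AAAADDDDEEEEAAAA",
}

local_total = stored_chunks_local(sites)
global_total = stored_chunks_global(sites)
print(f"local-only dedupe stores {local_total} chunks")
print(f"global dedupe stores {global_total} chunks")
```

In this made-up data set the sites share several chunks, so the global pool stores fewer chunks than the three local pools combined. If the sites had nothing in common, the two totals would be the same.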
Some vendors claim global deduplication does not matter. These are the vendors that currently do not have global deduplication or whose products are seriously limited in scalability. Products from FalconStor, Sepaton Inc., IBM Corp., NEC Corp., Asigra Inc. and EMC Corp.'s Avamar all support global deduplication.
Why global dedupe matters
Global deduplication matters for the following reasons:
- Larger dedupe targets equate to higher deduplication ratios, because more data is compared against a common pool. This is not a huge factor if there is not a lot of duplication between sites.
- The bigger issue is performance and manageability. Without global deduplication, specific servers must be aimed at specific targets, and typically more targets are needed. Backup performance per target is limited to a range roughly between 230 MB/sec and 460 MB/sec. A performance requirement exceeding that range requires additional targets. If a single application requires performance greater than that range, then the target must provide global deduplication. The management burden also increases at a much faster rate than the number of targets.
- No global deduplication also means no load balancing or high availability. Each backup is aimed at a specific dedupe target. If that target goes away, so does the backup. If the backups are load balanced between two or more dedupe targets to provide failover, the deduplication ratio is divided by the number of targets: two targets divide it by two, three targets by three, and so forth. This approach does not increase target scalability very much and equates to a much higher TCO. And once again, the increased number of targets causes significantly more management.
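The arithmetic behind the last two points can be sketched as follows. The function names, the 1,000 MB/sec workload and the 20:1 starting ratio are hypothetical values chosen for illustration. The 230 MB/sec and 460 MB/sec figures and the divide-by-the-number-of-targets rule are taken from the text above.

```python
import math


def targets_needed(required_mb_per_sec: float, per_target_mb_per_sec: float) -> int:
    """Number of independent dedupe targets needed to meet a throughput requirement."""
    return math.ceil(required_mb_per_sec / per_target_mb_per_sec)


def load_balanced_ratio(single_target_ratio: float, num_targets: int) -> float:
    """Rule of thumb from the text: without global dedupe, spreading backups
    across N targets divides the deduplication ratio by N."""
    return single_target_ratio / num_targets


# Hypothetical workload: 1,000 MB/sec of aggregate backup throughput.
required = 1000.0
for per_target in (230.0, 460.0):
    count = targets_needed(required, per_target)
    print(f"{per_target:.0f} MB/sec per target -> {count} targets")

# Hypothetical 20:1 ratio on a single target, then load balanced.
for n in (1, 2, 3):
    print(f"{n} target(s): about {load_balanced_ratio(20.0, n):.1f}:1")
```

Under these assumptions, every additional non-global target adds throughput but lowers the ratio, which is why the text ties the approach to a higher TCO.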
Global deduplication for SMBs
How does global deduplication affect SMBs? A deduplication system that does not provide global deduplication may be perfectly adequate with today's requirements, but not so much in the near or distant future. That's when the issues discussed above will become a problem.
It will come down to determining your backup requirements right now, the requirements over the life of your system, the TCO of the different solutions and your system's ability to manage the different deduplication solutions.
Another alternative is to use a cloud-based online data backup service provider. Many of them (such as Digitiliti Inc., Hewlett Packard Co., SunGard, Venyu, and numerous others) provide both local and global dedupe as part of their service. The service providers are able to offer competitive pricing with extensive capabilities by spreading the cost of the software, local dedupe, global dedupe and the storage across multiple customers.
About the author: Marc Staimer is the founder, senior analyst, and CDS of Dragon Slayer Consulting in Beaverton, OR. The consulting practice of 11 years has focused in the areas of strategic planning, product development, and market development. With more than 28 years of marketing, sales and business experience in infrastructure, storage, server, software, and virtualization, he's considered one of the industry's leading experts.
This was first published in July 2009