Getting started with data deduplication

When it comes to data storage, many small to midsized businesses (SMBs) have requirements similar to those of larger firms. SMBs are often proportionally more prolific in information generation. They are more nimble, and tend to be early adopters of emerging technologies to reach their target markets. The SMB market needs storage tools that meet its growing needs. The following are questions SMBs should consider when getting started with data deduplication.


What is the difference between software- and hardware-based data deduplication products?

A software-based product involves loading an agent on the server where deduplication will be performed. The upside to an agent-based solution is easier ongoing management and fewer network devices. The downside is the performance impact on the storage servers: deduplication consumes CPU cycles and can slow transactional processing. These drawbacks can be somewhat mitigated with the use of mirrored storage servers, high-performance servers or both.

Hardware-based dedupe products are appliances that sit between the switch and the storage server. The appliance performs the deduplication itself, relieving servers of the performance impact of software-based deduplication. Many appliances also provide compression and encryption, further enhancing the deduplication solution. The downside to using an appliance is the addition of yet another device to the network and the need to monitor and manage it.

How do data storage, deduplication and regulatory compliance come together?

The answer to this question depends on the regulations your industry needs to comply with. Data storage and accessibility regulations include the Sarbanes-Oxley (SOX) Act, the Health Insurance Portability and Accountability Act (HIPAA) and the Gramm-Leach-Bliley Act (GLBA). Each regulation has its own framework and objectives that you must be able to meet.

For example, if email must be retained for more than 50 years, then deduplication becomes a practical necessity, if only from a manageability and retrieval perspective. Although you cannot rotate or overwrite disks and tapes as part of your backup cost containment strategy, deduplication will significantly reduce the number of disks and tapes required for secondary storage.

What is the difference between file-level and block-level deduplication?

File-level deduplication finds duplicate files and replaces each one with a pointer to the first instance of the file. Block-level deduplication looks deeper, inside the file, and stores only the blocks that are new or have changed. This further reduces a firm's data storage footprint, because only the components that have changed are captured rather than the entire file.
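
To make the difference concrete, here is a minimal Python sketch of both approaches. It is an illustration under stated assumptions, not how any particular product works: the function names, the sample files, the SHA-256 fingerprints and the tiny fixed 4-byte block size are all hypothetical choices made for readability. Commercial products may use much larger or variable-size blocks.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical; tiny so the example is easy to follow


def digest(data):
    """Return a SHA-256 fingerprint used to detect identical content."""
    return hashlib.sha256(data).hexdigest()


def dedupe_files(files):
    """File-level: keep each distinct file once; every name points to it."""
    store = {}     # fingerprint -> file content, stored once
    pointers = {}  # file name -> fingerprint of the stored copy
    for name, content in files.items():
        key = digest(content)
        store.setdefault(key, content)
        pointers[name] = key
    return store, pointers


def dedupe_blocks(files, block_size=BLOCK_SIZE):
    """Block-level: split files into fixed-size blocks; keep each block once."""
    store = {}    # fingerprint -> block content, stored once
    recipes = {}  # file name -> ordered list of block fingerprints
    for name, content in files.items():
        recipe = []
        for start in range(0, len(content), block_size):
            block = content[start:start + block_size]
            key = digest(block)
            store.setdefault(key, block)
            recipe.append(key)
        recipes[name] = recipe
    return store, recipes


def restore(store, recipe):
    """Rebuild a file from its ordered list of block fingerprints."""
    return b"".join(store[key] for key in recipe)


# Hypothetical sample data: one exact copy and one lightly edited version.
files = {
    "report_v1.txt": b"AAAABBBBCCCC",
    "report_copy.txt": b"AAAABBBBCCCC",  # exact duplicate of v1
    "report_v2.txt": b"AAAABBBBDDDD",    # only the last block differs
}

file_store, file_pointers = dedupe_files(files)
block_store, block_recipes = dedupe_blocks(files)

raw_bytes = sum(len(c) for c in files.values())
file_bytes = sum(len(c) for c in file_store.values())
block_bytes = sum(len(b) for b in block_store.values())
print(raw_bytes, file_bytes, block_bytes)
```

In this sample, file-level deduplication removes the exact copy but still stores the edited version in full, while block-level deduplication stores only the one block that differs. The stored pointers (or block lists) are what allow the original files to be rebuilt on restore.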

Martha Young is principal and CEO of Nova Amber LLC, a business consulting company specializing in business process virtualization.

This was first published in October 2008
