What are the drawbacks of using content-addressable storage (CAS) for archiving?

There are the usual suspects, such as cost, capacity, power and cooling, because more data is kept online than in alternative archiving solutions like tape. Some people have raised the performance limitations of CAS, but CAS was not designed for primary, I/O-intensive applications. It was designed to be used as a secondary storage tier for an active archive, so it is not expected to deliver really high I/O performance.

There is also some talk about hash collisions. A hash collision occurs when two objects generate the same CAS address even though their content is not identical. Because the address is supposed to be the unique identifier derived from the content, a CAS system that sees an existing address would treat the new object as a duplicate of the one already stored and not write it, which causes data loss. The chances of this happening are infinitesimal, but as long as there is a chance, there is a worry. So, most CAS vendors have addressed this concern by using multiple hash schemes in their algorithms.
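
To make the collision scenario concrete, here is a minimal Python sketch of a content-addressed store that derives the address from one hash and keeps a second, independent hash as a cross-check. Everything in it is an assumption for illustration: the names TinyCAS and CollisionError are hypothetical, the in-memory dictionary stands in for real storage, and the SHA-256/BLAKE2b pairing is simply one way to combine two hash schemes, not any particular vendor's algorithm.

```python
import hashlib


class CollisionError(Exception):
    """Raised when different content maps to an address already in use."""


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def blake2b_hex(data: bytes) -> str:
    return hashlib.blake2b(data).hexdigest()


class TinyCAS:
    """Hypothetical in-memory content-addressed store (illustration only)."""

    def __init__(self, primary=sha256_hex, secondary=blake2b_hex):
        self._primary = primary      # produces the CAS address
        self._secondary = secondary  # independent hash used as a cross-check
        self._objects = {}           # address -> (check digest, content)

    def put(self, content: bytes) -> str:
        address = self._primary(content)
        check = self._secondary(content)
        existing = self._objects.get(address)
        if existing is None:
            # First time this address is seen: store the object.
            self._objects[address] = (check, content)
        elif existing[0] != check:
            # Same address, different second hash: the content differs,
            # so silently deduplicating here would lose data.
            raise CollisionError(address)
        # Otherwise both hashes match: treat it as a true duplicate.
        return address

    def get(self, address: str) -> bytes:
        return self._objects[address][1]

    def __len__(self) -> int:
        return len(self._objects)
```

Storing the same bytes twice returns the same address and keeps a single copy. To see the guard fire, the primary hash can be swapped for a deliberately weak one, for example TinyCAS(primary=lambda data: data[:1].hex()), after which storing b"apple" and then b"avocado" raises CollisionError instead of discarding the second object.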

Don't assume CAS is all that is needed to meet regulatory requirements. If your CAS vendor is telling you that, find a new CAS vendor. That's a trap, so don't fall for it. Application-specific software, policies and policy enforcement, and compliance best practices all need to be considered as part of an overall archiving strategy.


This was first published in June 2008
