There is also some talk about hash collisions. A hash collision occurs when two objects generate the same CAS address even though their content is not identical. Because the address is supposed to be the unique identifier generated from the content, a CAS system that sees a matching address would treat the second object as a duplicate of the first and would not store it, causing data loss. The chances of this happening are extremely small. But as long as there is a chance, there is a worry. So, most CAS vendors have addressed this concern by using multiple hash schemes in their algorithms.
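To make the idea concrete, here is a minimal sketch of a toy content-addressed store that combines two hash schemes and also compares the bytes when an address already exists. It is not any vendor's implementation: the names `cas_address` and `ToyCAS`, the choice of SHA-256 plus BLAKE2b, and the in-memory dictionary are all assumptions made for illustration.

```python
import hashlib


def cas_address(content: bytes) -> str:
    """Build an address from two independent digests (illustrative choice of hashes)."""
    primary = hashlib.sha256(content).hexdigest()
    secondary = hashlib.blake2b(content, digest_size=16).hexdigest()
    # Two different objects would have to collide under both schemes at once
    # to end up with the same combined address.
    return f"{primary}-{secondary}"


class ToyCAS:
    """Hypothetical in-memory store keyed by content address."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, content: bytes) -> str:
        address = cas_address(content)
        existing = self._objects.get(address)
        if existing is None:
            # First time this address is seen: store the content.
            self._objects[address] = content
        elif existing != content:
            # Same address, different bytes: a collision. Refuse to treat
            # the new object as a duplicate instead of silently dropping it.
            raise RuntimeError(f"hash collision at address {address}")
        # Otherwise it is a true duplicate and nothing new is stored.
        return address

    def get(self, address: str) -> bytes:
        return self._objects[address]


store = ToyCAS()
first = store.put(b"quarterly report, final")
second = store.put(b"quarterly report, final")   # true duplicate, stored once
other = store.put(b"quarterly report, draft")    # different content, new address
```

In this sketch the byte comparison is the real safety net; the second hash only makes it even less likely that the comparison is ever reached with differing content.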
Don't assume CAS is all that is needed to meet regulatory requirements. If your CAS vendor is telling you that, find a new CAS vendor. That's a trap -- don't fall for it. Application-specific software, policies and policy enforcement, and compliance best practices all need to be considered as part of an overall archiving strategy.
This was first published in June 2008