Data file archiving reduces the amount of unstructured information on primary data storage by moving files to secondary or tertiary storage, which leaves room for more data growth in an organization. Although data storage for small- to medium-sized businesses (SMBs) differs in scale and complexity from enterprise data storage, their file archiving requirements are often the same. Greg Schulz, senior analyst at StorageIO Group, outlines what file archiving is, affordable file archive tools for SMBs and tips on how SMBs should get started with file archives. His answers are also available as an MP3 below.
Most data growth in organizations is centered around unstructured files: Word documents, Excel spreadsheets, PowerPoint presentations, PDFs, videos and audio files, which is why file archiving is so important -- it archives all of the unstructured files in an organization.
The distinction between file archiving and information lifecycle management (ILM) tends to be a gray area. Part of the reason it's blurred is that ILM is a way of thinking; it's a way of managing, governing and protecting data, making sure it's available when and where needed in a cost-effective manner. That's where it blurs with file archiving, which falls under the broad umbrella of ILM.
On the other hand, file virtualization can also get blurred with this distinction, depending on your definition of virtualization. To some, file virtualization is only virtualization if it's software running on an appliance with third-party storage. However, in a bigger and broader definition of virtualization, it's a principle and an enabler for transparent movement. Its basic tenets are abstraction, emulation, enabling, agility and flexibility. So there's a common thinking that virtualization is around consolidating, squeezing and compacting -- that's one tenet. But the other is around abstraction, emulation, transparency and agility. In other words, it can transparently move a file from point A to point B, and when you go to look for it, you may still think that it's at point A, but it's really at point B. That's why this is an area of convergence.
The difference between file archiving in large enterprises vs. SMBs ultimately comes down to scale -- scale in terms of the size of the business, its complexity and what the focus of the business is. In other words, you could have a large financial institution that is under certain regulatory requirements and has to archive certain information for certain periods of time. An SMB environment might be under those exact same requirements. For example, you might be working in a life sciences or healthcare-related environment, which means that the entire business might be under HIPAA requirements, whereas a large enterprise may only have a portion of its business under those requirements.
Also, keep in mind that file archiving is not just for compliance. Before the compliance buzz came around eight to 10 years ago, file archives were used as a time-tested technique for data storage management: optimizing data storage space and capacity, optimizing performance, reducing data backup and enabling disaster recovery (DR) -- or in other words, stretching your dollar. We're now back to that point of looking at archiving and realizing that, beyond compliance, it is an enabler for organizations to become more cost-effective, greener and more efficient, to speed up their data backups and, basically, to stretch their IT dollars.
This touches on the previous point: when you think about file archiving, you think compliance, and when you think about compliance, you think file archiving. We think in these terms because that's where the messaging has been for the last eight to 10 years. However, the reality is that any type of data and any type of file is a candidate for file archiving, regardless of the size of the organization. This applies to all types of data, from email attachments to regular files. To determine if your organization or application is a good candidate for file archives, look at all the different files that you have, whether they're on your laptop computer, desktop, workstation, server or multiple servers. Then ask yourself the following questions: When was the last time some of those files or documents were accessed? When were they actually utilized? If a long time has passed since you accessed specific files, then you should get them off of your primary storage and put them onto secondary or tertiary storage. If you get the data off of your primary storage, you can free up some space to support more data growth. But overall, any organization, regardless of its size, business and applications, can be a candidate for file archiving.
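The "when was it last accessed?" check described above can be sketched as a short Python script. This is only a minimal illustration, not a replacement for an archiving product: the function name `find_archive_candidates` and the 365-day default are assumptions made for this example, and because some systems do not record access times reliably, the sketch treats the later of the access and modification times as "last used."

```python
import os
import time


def find_archive_candidates(root, min_idle_days=365, now=None):
    """Return (path, size_bytes, idle_days) for files under `root` that
    have not been accessed or modified for at least `min_idle_days`.

    The name and the default threshold are illustrative assumptions.
    """
    now = time.time() if now is None else now
    cutoff = now - min_idle_days * 86400  # 86400 seconds per day
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            # Access times can be disabled or coarse on some file systems,
            # so use whichever of access/modify time is more recent.
            last_used = max(info.st_atime, info.st_mtime)
            if last_used <= cutoff:
                idle_days = int((now - last_used) // 86400)
                candidates.append((path, info.st_size, idle_days))
    # Longest-idle files first: these are the strongest candidates.
    candidates.sort(key=lambda item: item[2], reverse=True)
    return candidates


if __name__ == "__main__":
    for path, size, idle in find_archive_candidates(".", min_idle_days=365):
        print(f"{idle:5d} days idle  {size:12d} bytes  {path}")
```

The script only reports candidates; it does not move anything. Which files actually go to secondary or tertiary storage remains a policy decision.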
What data file archiving tools are available? Are there different types for different types of data?
Different tools are tied to different types of data. For example, email archiving tools are tied to email, and file archiving tools are tied to unstructured files. There are also tools that are tied to particular application types. Many tools are optimized to work with video or audio files, text files, or Word documents. Some tools incorporate resource analysis, resource reporting, e-discovery and file reporting along with file movement, management and target storage. Other tools cater to just e-discovery or classification. Some are policy managers, and some are the actual storage devices. In an SMB data storage environment, it may not be necessary to invest in several different file archiving tools. Due to the scale and simplicity of smaller business data storage environments, you may be able to get an all-in-one or multi-function toolset that allows you to do more with what you have.
You'll be able to get the most affordable tools for your SMB if you understand your data storage needs. To understand your needs, ask yourself the following questions: Are you trying to archive everything in your environment? Are you trying to find a package that does email, file and database archiving? Are you looking for nice-to-have features vs. what you have? Look at several different products that will fit your needs. And there are several of them out there. You have startups like Acronis Inc., established names like CA, Dell, EMC Corp., Hewlett-Packard (HP) Co., IBM Corp., the media makers like Fujifilm Corp., hosting sites like Iron Mountain Inc., other vendors like NetApp, your traditional media types like Quantum Corp., and vendors like Symantec Corp. and Tek-Tools Software. They all have different offerings that can scale down to meet the needs of small businesses, but also allow flexibility.
What are some tips on how to get started with archiving, or on taking the next step based on what we have discussed today?
The number one tip is to have an understanding of what files you're going to archive and why. For example, if you need to do compliance, you need to have tools that will meet those needs. But also, are you going to try to do everything, or are you going to take bite-sized chunks? Don't get ahead of yourself. Establish a strategy first. Understand what you're going to do, how you're going to do it, and then start to work towards that goal, identifying what data you have along the way. The key to all this is having insight and awareness of what you have, what you use, and knowing what you can move where, when and why.
Gain awareness. Awareness is critical. Having insight into your data storage environment is probably the most valuable step towards file archives, because if you know where you're at, then you can determine where you need to be, and then you can figure out how to get there.
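One simple way to start gaining that awareness is to inventory what is on a file system by file type. The following Python sketch is a hypothetical starting point rather than a full storage resource management tool; the function name `summarize_by_extension` is an assumption made for this example. It counts files and totals their sizes per extension so you can see where capacity is going.

```python
import os
from collections import defaultdict


def summarize_by_extension(root):
    """Return a list of (extension, file_count, total_bytes) for files
    under `root`, sorted with the largest total first.

    Illustrative helper only; the name is an assumption for this sketch.
    """
    totals = defaultdict(lambda: [0, 0])  # extension -> [count, bytes]
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # unreadable or vanished file; skip it
            # Lowercase so ".DOC" and ".doc" are counted together.
            ext = os.path.splitext(name)[1].lower() or "(none)"
            totals[ext][0] += 1
            totals[ext][1] += size
    rows = [(ext, count, size) for ext, (count, size) in totals.items()]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


if __name__ == "__main__":
    for ext, count, size in summarize_by_extension("."):
        print(f"{ext:10s} {count:8d} files {size:14d} bytes")
```

A report like this shows what you have; combining it with last-access information shows what you actually use, which is the insight needed before deciding what to move where.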