Using storage tiering to control customers' data growth


Seiji Shintaku, Contributor
11.02.2009
Comedian George Carlin once called people's possessions "stuff." A house is just a place for your stuff, he said, and after a while you need a bigger house. Why? Too much stuff! The same is true with storage. It's a place for everyone's stuff, and after a while you need to add shelves or get a new array. Why? Too much stuff! While it's easy to joke about it, managing exponentially growing data is a difficult challenge that needs serious consideration. Storage tiering is becoming increasingly necessary as a remedy to the problem.

While there are multiple factors that cause data to grow and take on a life of its own, the growth is mostly driven by email and digital media files. The files in the latter group typically aren't the kinds of files that are used for legitimate business at most companies, but they certainly have their way of making it onto the corporate network. And during one data migration process for a financial services customer, it became apparent to me that much of that data wasn't important to the business; the users had just accumulated it over time and didn't want to part with it, even though most of the files hadn't been accessed in years. They hang on to those files for a couple of reasons: They think that someday they may need them, and/or it takes too much time to sort through them and figure out what's really important.

When viewed from the employee's perspective, those extraneous files seem innocuous. But, of course, from the macro perspective, the data growth problem can add costs and headaches in many areas. For example, in the data migration project I mentioned above, while moving data from Windows servers to a network-attached storage (NAS) environment, the cutover nights took more time than was budgeted because users' Exchange email archives were so big (3 GB to 6 GB). When you're trying to migrate 100-plus users a week, that's a lot of data.

To try to control the problem of overburdened and slow email databases, many IT departments have instituted space quotas on email accounts. Many users respond to the quota by archiving their email to their home directory on the network, which essentially shifts the burden from the email server to network storage devices and leads to the problem I ran into in the data migration project I just mentioned. Saving that email data onto storage systems has a large ripple effect: Data saved on a file system gets saved within snapshots, gets replicated to secondary arrays and gets backed up onto a virtual tape library (VTL) or traditional tape library. One way to combat that data growth problem and cut down on the costs associated with such bad user habits is by tiering data off of expensive primary storage.

Storage tiering

We know that a large percentage of data has not been accessed for over a year. For instance, one analysis I ran within a global hierarchical storage management (HSM) environment showed that we'd save roughly 50% of storage space by archiving files that had not been accessed in a year. (While some experts suggest that you can move 70% of data off of primary storage, I feel comfortable using 50% as the estimate. And while 50% is substantially lower than 70%, it's still a huge number.)
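To get a rough feel for that kind of number on a customer's file share, a short script can total the bytes held by files whose access date is older than a year. The following is only a minimal sketch, not the analysis described above: the function name `stale_space_report` and its parameters are assumptions made for illustration, and it relies on the file system reporting trustworthy access times.

```python
import os
import stat
import time

SECONDS_PER_DAY = 86400


def stale_space_report(root, max_age_days=365, now=None):
    """Return (total_bytes, stale_bytes) for regular files under root.

    A file counts as stale when its last-access time (st_atime) is older
    than max_age_days. The one-year default mirrors the estimate above.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    total = 0
    stale = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(info.st_mode):
                continue  # count regular files only
            total += info.st_size
            if info.st_atime < cutoff:
                stale += info.st_size
    return total, stale
```

Dividing the second value by the first gives the share of space that is a candidate for a lower tier. Treat the result with some caution: volumes mounted with access-time updates disabled or relaxed, and backup or antivirus scans that open every file, can make access dates look older or newer than real usage.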

There are some things to be wary of when applying storage tiering to customers' unstructured data. For example, if the data is moved from a Windows/Unix environment to a NAS environment, make sure that the metadata of the files (stored in the inode on Unix file systems) is retained. The most important metadata, at least when it comes to tiering, are the creation, modified and access dates. If that metadata isn't retained, all files will have the wrong access date after the migration, because each file is stamped with the current date when it is moved. (To learn how to retain that metadata during a data migration, refer to the documentation for whatever migration tool you use, such as robocopy or rsync.) If the original metadata is lost, it is impossible to tier data correctly.
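The idea of retaining the dates can be illustrated with a small script, independent of any particular migration tool. This is a hedged sketch for a single file, with hypothetical helper names (`copy_preserving_times`, `times_match`): it records the source's access and modified times before reading the file, copies the contents, then stamps the copy with the recorded times. It does not handle the creation date, which Python's standard library cannot set portably, so a real migration should still rely on the tool's own options.

```python
import os
import shutil


def copy_preserving_times(src, dst):
    """Copy src to dst, then stamp dst with src's original atime and mtime.

    The source is stat'ed before the copy because reading the file may
    itself update the source's access time.
    """
    before = os.stat(src)
    shutil.copyfile(src, dst)  # copies file contents only
    os.utime(dst, ns=(before.st_atime_ns, before.st_mtime_ns))
    return before


def times_match(src_stat, dst, tolerance_s=2.0):
    """Check that dst carries the access and modified times in src_stat.

    A small tolerance allows for file systems with coarse timestamps.
    """
    after = os.stat(dst)
    return (abs(after.st_atime - src_stat.st_atime) <= tolerance_s
            and abs(after.st_mtime - src_stat.st_mtime) <= tolerance_s)
```

Running a check like `times_match` against a sample of migrated files before the cutover is a cheap way to catch a migration job that is silently resetting dates.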

Assuming the metadata has been properly retained, you should archive data based on the access date of the files; the access date changes every time a file is opened. (Using the modified date is not advisable since a file doesn't need to be modified to be of recent value to an organization. Think of how many times you open a Word or Excel document but never change it.)
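The difference between the two dates is easy to see with a couple of made-up file records. In this sketch the `FileRecord` type, the `archive_candidates` function and the file names are all hypothetical, chosen only to show how the selection rule behaves.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    path: str
    accessed: datetime
    modified: datetime


def archive_candidates(records, now, max_age=timedelta(days=365), key="accessed"):
    """Return paths whose chosen date (accessed or modified) is older than max_age."""
    cutoff = now - max_age
    return [r.path for r in records if getattr(r, key) < cutoff]


now = datetime(2009, 11, 2)
records = [
    # Opened two weeks ago but unchanged since 2007: still in active use.
    FileRecord("budget_template.xls", datetime(2009, 10, 20), datetime(2007, 3, 1)),
    # Neither opened nor changed since 2007: a genuine archive candidate.
    FileRecord("old_photos.zip", datetime(2007, 6, 15), datetime(2007, 6, 15)),
]

by_access = archive_candidates(records, now)
by_modified = archive_candidates(records, now, key="modified")
print(by_access)    # ['old_photos.zip']
print(by_modified)  # ['budget_template.xls', 'old_photos.zip']
```

Selecting on the modified date would push the template that someone opened two weeks ago off primary storage, which is exactly the outcome the access-date rule avoids.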

One big mistake I have seen some customers make is to migrate their data, but not to cheaper drives. In other words, they go through the pain and suffering of having an inline device move their users' data onto another drive, but the other drive is, for example, Fibre Channel. If the main reason you want to use storage tiering on your customers' data is to reduce their capital expenditure, then you need to migrate data to less expensive storage, such as SATA drives. If you're going to go through the process of moving data to control data growth, it should be done to save capital expenditure and not just to free up disk space on primary storage.

For your customers' data that's not being accessed at all, you should move it to a lower storage tier, such as a WORM, tape or VTL solution.

About the author

Seiji Shintaku is a principal consultant for RTP Technology. Before joining RTP Technology, he was global NetApp engineer for Lehman Brothers, Celerra and DMX engineer for Credit Suisse First Boston, principal consultant for IBM, and global Windows engineer for Morgan Stanley. RTP Technology is a VAR for storage-related products and professional services for NetApp, EMC, F5, Quantum, VMware and Brocade. He can be reached at sshintaku@rtptech.com.

All Rights Reserved, Copyright 2006 - 2009, TechTarget