Solved

Managing SAN Volumes/LUNs efficiently

Posted on 2013-01-21
24
404 Views
Last Modified: 2013-01-26
I suppose disk space is an issue for everyone, but our requirement may be a bit different. We want to manage our SAN volumes/LUNs in smaller chunks to minimize the impact of any disaster. For example, if all of our users are stored in one LUN and something happens to the SAN or that LUN, all users will be affected; therefore, we want to divide them into multiple chunks.

Our users' home folders take up 10TB on our SAN, currently on a single LUN. We want to divide them into 5 chunks of 2TB each. I need suggestions on which criteria we should use to distribute our users, and how we will maintain it in the future. Or is there another approach you would suggest?
0
Comment
Question by:A1opus
  • 9
  • 4
  • 4
  • +3
24 Comments
 
LVL 38

Expert Comment

by:Aaron Tomosky
ID: 38803898
I don't personally do home folder redirection, but I think home folder locations are set at the OU level with a GPO, so that's where I'd start looking.
0
 
LVL 2

Author Comment

by:A1opus
ID: 38803968
Well, that's the second step, after I divide my existing user volume. What I mean is: when I divide my single SAN LUN into 5, I will get 5 volumes to which I need to copy my existing users' data. Then I also have to redirect the users' home folder paths to the new locations.
0
 
LVL 76

Expert Comment

by:arnold
ID: 38804005
The problem you will run into with this setup is that a folder can only be redirected once. If a user transitions from one OU to another, the process requires redirecting the folder back to the profile and then from the profile to the new location.
A backup plan is a must with that much user data.
0
 
LVL 20

Expert Comment

by:SelfGovern
ID: 38804052
I don't know if what you're asking even makes sense, because you're not telling us what your underlying hardware looks like.

One of the advantages of a SAN can be storage virtualization.  Many of the storage subsystems that people put on SANs to hold their data virtualize the disks in such a way that it makes no sense for you to be worried about where the data physically sits.

So you take an EVA or XIV or another unit that stripes the data across all the disks, and if things get bad enough that you lose one LUN, you've lost them all. Because of the architecture inherent to these devices, your chance of losing any data is far less than if you partition them up and try to do the kind of LUN (micro-?)management you're talking about.

Even with a device like the HP MSA -- unless you're working with really small disks, you're probably far better off creating a big LUN with high redundancy (RAID 10, possibly RAID 6) and implementing proper DR and backup processes, than worrying about how to lose only some people's data when something bad happens.

What storage system(s) are you using?  How are the disks configured?  How do you protect your data?
0
 
LVL 2

Author Comment

by:A1opus
ID: 38804069
Well, first of all, user home folders are configured in their account properties rather than per OU. We use OU-based redirection for other drives via Group Policy Preferences. OU-based home folders are dangerous indeed.

My real challenge is to divide users among 5 LUNs. It can be done, but on what criteria?
0
 
LVL 2

Author Comment

by:A1opus
ID: 38804079
Hi SelfGovern,

Thank you for your reply. You are right; maybe I didn't explain it well. In fact, our backup is taking too long to back up 10TB of data; therefore, we want to divide the LUN into 5 chunks so we get 5 drives of 2TB each. Then backup and recovery will be easier for each 2TB.

We are using Oracle Pillar Data Systems.
0
 
LVL 38

Expert Comment

by:Aaron Tomosky
ID: 38804150
Sounds like maybe we should work on speeding up your backup instead of splitting up home folders ;)
How do you do backups?
0
 
LVL 2

Author Comment

by:A1opus
ID: 38804162
Through Microsoft DPM. This is what I'm saying: backing up and restoring 10TB of data takes far more time than 2TB.

We can prioritize which drive we want to restore first in case of disaster. But with a single 10TB volume, we have to wait for the whole restore.
0
 
LVL 76

Expert Comment

by:arnold
ID: 38804333
The difficulty is that five LUNs represent five shares, or five paths.

As others have pointed out, it is not simple to disperse users equally across the LUNs while avoiding grouping users with too much data together such that they run out of space.

I'm not sure how you ended up with user data in the 10TB range.
One option is archiving, where you would "offload" unused user data to another LUN:

\\domain\rootdfs\userfolders
\\domain\rootdfs\userfoldersarchive
You would then relocate files that have not been accessed in a certain amount of time into the archive.

Users will need access to the archive folder and must be aware that they can move a file back if they need it. Alternatively, you could have a monitoring process on the archive that moves recently accessed files back to the live share.

Depending on what kind of user data makes up the 10TB, a document management system might be a better way to manage it.
0
 
LVL 36

Accepted Solution

by:
ArneLovius earned 400 total points
ID: 38807947
The only accurate method is to do it manually; this, of course, is an administration nightmare.

I would take a list of usernames, record how much data each user has, add up all of the users' data, and split it into 26 parts by initial letter of username.

Find the 4 split points that are closest to equal. If you have a huge spike in one area (such as half of the users being called Robert), you might split by surname instead.

The key point is that you need something that is simple to administer; naming the user data root shares (or servers) "A-F", "G-L", etc. then makes it simple to put each new user's folder in the right place.
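Those split points can be computed rather than eyeballed. A rough sketch, assuming you can export a username-to-size listing from the file server (the function name and the greedy cut-off logic are illustrative, and it assumes usernames start A-Z):

```python
from string import ascii_uppercase

def split_by_initial(user_sizes, chunks=5):
    """Group users into `chunks` alphabetical ranges of roughly equal
    total size. user_sizes maps username -> data size (any unit)."""
    # Total data per initial letter, in alphabetical order.
    per_letter = {c: 0 for c in ascii_uppercase}
    for user, size in user_sizes.items():
        per_letter[user[0].upper()] += size
    target = sum(per_letter.values()) / chunks

    groups, current, current_size = [], [], 0
    for idx, letter in enumerate(ascii_uppercase):
        current.append(letter)
        current_size += per_letter[letter]
        letters_left = len(ascii_uppercase) - idx - 1
        groups_left = chunks - len(groups) - 1
        # Close a group once it reaches the target, keeping at least
        # one letter in reserve for each remaining group.
        if current_size >= target and groups_left > 0 and letters_left >= groups_left:
            groups.append((f"{current[0]}-{current[-1]}", current_size))
            current, current_size = [], 0
    groups.append((f"{current[0]}-{current[-1]}", current_size))
    return groups
```

With a perfectly uniform distribution this yields ranges like "A-F", "G-L", "M-R", "S-X", "Y-Z"; with a spike on one letter the ranges shift to compensate, which is exactly the "find the closest-to-equal split points" step done by hand above.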

I am presuming that as you are using DPM for backup, the shares are on a Windows file server.

Splitting a single LUN into 5 will not increase the speed of the DPM backup, because the backup will still run sequentially. If, however, you split the LUN into 5 and put each new LUN on a different file server, then by parallelising the backup you could see a speed increase; obviously the DPM server bandwidth and DPM storage bandwidth also have to be considered.
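The sequential-versus-parallel point can be illustrated with a toy model (backup_lun here is a sleep-based stand-in for a real backup job, not a DPM API):

```python
from concurrent.futures import ThreadPoolExecutor
import time

def backup_lun(lun):
    """Stand-in for a real backup job; sleeps to simulate I/O time."""
    time.sleep(0.2)
    return f"{lun} backed up"

luns = ["LUN1", "LUN2", "LUN3", "LUN4", "LUN5"]

# Sequential (one LUN on one server): total time is the sum of the jobs.
start = time.time()
sequential = [backup_lun(lun) for lun in luns]
seq_elapsed = time.time() - start

# Parallel (one LUN per file server): total time approaches the
# duration of the single longest job.
start = time.time()
with ThreadPoolExecutor(max_workers=5) as pool:
    parallel = list(pool.map(backup_lun, luns))
par_elapsed = time.time() - start
```

In practice the speedup is bounded by the DPM server's network and storage bandwidth, as noted above, not just by the number of parallel readers.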

For 10TB of data I would suggest having a "mirror" copy (DFS Replication?) of the live data and running the backup from the mirror. If you use a DFS namespace for the shares, then in the event of an outage you can go live with the mirror much faster than with any restore. As the "mirror" is only there for DR purposes, it could run on significantly lower-spec hardware than the main server. The ROI calculation is quite easy: take your recovery time and multiply it by the hourly cost of not having the data; you can also add the reduction in restoration management time...

If you are already using VSS on the main storage, you could reduce the amount of (expensive) VSS storage on the main server and use more (inexpensive) VSS storage on the mirror, keeping DPM for "archive"-level retrieval and DR capabilities.

Backing up from the mirror also removes the backup load from the file server(s). The file server(s) would still have the load of copying to the mirror, but that happens on new or changed files and as such is hopefully more evenly distributed than the backup load.

I did something similar for a client with ~15TB of main file shares. They put in an "inexpensive" mirror using DFS Replication, reduced VSS to cover 24 hours on the main file server, and were able to delay replacing the main file server storage by 12 months. With the reduction in £/TB, the mirror server had a positive ROI in under 12 months. They were also able to increase VSS to cover 2 months, which had the side effect of reducing the number of required restores from backup (people deleting/overwriting a file accidentally and not realizing until later) to zero :-) In their case, after the initial replication (with pre-seeding), the mirror server and DPM server were moved to a different site (linked with a private point-to-point gigabit connection) to further increase DR capabilities.
0
 
LVL 2

Author Comment

by:A1opus
ID: 38808188
Dear ArneLovius,

Excellent reply! In fact, you understood my question perfectly; this is what I was looking for.

Yes, we are using a Windows file server. We are already mirroring our data to two offsite locations: one is in a nearby building and the other is out of the country. Currently we are using InMage to replicate our data abroad, and DPM-to-DPM for the nearby site. How about that?

Second, what if we split our data on the basis of departments? Would that be more manageable?
0
 
LVL 2

Author Comment

by:A1opus
ID: 38808201
And one more thing: we are using 3 DPM servers.
0
 
LVL 36

Expert Comment

by:ArneLovius
ID: 38808244
:-)

As it is a shared resource...

If you split by department and add a new department, which LUN would you put it on?

Splitting by department also means that you would have to move a user's data if they move department; this is increased management overhead.

I would split by name. It gives the most equal balance now and in the future, there is no requirement to move a user's data, and it is simpler to manage because you don't have to remember which LUN a given department is on.

If, however, you have thin provisioning on your SAN, I might be tempted to have a LUN per department; looking at how much is used on each volume gives you a quick, low-impact view of how much storage each department is using, rather than having to enumerate each department's user folders.

I presume that you are not using either Windows Single Instance Storage or deduplication on the SAN; if you are, your storage requirements may go up.
0
 
LVL 20

Assisted Solution

by:SelfGovern
SelfGovern earned 100 total points
ID: 38808515
If the problem is that your backup takes too long, I'm inclined to suggest you *fix the backup*, not that you take on all manner of new complexity in an effort to make it look like you don't have a backup problem.

There are a couple of possible solutions to this kind of problem. One is to use a backup product such as HP Data Protector with incremental-forever scheduling and synthetic full backups. You run a one-time full backup to disk, and after that you only run incremental backups -- so your backup window is only as long as an incremental backup takes.

Then, periodically, you tell the program to use the data from those incremental backups to create a synthetic full backup -- which looks exactly as if it had been created as a full backup at the time the last incremental was run. It sounds more complicated than it is. If you don't use tape today, you will have to buy some tape hardware -- probably at least an autoloader, and a bunch of tapes. But you'll have a much more robust solution than your current tapeless setup.
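A simplified model of the synthetic-full idea, with each backup represented as a dict of path-to-content (this illustrates the merge logic only, not Data Protector's actual on-disk format):

```python
def synthesize_full(full, incrementals):
    """Build a synthetic full backup: start from the last real full
    and replay each incremental in order. Each backup is modelled as
    a dict of {path: content}; None content marks a deleted file."""
    synthetic = dict(full)
    for inc in incrementals:
        for path, content in inc.items():
            if content is None:
                synthetic.pop(path, None)   # file deleted since the full
            else:
                synthetic[path] = content   # file added or changed
    return synthetic

# The result looks exactly as if a full backup had been taken at the
# time of the last incremental -- without re-reading the source.
full = {"a.txt": "v1", "b.txt": "v1"}
incs = [{"a.txt": "v2"}, {"c.txt": "v1", "b.txt": None}]
print(synthesize_full(full, incs))  # {'a.txt': 'v2', 'c.txt': 'v1'}
```

The backup window stays at incremental size because the merge runs against already-backed-up data on the backup target, not against the live file server.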

Alternately, invest in (probably three) storage subsystems that allow you to do remote snapshots. What you're going to do is cascade the snapshots, and then take a backup off the junior member... but this will involve a significant investment in storage hardware.
0
 
LVL 36

Expert Comment

by:ArneLovius
ID: 38808566
@SelfGovern, have you looked at how DPM works?
0
 
LVL 2

Author Comment

by:A1opus
ID: 38808713
@SelfGovern:

Thanks for your valuable input. This discussion is getting very interesting. You are right that there could be an issue with our backup. In fact, there was one when we were using Symantec Backup Exec; it was too slow in every respect. A few months ago, we had a disaster when two HDDs went bad in the SAN and our whole data set was pinned. Recovering the data from our replicated copy was a nightmare because we had to run chkdsk on that 10TB volume. If we had had smaller chunks, we would have been in a better position.

Therefore, we are splitting our data up into 10 volumes (2TB each).
0
 
LVL 76

Expert Comment

by:arnold
ID: 38808768
Using redirection GPOs that revert would streamline the copy-back and redirect.
I.e., the folder redirect is set at the Site/OU level; if the user falls out of the policy's scope, the redirect reverts. The implication is that following a transition from one location to another, the user's login will take a long time while the data is copied back, and then again when the data is copied out under the new redirect rule.

You seem to be locked into one approach, which will complicate matters rather than improve them. Depending on the SAN, splitting into multiple LUNs may not help either: multiple LUNs can still fail when two disks fail.

Logically reducing the data based on usage/need will reduce the overall amount, speeding up access as well as simplifying management and backups.
0
 
LVL 20

Expert Comment

by:SelfGovern
ID: 38812995
Arne asked: @SelfGovern: have you looked at how DPM works?

<sigh> Yes. I got distracted and ended up giving a correct but useless answer. What makes Data Protector a potential standout here is that it performs multithreaded reads -- you can have several active reader processes reading data at once, so there's not the same kind of bottleneck as with a less flexible backup program.

Maybe this is irrelevant again, but I don't recall seeing an option for multithreaded reads in DPM.
0
 
LVL 2

Author Comment

by:A1opus
ID: 38813035
Hi,

Thanks, everyone, for your valuable input. We have already worked out the logic and are going to split our drives into 10 volumes. As for how those drives will be backed up, our backup team will take care of it.

Now we need to copy the data from one server to another. Is there any better way than Robocopy? Robocopy works fine, but if there is anything better out there, I want to know.
0
 
LVL 76

Expert Comment

by:arnold
ID: 38813056
Copying is the simple part; the problem is updating the user profiles with the changes.
You have to use GPO to revoke the redirect, and then allow the new redirect to take effect after two or three logins. You should test it on one account transition first.
The alternative is to use VBScript to modify the user's registry entries to point to the new location.
Either way, test first.
0
 
LVL 20

Expert Comment

by:SelfGovern
ID: 38813058
Perhaps this is a Whole Nother Question.
In general, it depends on what's on the drives. If it's flat-file stuff, you can force people off, then copy it and get a good copy. If there is any kind of database or stateful information, or any process that writes to the LUNs in the absence of users being logged on, kicking the users off won't necessarily get you a consistent (and therefore safe, usable, working) copy.

The only way to ensure a consistent copy in that case is to detach the LUNs and copy when they are off-line.
0
 

Expert Comment

by:ajs21
ID: 38814733
I have a similar environment. My approach is the following:
Create a RAID-5 volume for the user shares, configure nightly snapshots of the data, and back up with a monthly FULL. Data loss only happens if more than one drive fails. Users can recover data from within the last 29 days independently of IT resources.
0
 
LVL 36

Expert Comment

by:ArneLovius
ID: 38815922
@SelfGovern

That's really useful to know about HP Data Protector, thanks!
0
 
LVL 2

Author Closing Comment

by:A1opus
ID: 38822521
After a long time, I got truly valuable replies on Experts Exchange, but unfortunately I have to give the points to the replies that provided the solution to my need.

Thanks heaps to all of y'all.
0
