fireup

Large RAID setup

This question is related to - https://www.experts-exchange.com/questions/22728657/Server-Purchases-and-Setup.html

I'm looking for advice on how to set up a large array of disks. My hardware setup will likely be a DL380 G5 with a P800 controller hooked to an MSA70 shelf of 2.5" 146GB disks. I will be starting off with about 16 disks.

This is my first go at setting up a large disk array and I could use some industry experience on how to do it: best practices, security concerns, recommended partition sizes, recommended RAID levels, etc.

Can anyone help ?
captain

You need to let us know what you intend to use it for before we can recommend partition sizes. Going for a large array implies that you need large capacity, so certain RAID levels will be more suitable once you look at the RAID overhead:
http://www.raid-array.co.uk

Have a look at this site; it explains all the RAID levels and their pros and cons.
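To make the overhead point concrete, here is a minimal Python sketch comparing usable capacity across common RAID levels for the 16 x 146GB disks mentioned in the question. It assumes a single array with no hot spares and uses nominal rather than formatted disk size; the function and table names are made up for illustration, and a real P800 layout may well be split into several arrays.

```python
# Rough usable-capacity comparison for one array of equal-sized disks.
# Assumptions (not from the thread): a single array, no hot spares,
# and nominal disk size rather than formatted capacity.

DATA_DISKS = {
    "RAID 0": lambda n: n,         # striping only, no redundancy
    "RAID 1+0": lambda n: n // 2,  # mirrored pairs, half the disks hold copies
    "RAID 5": lambda n: n - 1,     # one disk's worth of parity
    "RAID 6": lambda n: n - 2,     # two disks' worth of parity
}


def usable_capacity_gb(level: str, disks: int, disk_gb: int) -> int:
    """Return nominal usable capacity in GB for the given RAID level."""
    if level == "RAID 1+0" and disks % 2:
        raise ValueError("RAID 1+0 needs an even number of disks")
    return DATA_DISKS[level](disks) * disk_gb


if __name__ == "__main__":
    for level in DATA_DISKS:
        print(f"{level:9s} {usable_capacity_gb(level, 16, 146):5d} GB usable")
```

For 16 disks of 146GB this works out to 2336GB (RAID 0), 1168GB (RAID 1+0), 2190GB (RAID 5) and 2044GB (RAID 6), which shows how much capacity each level gives up for redundancy.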

What is important is that you consider the heat the drives will generate. Heat is one of the biggest causes of drive failure, and inadequate cooling will cost you $$$ for new drives, so plan for air conditioning that takes the new heat levels into account.

hth
SysExpert

In addition, what about backups, long-term storage, speed, and redundancy? The application type affects all of this, as do growth rates and the amount of storage you will need in 1 or 2 years.

I hope this helps !
ASKER CERTIFIED SOLUTION
Member_2_231077

fireup

ASKER

Thanks for all the suggestions so far, to answer one of the questions above - I will be running an application that will be using a SQL 2005 backend which will hold medical data and large images. I will have a user count of around 1000 to 2000 with any number of users hitting this database at the same time via the web application. Security and Performance are my two biggest concerns.
I don't think it's a good idea to store the actual images in SQL, but you will have to check in the SQL topic area to verify.

You haven't got enough disks for 2000 concurrent users. You'll have to estimate how many will actually be retrieving or sending data/images at one time, then give them something like 1/10th of a disk each, so 1000 concurrent users get 100 disks. Concurrent does not count users who are just staring at the screen trying to work out how to perform a brain implant, of course.
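The arithmetic behind that rule of thumb can be sketched in a few lines of Python. This is only an illustration of the 1/10th-of-a-disk guess from this post, not a sizing method; the function names and the default of 10 users per disk are taken from that guess and are otherwise hypothetical.

```python
# Rule-of-thumb disk sizing: roughly 1/10th of a disk per concurrently
# active user, i.e. about 10 active users per disk. The ratio is an
# educated guess from the thread, not a measured figure.

def disks_needed(active_users: int, users_per_disk: int = 10) -> int:
    """Disks required for the given number of concurrently active users."""
    if active_users < 0 or users_per_disk <= 0:
        raise ValueError("active_users must be >= 0 and users_per_disk > 0")
    # Ceiling division with integers, so a partial disk rounds up.
    return -(-active_users // users_per_disk)


def users_supported(disks: int, users_per_disk: int = 10) -> int:
    """Concurrently active users a given number of disks might serve."""
    if disks < 0 or users_per_disk <= 0:
        raise ValueError("disks must be >= 0 and users_per_disk > 0")
    return disks * users_per_disk


if __name__ == "__main__":
    print(disks_needed(1000))   # 100 disks for 1000 active users
    print(users_supported(16))  # 160 active users for the initial 16 disks
```

By the same guess, the initial 16 disks would cover roughly 160 users who are actively transferring data at the same moment.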

Security is down to SQL and NTFS permissions on the backend and how you authenticate the users on the front end. I don't think you'll want this integrated with Active Directory, or it may mean 1000 user licenses for Windows as well as the processor license for SQL.
Boy, you are going about this the WRONG way.  You are crippling yourself before you begin.

First, use 3.5" drives, not 2.5, their latency and seek is WAY TOO SLOW for a serious database.

Second, if you need a 2 TB array, run TWO 500GB SATA drives in a RAID 0 array, and install another 2 IDE drives in a RAID 1 array, to mirror the two SATA drives in the primary array.  This way, you get the high speed data throughput of the primary array, but you are backing it up to a secure IDE mirror array that will be recoverable from any system, in the event of failure.

And finally, don't make the mistake of buying WD drives for any of this array.  They die too soon.  Buy only Hitachi IBM, as they are the most reliable on the market.  You will realize I am right if you buy WD.
>First, use 3.5" drives, not 2.5, their latency and seek is WAY TOO SLOW for a serious database.

I suspect you are thinking about laptop/notebook drives, scrathcyboy; these 2.5" SAS disks are designed not for the laptop but for the datacentre. They require less energy and therefore create less heat than the 3.5" ones, so as well as being able to get almost twice as many per U, you don't have to double up the power supply and aircon to compensate for having twice the disk density in the rack.

Here are the seek time specs of a 2.5" 72GB 15K SAS disk:
Single Track 0.20 ms
Average 3.0 ms
Full-Stroke 7.0 ms

I don't think you are going to find a 3.5" disk with specs like that whatever interface it has.
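To relate those seek figures to throughput, here is a rough Python sketch of the usual back-of-envelope estimate: average service time is average seek plus average rotational latency (half a revolution), and random IOPS is the reciprocal. It ignores transfer time, queueing and controller cache, so treat the result as an order-of-magnitude figure only; the function names are made up for illustration.

```python
# Back-of-envelope random I/O estimate for a single spinning disk.
# Ignores transfer time, command queueing and controller cache.

def avg_rotational_latency_ms(rpm: int) -> float:
    """Average rotational latency: half of one revolution, in ms."""
    return 0.5 * 60_000 / rpm


def approx_random_iops(avg_seek_ms: float, rpm: int) -> float:
    """Approximate random IOPS from average seek time and spindle speed."""
    service_time_ms = avg_seek_ms + avg_rotational_latency_ms(rpm)
    return 1000 / service_time_ms


if __name__ == "__main__":
    # 15K disk with the 3.0 ms average seek quoted above.
    print(avg_rotational_latency_ms(15_000))  # 2.0 ms
    print(approx_random_iops(3.0, 15_000))    # 200.0 IOPS
```

With the 3.0 ms average seek quoted above, a 15K disk comes out at about 5 ms per random I/O, or roughly 200 IOPS per spindle.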
@scrathcyboy
With respect, andy is right: these drives are THE choice for this setup. It is not a question of which drives but of which setup.

@fireup, andy is the man here to help you solve this and has given some good advice; SysExpert makes a valid point too regarding backup.

Let us know if you need anything else
The thing I'm not sure of is whether it is better to store images in SQL as BLOBs or use pointers to files.
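For anyone unfamiliar with the two options, the difference can be sketched as follows. This uses Python's built-in sqlite3 purely as a stand-in for SQL Server 2005, and the table and function names are hypothetical; it only illustrates the two storage shapes and says nothing about which performs better.

```python
import sqlite3
from pathlib import Path

# sqlite3 is only a stand-in for SQL Server here; names are hypothetical.

def open_demo_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # Option 1: image bytes live inside the database as a BLOB.
    conn.execute("CREATE TABLE image_blob (id INTEGER PRIMARY KEY, data BLOB NOT NULL)")
    # Option 2: the database holds only a pointer (path) to a file on disk.
    conn.execute("CREATE TABLE image_ref (id INTEGER PRIMARY KEY, path TEXT NOT NULL)")
    return conn


def store_as_blob(conn: sqlite3.Connection, data: bytes) -> int:
    cur = conn.execute("INSERT INTO image_blob (data) VALUES (?)", (data,))
    return cur.lastrowid


def load_blob(conn: sqlite3.Connection, image_id: int) -> bytes:
    row = conn.execute("SELECT data FROM image_blob WHERE id = ?", (image_id,)).fetchone()
    return row[0]


def store_as_pointer(conn: sqlite3.Connection, data: bytes, image_dir: Path, name: str) -> int:
    path = Path(image_dir) / name
    path.write_bytes(data)  # the image itself goes to the file system
    cur = conn.execute("INSERT INTO image_ref (path) VALUES (?)", (str(path),))
    return cur.lastrowid


def load_pointer(conn: sqlite3.Connection, image_id: int) -> bytes:
    row = conn.execute("SELECT path FROM image_ref WHERE id = ?", (image_id,)).fetchone()
    return Path(row[0]).read_bytes()
```

With pointers, the database stays small and the images can be backed up as ordinary files, but the files and the rows have to be kept in step by the application; with BLOBs, everything is covered by the database's own permissions and backups.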
fireup

ASKER

andyalder, thanks for all the great advice so far. I will be confirming today with the developers of the application how the web application will be handling images. Once I hear back I will let you know.

As for the backup situation, I have something in mind, but leveraging this forum has really helped me out. Would you have any recommendations for backing up this amount of data?

Also, regarding your equation for determining the number of users per disk: is there any supporting documentation anywhere that I could read further on this? I just want to ensure I get up to speed on this information.

Thanks again for all your help.
fireup

ASKER

And one other thing, on the topic of drives to use: I was going to fill the MSA70 with HP 146GB 10K SAS drives, not 15K, as they don't make that drive size in 15K.

Would you recommend going with 72GB 15Ks and more MSA70 shelves, or just staying with the 146GB SAS solution? The only reason I ask is the seek times that you talked about above. The 146GB drives seem to be 1ms slower than the 15K drive.
The 1/10th of a disk per concurrent user was a bit of an educated guess, going off the concurrent user counts HP list at http://h18006.www1.hp.com/storage/disk_storage/storage_servers/index.html. Admittedly that's for file servers, but since you're storing images I think those are valid figures.

I think the 10K 147GB will be fine. If it was a pure SQL database then I'd go for 15K 72GB, but large files (and BLOBs) are partially sequential access, so the head seek isn't as important as with random data. I don't know when the 147GB 15K SFF will be out, but I would expect them to be available soon.

For backup I'd get an LTO3 library, but I would expect the images are static once uploaded, so you would only need to back them up once or twice rather than every day. Differential backups would be best for this since they only back up the difference between the current data and the last full backup. Unfortunately differential doesn't work very well with databases; it's really for file backup.
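As an illustration of what a differential file backup selects, here is a minimal Python sketch that lists files changed since the last full backup. It assumes modification time is a good enough change indicator (real backup software on Windows typically relies on the archive bit or its own catalogue), and the function name is hypothetical.

```python
import os
from pathlib import Path

# Assumption: file modification time stands in for "changed since the
# last full backup". Real backup tools usually track this differently.

def differential_candidates(root, last_full_backup_ts: float) -> list:
    """Return files under root modified after the last full backup."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.stat().st_mtime > last_full_backup_ts:
                changed.append(path)
    return sorted(changed)
```

Because the images are static once uploaded, such a list stays short between full backups: only newly uploaded files appear in it. A busy database file, by contrast, changes constantly and would be copied in full every time, which is why this approach suits files rather than databases.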
fireup

ASKER

andyalder, your help has been greatly appreciated and I will be giving you full points for your recommendations. I do however have one other question that I could use your expertise on.

Going forward with the two-server setup (one DL380 G5 running the web application and the second running the SQL backend, with the MSA70 hooked to that server), I'm putting together an upgrade path scenario.

What would your recommendations be if the following were true:

1. application is slowing down due to application server not powerful enough to handle user count, mainly the processor speeds
2. SQL backend is slowed down due to volume of transactions, could be a combination of processor speeds and write times

Any considerations that could be made would be greatly appreciated. On the storage side of things I believe I am covered with the MSA70 and adding more of those units. I'm just concerned about the power of the servers and processors and how I would go about adding more power to my setup if needed.

Would the cluster setup be the right approach once we exhaust the DL380s?
I would use network load balancing on the front end if you need two servers, but that won't work on the back end unless both are read only. Active/active clustering can't be done on the same database, only on two or more DBs. I would be tempted to get the front end set up to commit any write data to both backend DBs, since I suspect it is a very read-intensive rather than write-intensive app. That's really down to the programmers though; they might decide to put the images on a separate server from the rest of the SQL data to scale it up.
fireup

ASKER

andyalder, I've been trying to find further information on how to calculate the number of disks per concurrent user, as you mentioned above (1/10). I'm struggling to come up with anything credible. I've been looking through the HP website but haven't had the best of luck.

As my last question, can you recommend where I can find information on how to make this call about the number of disks per user? I really want/need to sharpen my skill set in this area.
There isn't any hard and fast calculation; it's just a rough rule of thumb.