iSCSI or Fiber?

beaconlightboy
OK, we are going to be virtualizing our entire system here over the next year and we obviously need a SAN. I engaged Dell (EqualLogic) and IBM (DS4700) to provide me with solutions. Now I have worked the vendors over for about three months and have them practically giving me the stuff.

Both solutions meet the following needs:
- 16TB of RAW space
- Primary box at main datacenter all 15K 450GB drives
- Second box across campus all SATA drives
- Both solutions provide snapshots/volume copy and replication.

The IBM solution is Fibre Channel and is obviously going to be faster on the network side.  It has dual controllers with four 4Gb ports each, so that's a total of 32Gb of controller throughput.  Each server will have dual 4Gb NICs.  It also replicates to the remote box via direct fiber connections.

Now... the IBM solution doesn't have a nice interface.  It sucks.  It also doesn't provide any reporting or trending tools with the unit, it doesn't automatically tune itself (i.e. add spindles as needed), and it requires knowledge of Fibre Channel SAN networks that I don't have any experience with.  It does, however, allow me to add drives individually, in either the SATA or FC flavor, and the expansion bays are affordable, as opposed to buying a full EqualLogic box.

On the other side...
The EqualLogic solution also provides two controllers, like the IBM one, but since the controllers are failover-only (active/passive), you get 4Gb per box, so that's 8Gb of total controller throughput.  The servers all have four 1Gb ports for a total of 4Gb per server.  There is a significant difference in bandwidth between the two products.  Now, I keep hearing about MPIO and I'm no expert on it, but I've heard that even with MPIO the SAN can only talk to one NIC port at a time.  If that's true, then the bandwidth needle on the EqualLogic just went down.  But they say MPIO allows you to use all the NICs to get 4Gb per server if the box has four 1Gb NICs.

With all that being said, the IBM solution is only 10K more.  I won't give out numbers, but I can tell you that if you know how to work deals you can get fiber for almost the same cost as iSCSI, at least in the entry/mid-range market.  Anyway, this may seem like a no-brainer, but I just wanted to see what the experts think about my situation, especially with 10Gb NICs coming out (the EqualLogic is supposed to support that upgrade).  I fear that I am losing a lot of administrative functionality and ease of use that is important in a small shop.  I also can't afford to be wrong and end up with a product that will be 'IO'd out', if you know what I'm saying.

Any help would be appreciated.
Author

Commented:
Oh, and note: each vendor is telling me the other's technology is going away.  Dell says FC is going away and IBM says iSCSI is going away.  I doubt either of them is going anywhere anytime soon.  Oh, and just in case it matters, we will be virtualizing our SQL and Exchange servers.  Dell says not a problem on iSCSI, but IBM says no way - fiber is best for SQL and Exchange.

Commented:
They're both right. Converged networking is now here, and 10Gb Ethernet replaces them both with Fibre Channel over Ethernet and ordinary IP. It's still a little expensive, but it is the future. It's also referred to as Data Centre Ethernet. Having said all that, DCE leverages the Fibre Channel protocol to overcome a lot of Ethernet's limitations, and the underlying storage protocol remains SCSI whether you use FC, iSCSI or FCoE.

>both solutions meet the following needs
>- 16TB of RAW space
>- Primary box at main datacenter all 15K 450GB drives
>- Second box across campus all SATA drives
>- Both solutions provide snapshots/volume copy and replication.

Some things to note: relying on raw space for sizing is extremely dangerous - you need to know how many IOPS your environment is generating. For example, your raw space requirement could potentially be fulfilled with 8 x 2TB SATA drives, but this will give you a maximum performance of 640-1,000 IOPS, while 16TB of 450GB drives will give you around 7,000 IOPS. A VMware environment (or any virtualisation environment, for that matter) generates highly random I/O patterns, so you need to size the storage array with that in mind. Also bear in mind that IOPS has an inverse relationship with bandwidth. When IOPS are high (that is, I/O is highly random), bandwidth will be low (it's common to see VMware environments generating 3,000 IOPS and 20 MB/sec - no, that's not a typo). Conversely, high bandwidth indicates a sequential I/O load, and therefore IOPS will be low. Sequential I/O loads include media serving and backup to disk.
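To put rough numbers on that inverse relationship, here's a quick sketch - the 8 KB and 256 KB average I/O sizes are assumptions for illustration only, not figures from this thread:

```python
# Rough sketch of the IOPS vs. bandwidth relationship described above.
# The average I/O sizes (8 KB random, 256 KB sequential) are assumptions
# for illustration only.

def bandwidth_mb_per_sec(iops: float, avg_io_size_kb: float) -> float:
    """Bandwidth is simply IOPS multiplied by the average I/O size."""
    return iops * avg_io_size_kb / 1024  # KB/s -> MB/s

# Random, small-block VMware-style workload: lots of IOPS, little bandwidth.
print(round(bandwidth_mb_per_sec(3000, 8)))    # ~23 MB/s

# Sequential, large-block workload (e.g. backup to disk): few IOPS, high bandwidth.
print(round(bandwidth_mb_per_sec(400, 256)))   # 100 MB/s
```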

>The IBM solution is Fibre Channel and is obviously going to be faster on the network side.  It has dual controllers with four 4Gb ports each, so that's a total of 32Gb of controller throughput.  Each server will have dual 4Gb NICs.  It also replicates to the remote box via direct fiber connections.

The throughput of the front-end ports is almost completely irrelevant. What matters (again... :-) ) is IOPS.

>- Primary box at main datacenter all 15K 450GB drives
>- Second box across campus all SATA drives
Be very careful performing synchronous replication from FC to SATA drives as you can easily overwhelm the available SATA performance at DR and affect production hosts. Ugly.

Until FCoE/DCE becomes ubiquitous and heaps cheaper, I recommend going with the FC option. The big advantage is low latency - important for any database application. Fibre Channel switching infrastructure is not expensive, especially when compared with implementing iSCSI properly.
To get a quick look at how much I/O your environment is generating, take a look at Perfmon on your Windows servers. Open Perfmon, go to PhysicalDisk, and select Current Disk Queue Length, % Disk Time and Disk Transfers/sec. You can then calculate Disk Service Time = % Disk Time / Disk Transfers/sec (there's a quick sketch of this after the list below).

Disk Transfers/sec will give you the IOPS the host is generating - this is the important metric.
Disk Queue length shows if the host is waiting on disk access, and gives you a good idea of how hard the system is hitting disk.
Disk Service Time is the disk response time, which should be under 20 ms.
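
As a rough sketch of how those counters turn into a response time - the conversion to milliseconds here is my own reading of the formula above, so treat it as an assumption:

```python
# A minimal sketch of the Perfmon calculation above, assuming the counters
# are sampled over the same interval. The millisecond conversion is my own
# interpretation of the formula, so treat it as an assumption.

def disk_service_time_ms(pct_disk_time: float, transfers_per_sec: float) -> float:
    """Average time the disk spends on each I/O.

    % Disk Time / 100 is the fraction of each second the disk was busy;
    dividing that by the number of transfers per second gives seconds per
    transfer, and * 1000 converts it to milliseconds.
    """
    busy_seconds_per_second = pct_disk_time / 100.0
    return busy_seconds_per_second / transfers_per_sec * 1000.0

# Example: a disk that is 80% busy while servicing 200 transfers/sec is
# spending about 4 ms on each I/O - comfortably under the 20 ms target.
print(disk_service_time_ms(80, 200))   # 4.0
```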

Add the IOPS of your hosts together at peak load and you have the total host load. You can then calculate how many disks you need:

Assume 50% read, 50% write. For the purposes of this example, assume you've measured 4,000 IOPS peak:
50% of 4,000 = 2,000 read IOPS
50% of 4,000 = 2,000 write IOPS

Calculate the RAID 5 write penalty (4 disk operations for each host write I/O):
2,000 x 4 = 8,000 write IOPS at the disks
so total workload is 2000 read IOPS + 8000 write IOPS = 10,000 IOPS

Each 15K FC disk can handle 200 IOPS, each 10K FC disk can handle 140 IOPS and each SATA drive can handle 80 IOPS, so the number of disks you'll need are:

15K FC: 10,000 IOPS / 200  = 50 disks
10K FC: 10,000 IOPS / 140 = 72 disks
SATA: 10,000 / 80 = 125 disks

So you can see where the problem with sizing on space requirements lies. If you use 1TB SATA drives, you'll need 125 of them for the workload - or 125TB raw! Even the 50 x 450GB disks will give you 22.5TB. If you size on space, you can see that you won't have anywhere near enough disk drives to handle the workload. Imagine trying to place the workload of 125 disk drives on only 16!
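
If you want to plug your own numbers in, here's a minimal sketch of the same arithmetic, using the 50/50 split, RAID 5 write penalty and per-disk IOPS figures from the example above:

```python
import math

# Sketch of the sizing arithmetic above: take the measured host IOPS, apply
# the RAID 5 write penalty to the writes, and divide by per-disk IOPS.
# The 50/50 split and per-disk figures are the same ones used above.

RAID5_WRITE_PENALTY = 4                       # 4 disk operations per host write
DISK_IOPS = {"15K FC": 200, "10K FC": 140, "SATA": 80}

def disks_required(host_iops: float, read_ratio: float = 0.5) -> dict:
    reads = host_iops * read_ratio
    writes = host_iops * (1 - read_ratio)
    backend_iops = reads + writes * RAID5_WRITE_PENALTY
    return {disk: math.ceil(backend_iops / iops) for disk, iops in DISK_IOPS.items()}

print(disks_required(4000))
# {'15K FC': 50, '10K FC': 72, 'SATA': 125}
```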

Commented:
Didn't we forget that SSD rules the IOPS world?

One single Intel X25-E SSD can sustain the IOPS load of 13x 15k rpm HDDs!
So you only need 4x Intel X25-E SSDs!
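
Just to show the arithmetic behind that (a rough sketch reusing the 200 IOPS per 15k drive figure from earlier in this thread and the 13x ratio above):

```python
import math

# Rough sketch of the claim above, reusing the 200 IOPS per 15k drive figure
# quoted earlier in the thread and the claimed 13x ratio for the X25-E.
HDD_15K_IOPS = 200
SSD_IOPS = 13 * HDD_15K_IOPS       # ~2,600 IOPS per X25-E, per the claimed ratio

workload_iops = 10_000             # the back-end workload from the example above
print(math.ceil(workload_iops / SSD_IOPS))   # 4 SSDs
```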

Did you look at 2U-based white boxes (with 4-hour onsite service support) running OpenFiler, like these?
  • A 2U rack allows up to 24x 2.5" (like the SuperMicro SC216) or 12x 3.5" hot-swap drives
  • All SATA configs below use enterprise-class drives with a 1 per 10^15 UBE/BER
  • All SAS configs below use enterprise-class drives with a 1 per 10^16 UBE/BER
  • "Max Capacity (12TB RAID 10 or 16TB RAID 60)" storage server using 12x SATA 2TB
  • "Capacity (6TB RAID 10 / 1,900 IOPS)" storage server using 24x 2.5" SATA 7.2k 500GB - $5k
  • "Capacity (3.6TB RAID 10 / 2,200 IOPS)" storage server using 12x SAS 15k 600GB - $8k
  • "Capacity (3.6TB RAID 10 / 2,900 IOPS)" storage server using 24x 2.5" SAS 10k 300GB - $10k
  • "Mixed (4TB RAID 10 + 1.7TB RAID 5 / 5,000 IOPS)" storage server using 16x 2.5" SATA 7.2k 500GB + 8x OCZ Vertex 250GB - $9k
  • "Mixed (3TB RAID 10 + 2.5TB RAID 5 / 7,000 IOPS)" storage server using 12x 2.5" SATA 7.2k 500GB + 12x OCZ Vertex 250GB - $11k
  • "IOPS (5TB RAID 60 / 12,000 IOPS)" storage server using 24x OCZ Vertex 250GB - $17k
  • "IOPS (3TB RAID 10 / 15,000 IOPS)" storage server using 24x OCZ Vertex 250GB - $17k
  • "IOPS (3.2TB RAID 60 / 35,000 IOPS)" storage server using 24x Intel X25-M 160GB - $15k
  • "IOPS (1.9TB RAID 10 / 50,000 IOPS)" storage server using 24x Intel X25-M 160GB - $15k
  • "IOPS (1.3TB RAID 60 / 40,000 IOPS)" storage server using 24x Intel X25-E 64GB - $19k
  • "Max IOPS (0.7TB RAID 10 / 55,000 IOPS)" storage server using 24x Intel X25-E 64GB - $19k
For the fun of it, just compare these with the prices offered by your nice brands!

Commented:
No, BigScmuh, I didn't forget.

>One single Intel X25-E SSD can sustain the IOPS load of 13x 15k rpm HDDs!
>So you only need 4x Intel X25-E SSDs!

Yes - absolutely correct. However, there is a world of difference between personal storage flash drives and enterprise flash drives. Flash is the technology that will overtake SCSI and FC, but it is still relatively expensive, and an appropriately configured storage array with conventional disks will provide the performance required, with plenty of space. I am quite convinced that we'll see a massive decrease in deployment of Tier 1 disk (15K and 10K FC and SAS) within the next 18 months to two years, to be replaced with flash drives. Smart arrays will have a layer of Tier 0 flash drives and Tier 2 SATA drives, with smarts in the array to move blocks in and out of flash as host performance requires. EMC already does this in their new high-end arrays (the Symmetrix V-Max) and will release the same technology in their CLARiiON arrays soon.

Your whitebox solution is nifty, but you end up with a box that you have to support yourself - Intel, OCZ and the good folk who have developed OpenFiler in their own time won't get out of bed for you at 2:00AM to resolve an issue with lost data.
Rather, you have to weigh the business risk against the low price. It's one of virtualisation's complicating factors - once you've got 20, 30, 40 or more servers relying on a single piece of physical hardware and it fails, the cost of lost data, lost time and recovery can easily outweigh the hardware savings.

Commented:
Regarding 24/7 support, OpenFiler has an offer at under $3k per year per node, and even white boxes are covered by a 4-hour onsite service contract (that is a very common service).

Regarding investment cost, SSDs are cheaper in the IOPS world because you need far fewer SSDs than HDDs... and when you need more capacity, go to the SATA world.

Regarding production cost, one SSD draws about 3W while one HDD draws between 20W and 30W; just evaluate the annual power-bill reduction and you'll have some bucks to invest in more service.
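
As a rough sketch of that saving - the 24-drive array size and the $0.10/kWh electricity rate are assumptions for illustration only:

```python
# Rough sketch of the power-bill argument above. The 24-drive array and the
# $0.10/kWh electricity price are assumptions for illustration only.

DRIVES = 24
HOURS_PER_YEAR = 24 * 365
RATE_PER_KWH = 0.10

def annual_power_cost(watts_per_drive: float) -> float:
    kwh_per_year = DRIVES * watts_per_drive * HOURS_PER_YEAR / 1000
    return kwh_per_year * RATE_PER_KWH

hdd_cost = annual_power_cost(25)   # ~20-30 W per spinning drive
ssd_cost = annual_power_cost(3)    # ~3 W per SSD
print(round(hdd_cost - ssd_cost))  # roughly $460/year saved, before cooling
```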

Regarding the business risk, now that you've seen you can have a RAID 10 + hot spare + serviced white box at 1/3 of the price (at worst), you can buy some spare servers too...

The last one is reliability: SSD reliability statistics are not old enough to be really confident in... but they look great (no moving parts).

Author

Commented:
Thanks for the feedback, guys.  I did measure my IOPS for the systems; I just listed everything in space because that's how they sell the systems.  Compared to the systems you guys work on, mine is just a tiny thing.  Our Exchange average IOPS is 20, and we only have 400 users.  I can only afford so much, so I spec'd out the best price/spindle quantity I could for each vendor.

Could you explain this in more detail: Disk Service Time = % Disk Time / Disk Transfers/sec?  I'm not getting that formula.  How do you determine the response time in ms from this?  It doesn't make sense to me.

Commented:
I don't have time to involve myself in the purely technical/numbers side of this deal, but here's what I CAN tell you.  Real-world.


At my previous company, I bought two IBM DS4300 Turbos.  Every feature activated: 64 storage partitions, flash, clone, snap, everything.  Dual controllers, each controller with two 2Gb/s LC connections.  Eight drive trays, all FC drives, a combination of 146GB and 300GB.  I forget the exact number, but the raw storage was in the upper-20TB range on each.  The SANs connected to a pair of Cisco MDS9506 SAN directors, and VSANs were configured.  I back-ended three Exchange boxes and an HP-UX box with a 2TB production database, and also ran an ESX Enterprise cluster that supported nearly 60 VMs.


New job: iSCSI from LeftHand (now owned by HP).  1Gb Ethernet.  SATA drives.  CRAP.  GARBAGE.  I want to set it on fire.  It will be replaced next year by FC.


Yes, yes, what you say about FC is partially true - the interface doesn't have as many "bells and whistles" as the iSCSI interface does.  Why?  iSCSI is a software-based solution, and it's usually found in entry-level shops, where you have people with entry-level skillsets.  FC is in mid-size and up shops, with people that have a little more skill and experience and bigger budgets (in terms of both equipment and staff).  FC requires a lot more know-how.  You need to understand zoning, plan your oversubscription, etc.  Is it going away?  No.  Is iSCSI going away?  No.  They both have their market segment.  DCE, yes, there's been lots of noise about it, but I'm not so sure.  Reports?  Trending?  Again, that's your software-based iSCSI crap doing that.  You don't oversubscribe an IBM array.  There are no reports.  You have this much storage, that's what you get, use it wisely.  You don't need to know how fast you're filling things up and expanding volumes, because it doesn't happen.  Build it correctly the first time, and it'll perform like a CHAMP.

And if you're THAT concerned about speed, FC is already up to 8Gb/s.  Why are you buying a 4Gb/s SAN?

No matter how you slice it, at the end of the day, IOPS go like this (slowest to fastest):

SATA
SAS
FC
SSD


Cost goes up as you go.  If you can get FC for only $10k more than iSCSI, personally, I think it's a COMPLETE no-brainer.  Just be prepared to bang your head on the wall while you learn a new technology.  It's not just plug-and-play, as some vendors may have you believe.  The interface might be uglier, but there are WAY more things you can do in there, right down to controlling the block size on a per-physical-partition basis.  As for automagically expanding LUNs?  Not a fan.  The whole storage over-subscription thing is just a bad idea IMHO.  Storage should not be a "black box", especially if you're centralizing where all of that storage is.  I want to know where my data is, be able to configure different RAID levels for each partition, and carve up the LUNs how I see fit.  If I need a boot LUN, I just RAID 1 a pair of 146s and go to town.  If I need SPEED, I can RAID 10 16 drives and get fast.  If I need big, I RAID 5 8x 300GB drives and go.  Exchange log partition?  4k blocks.  SQL database partition?  64k blocks.  Most of your iSCSI stuff doesn't allow for that.  You just tell it how big you want it, and it "handles" the rest.
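
To make that carve-it-yourself approach concrete, here's a hypothetical LUN plan along those lines - the LUN names, drive counts and block sizes are purely illustrative assumptions, not a recommendation for your array:

```python
# A hypothetical LUN plan along the lines described above. The LUN names,
# drive counts and block/segment sizes are illustrative assumptions only.
lun_plan = [
    {"lun": "esx_boot",      "raid": "RAID 1",  "drives": 2,  "drive_type": "146GB FC",  "block_kb": 4},
    {"lun": "sql_data",      "raid": "RAID 10", "drives": 16, "drive_type": "450GB 15K", "block_kb": 64},
    {"lun": "exchange_logs", "raid": "RAID 10", "drives": 4,  "drive_type": "450GB 15K", "block_kb": 4},
    {"lun": "file_archive",  "raid": "RAID 5",  "drives": 8,  "drive_type": "300GB FC",  "block_kb": 64},
]

for lun in lun_plan:
    print(f"{lun['lun']:>14}: {lun['raid']:<7} on {lun['drives']}x {lun['drive_type']}, "
          f"{lun['block_kb']}k blocks")
```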

Check that IBM solution and make sure it includes SAN switches, though.  They are pricey, and the licensing is weird.

One last thing - FC doesn't use "NICs".  It uses HBAs (Host Bus Adapters), and they are MUCH more expensive than iSCSI TOE adapters.  FC cabling is more $$ too, but I'm sure you knew that also.  :-)


HTH,
exx

Commented:
Great postings, guys. Can you just clarify for me the difference between FC disks and SAS disks? I thought FC referred to the transport mechanism from host to SAN, and that the disks themselves were either SAS, SATA or SSD. Rgds, Simon

Commented:
At its core, a disk is a disk is a disk (SSD excluded).  Same basic principle: spinning platters, read/write heads.  SAS/SATA/SCSI/FC are all simply different protocols for the physical drive to communicate with the storage array.  SSD is an architectural departure, however, since it is solid-state (just flash memory, no spinning platters, no read/write heads).  However, they make SSD drives with many different protocol interfaces.

Different protocols allow for different speeds and support different features (like hot-swap, for instance).  In general, the old adage still holds true: you get what you pay for.  FC drives are designed to be enterprise-class drives, meaning higher MTBF, etc.  SATA drives, OTOH, are (generally) designed for use in end-user-level stuff, like Acer workstations.  LOL

Commented:
> SAS/SATA/SCSI/FC are all simply different protocols
That's partially correct. The underlying protocol for SAS, SCSI and FC is SCSI; if a disk has the same spin speed, you'll get the same performance. SATA/ATA is a different kettle of fish. They have a lower spindle speed at 7,200 rpm (WD Raptors are an exception here) and don't have the same on-board smarts as SCSI. For example, a SCSI/FC/SAS enterprise-class drive has two ASICs on the controller board: one handles I/O, the other handles head tracking. A SATA drive has a single ASIC that does both. SCSI has Tagged Command Queueing and Tagged Command Re-ordering, which allow the drive to get clever and re-order commands so that they're handled in the most efficient way possible. SATA-II has Native Command Queuing, a subset of TCQ; original SATA and ATA have no command queueing at all. If you're interested, Seagate has an excellent white paper, 'More Than an Interface - SCSI vs. ATA' by Dave Anderson, Jim Dykes and Erik Riedel of Seagate Research. You can find it here: http://pages.cs.wisc.edu/~remzi/Classes/838/Fall2001/Papers/scsi-ata.pdf - it explains why enterprise drives are more expensive, and has some great insights into disk technology in general.

I can't see the SCSI protocol going anywhere anytime soon - servers have to have some standard method of communicating with storage and SCSI does the job pretty well.

Commented:
WD has started to put two CPUs in their SATA drives... but one may still turn a blind eye to that.

Author

Commented:
Does anyone know of some affordable tools that can be used to monitor a SAN?  Something similar to Profiler, but that doesn't require me to cut off my left arm.

Commented:
dlethe is an expert here at EE - he's involved (I believe) with the development of a SAN management tool: http://www.santools.com. It looks pretty groovy. Other options include Symantec's CommandCentral Storage (although that may be one that requires your right arm as well...). EMC has ControlCenter, IBM has its own, as does NetApp; Brocade has a tool, and so do BMC and CA. NetIQ also has some funky tools. I suspect that all of those will require bits and/or pieces of your anatomy...
Disclaimer: I've set up and used both CommandCentral Storage and EMC ControlCenter - they can both do some pretty powerful stuff, but in most instances, after a honeymoon period, they've fallen into disuse.
