SQL Server resource for FC connectivity

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

ASKER

There are 2 HBAs (4 Gb/s each), both single port, and 8 disks to each. But I do not think the system is able to handle 600 MB/seconds, at all.. Other than HBA, any other components to be concerned about?

Are you using an external 16 drive FC enclosure? or are you connecting to a SAN that manages the drives?

Did not realize until now that you have two questions that are related.
Each HBA gives you ~400MB/s. Depending on how the harddrives are divided could be what controls how close to the top limit you can get in terms of performance.

Writing/reading data that is located on a single physical disk versus a LUN that is made up of several disks makes a difference. A single disk can only answer//do one thing at a time and there is the time it uses to reposition the head to the next location.
With the multi-disk setup, you have ~ "number of disks" actions that can be performed at one time.

SOLUTION

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

ASKER

Arnold, thanks for your pointers.

It is a SAN that manages the RAID drives. Does that make the 400MB/s attainable or far off?

Even though there are n number of disks and spindles, can the HBA's become a bottleneck? In the case above, what benefit happens when the single port HBAs are changed to dual channel(port) HBAs

thanks

ASKER

jimpen, the 15+1 disks should have their own spindles/head/resources, right? Then why worry about dividing it 9+6? Couldn't all the writes/reads be done parallel?
(you are talking about SAN Setup, right?)

With a 4Gb FC, I think 400 MBs is the "limit". In a high transaction/very active it could be I/O bound HBA/SAN processor/disks. i
Dual path could mean that the work on the SAN is distributed between the two processors.
i.e. db LUN is on one and the transaction logs is on the other.
LUN for DB LUN for transaction logs

I'm talking SAN setup. The way that SQL writes data (even simple recovery) is to write it to the .ldf first. Then it writes to the .mdf then writes back to the .ldf to make a checkpoint.

So in a 15+1 the disks have to move to the L: stripe, then D: stripe then back to the L: stripe.

The disks have to move in semi-concert to the L: then the D: then the L :.

Even with a disk cache you are jumping back and forth.

Doing the 9+6 the L: drive moves to the 6, then the D: moves for the commit and the L: is ready for the checkpoint.

ASKER

arnold, with a multi-disk setup in SAN, if the throughput is low, where could you look next in a high transaction/active environment? Add more HBAs? (especially if for example in a 4Gb FC, you get only 50 Mb/s instead of close to 400)

thanks

ASKER

jimpen, did you come with the numbers 9 and 6 based on 60/40% ratio? what if it is 24 disks, for example? Could you please point me to any reading material on the logic you mention. It is fascinating.

thanks

The LUN allocation for the DB and the Transaction logs depend on the size of the database and amount of transactions as well as whether you have a maintenance plan that backs up the transaction log through which the size of the LDF file can be controlled/managed. It also depends on whether the backup of the database is stored on the transaction log LUN or you have a separate one.

Back to your HBA issue, You have to check which LUN is on this HBA.
50Mb/s means that the LUN that is accessible through this HBA is not as active as another.

What SAN are you connecting to? Which SAN initiator are you using? I.e. for emc SAN powerpath is running on the server. There are different settings on the adapter but at this point, you have to determine whether the throughput is an issue. How big is the DB and how much memory is on the system?
If your physical RAM is larger than the databases on the server, it is likely that the 50Mb/s is the writeout of the transaction logs while the databases are maintained in memory.

The SAN units I was working with were 15+1 in the array. SQL has always written that way. Log, then mdf then the checkpoint.

If you have large transactions, or many, you may want to have a dedicated drive (or mirror) for the pagefile.sys.

The important file is the mdf. The ldf can be "trashed". But you don't want to lose it as a common practice.

I can't point to documents offhand. I think it is in the Best Practices docs.

ASKER

Arnold
What if there are 3 or more LUNs (tempdb, index, big datafiles) instead of 2 connected to the HBA ports? Does the allocation of which LUNs to which ports happen dynamically or does it get setup by us?

right now data and log are on seperate LUNs and trans log & reg backups happen to tape.

In the 50 MB/sec example, if I take a backup of a 50 GB database, it is still 50 MB/Sec. Should it not be more since there is no caching of a db during a backup? In this case, what is the limitation of the HBA could be? What are the other settings you could think of?

What is SAN initiator? It is not the multipath tool, is it? This is a IBM San.

Mem is not caching much. It is 8 GB. The db in question is 120 GB.

Thanks

ASKER

jimpen,
L->D->L is true for any RAID, right? - be it 10 or 5 etc (Yours was 10?)

You divided 9+6. Did you do this based on transaction basis? But if you transactions were heavy bulk inserts or transactions that will bloat the log, will you have changed the equation (9/6)? or the other way, if there was very very little log writes, would you have increased it to someething like 12/3?

Thanks much

L->D->L is for any MS SQL.

>> You divided 9+6. Did you....... 12/3?

You have the right idea. There are three recovery models Full, Bulk-logged and Simple. In simple, the data goes to the log first, but as soon as the checkpoint/commit hits all that space is freed up. In Bulk Logged you can say for a bulk load it just writes direct to the table(s), just a quick "comment" for the log that it is happening. But your one, two, five row updates/inserts/deletes go through the log. (Note the Truncate Table is not logged other than a checkpoint.) You can have a mix and match of recovery models on one instance and that will affect it as well, as the number of databases.

Tuning of SQL is as much art as science.

The LUN access distribution depends on the software/target that you are using.
i.e. if you have your HBA's in failover mode, you would have to distribute the load manually i.e. when configuring you would have HBA1 -> LUN1 HBA2 ->LUN2 HBA1 -> LUN3 as the preferred and have the other as the failover. In the case of Loadbalancing you would have HBA1 and HBA2 connected to each LUN and the transactions will be sent through via either to the SAN.
I'm not familiar with IBM SANs nor whether they have a software component that gets installed on the servers to manage the HBA's and access to the LUNS.
EMC has powerpath: http://www.emc.com/products/detail/software/powerpath-multipathing.htm

in a backup to tape, your limiter is actually the tape and often you are backing up using volume shadow copy i.e. SQL writer grabs a snapshot and then pumps the data out such that the database does not get locked for the duration of the backup unlike the earlier versions of OS/SQL where your DB is locked until the backup is complete. The backup is also not the top priority. i.e. the system does not allocate all available resources to process the backup at the expense of everything else.

I worked with a unit that sat between the server and SAN. It allowed us to snapshot and replicate the data to our off site. We then backed up the mdf/ldf files (along with everything else).

I did local regular SQL bak/trn files just to have short term point-in-time recovery (3 day) quickly available.

I only had one DB go corrupt on replication, and that was when we were working out of hours.

ASKER

jimpen, I can see how much of an art it is, by the way you are explaining with balance. Thanks.

How did you determine that 60% of your transactions were data and 40% were log? what kind of monitoring did you do to get that analysis?

ASKER

arnold, thanks for the backup tape example. but for backup to disk, do you see more or less throughput, since there is no caching and it is all sequential reads? what is the limiter there.

appreciate your input very much.

>> How did you determine that 60% of your transactions were data and 40% were log?

It was more of a space issue. We knew our databases would grow permanently and the logs could be dumped at some point if we had issues. And every once in a while, say after a quarterly month-end, I would prune them back if they had a lot of "dead" space in them.

Here is a good article on how SQL writes -- It doesn't really get into the HBA Q and throughput there, but you can see the effect from the SQL view.
Writing Pages: http://msdn.microsoft.com/en-us/library/aa337560%28SQL.90%29.aspx

The throughput will be higher and the backup will complete in a shorter amount of time.
The limiter in this case will be the load and prioritization of the system.
The difference can be shown in a an example where an individual is in a warehouse with the following responsbilities:
1) deliver inventory to waiting customers at the counter.
2) move outdate/discontinued products to a separate area (backUP) for this the person either has to carry an item at a time (tape), or has a cart (DISK).
The person is alerted by a two way radio/public anouncement speaker (interrupt) has to drop what is being done unless it is delivering to a waiting customer.

carrying one item at a time will take a long time.
puching a cart will be fastter.

ASKER

jimpen, Thanks very much for helping me understand the connection between the write usage patterns & the Disk IO.