Solved

EMC + Emulex HBA + SQL Packet size

Posted on 2006-06-27
11
2,330 Views
Last Modified: 2013-11-15
Hi,
   I have win2003 srvr with an Emulex Lp1050 going to a EMC Clariion. The Sql packet size is 4096 bytes with the max of 32xxx bytes possible. I want to increase it in SQL. Do I have too and how can I increase the packet size for the HBA and/or EMC so that the increase in SQL can be accomodated? I know how to do in SQl but not on the HBA and EMC.

Thanks.
0
Comment
Question by:sudipmis
  • 7
  • 4
11 Comments
 
LVL 30

Accepted Solution

by:
Duncan Meyers earned 250 total points
ID: 16997542
Leave the HBA settings alone! They're optimised for CLARiiONs, and fiddling with the settings can break things pretty comprehensively. The other thing to note is that the SQL block size may be 4096 bytes, but SCSI commands (and hence, fibre channel) don't really care about what's happening at the OS and application level, so changing the FC card settings is pointless. The other thing to note is that a FC payload is 2112 bytes. End of story.

Before you do anything, read this: http://www.emc.com/techlib/abstract.jsp?id=1444&c=US&l=en then go into Navisphere, right click on the top level and select "Properties", then click on the tick box to turn on Statistics Logging. Keep an eye on the following properties: % dirty pages under the cache page, read and write bandwidth for the LUNs that you're interested in.

If you want to have a look at some performance related settings, you can have a look a write cache page size - try setting it to 4K - this aligns the cache page size to you SQL write size. It make for more efficient usage of write cache at the cost of extra processor overhead to manage the write cache. Default page size is 8K. Beware that this may have a negative impact on the performance of other attached hosts.

You can also check the alignment offset. For a Windows or Intel based Linux host, you set the offset to 63 when you create LUN in Navisphere. You can also use dispar.exe (W2K, W2K3) or diskpart.exe (W2K3 SP1) to set the offset when you create the partition in Windows. http://www.microsoft.com/technet/prodtechnol/exchange/guides/StoragePerformance/0e24eb22-fbd5-4536-9cb4-2bd8e98806e7.mspx?mfr=true has more information if your using W2K3 SP1. Google for articles if you're using W2K.

In my experience, most CLARiiON arrays are under utilized, and disc performance is generally not an issue.
0
 

Author Comment

by:sudipmis
ID: 16997884
meyersd,

  The reason I was inquring about the payload size was to see if the increase would resolve the sql error i recieve during heavy usage of our DB

SQL Error:
SQL Server has encountered 1 occurrence(s) of IO requests taking longer than 15 seconds to complete on file [x:\MSSQL\Data\Data.MDF] in database [Clinical] (12).  The OS file handle is 0x00000408.  The offset of the latest long IO is: 0x0000056dab0000

The drive it is trying to write too is an EMC lun.
Any suggestions?
-Sudip
0
 
LVL 30

Expert Comment

by:Duncan Meyers
ID: 16997937
http://support.microsoft.com/?kbid=897284
http://www.microsoft.com/technet/prodtechnol/sql/2000/downloads/default.mspx
http://sqljunkies.com/WebLog/joesack/archive/2005/06.aspx

all discuss the issue you're seeing. You can get a pretty good ideas of how the CLARiiON is handling I/O by looking at the % dirty pages under the cache tab as mentioned above. If the % dirty pages is hitting 100% then you possibly have an array misconfiguration issue. What model CLARiiON is it?

You can also check I/O bandwidth to individual discs as well as to LUNs by looking at the properties pages and checking the statistics tab. Note that you won't get statistics unless you turn statistics gathering on as detailed above.

If the array is running well, then your next step is to investigate the configuration of the SQL database - there are MS tools available to help with this.
 
0
 

Author Comment

by:sudipmis
ID: 16999997
Thanks. I will check out the links.

EMC suggests using 4k page size for DB app's just as you did but my page size is 8kb. I have a Clariion cx700. Stat logging was disabled so I do not have a dirty page % yet. I will keep you posted.

Thanks.
0
 
LVL 30

Expert Comment

by:Duncan Meyers
ID: 17048442
Hi sudipmis,

How's it going?
0
IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

 

Author Comment

by:sudipmis
ID: 17048617
Hi Meyersd,
   
    I changed the page size from 4kb and 8kb but still kept getting an error. I have narrowed it down to a inhouse process that runs from 1am til 3am. Adding a 2nd hba seems like the solution.
SP A has 17% of dirty pages. All toher stat's look pretty normal.

-Sudip

0
 
LVL 30

Expert Comment

by:Duncan Meyers
ID: 17057839
From what you've posted, the CX700 is extremely unlikely to be the culprit here unless you were experiencing forced write cache flushes, and I have to say that I'd be very surprised if a second HBA resolves the issue, as it is unlikely that your server has the I/O bandwidth and processor power to flood the 2Gb FC bus.

One thing you may want to check is that the LUN isn't being trespassed from one SP to the other. Check the SP event logs for messages like "LUN shut down for trespass" during the periods when you're seeing the error. If the LUN is being trespassed without good reason, then you may have a connectivity issue (a damaged FC cable for example) or failover has been incorrectly configured on the CX700.

Finally, the following document describes performance tuning for SQL:
http://msdn2.microsoft.com/en-us/library/ms187830.aspx
0
 
LVL 30

Expert Comment

by:Duncan Meyers
ID: 17057866
Another thing to check is that your Emulex cards have been configured correctly. You must set the driver for 'CLARiiON' and PowerPath when you install it. This document has all the details: http://www.emulex.com/ts/docoem/emc/pdfs/win.pdf pp 66-
0
 
LVL 30

Expert Comment

by:Duncan Meyers
ID: 17063268
Hi Sudip,

Thanks for the points! I was just wondering what fixed the issue?
0
 

Author Comment

by:sudipmis
ID: 17063524
Hi Meyersd,
   
     Although, we did not come to a true solution, we were able narrow down to the exact process which caused this error. We run a nightly extraction from our DB. During the process, it creates a temporary view from one of our larger tables (almost 2 million rows) among some other smaller ones. The extensive amount of reads and writes was triggering the events. We stopped the process for a couple of nights and all was quiet. So, we decided to separate that process on to a separate server. Thanks for you excellent help.
-Sudip
0
 
LVL 30

Expert Comment

by:Duncan Meyers
ID: 17064626
My pleasure!
0

Featured Post

6 Surprising Benefits of Threat Intelligence

All sorts of threat intelligence is available on the web. Intelligence you can learn from, and use to anticipate and prepare for future attacks.

Join & Write a Comment

Storage devices are generally used to save the data or sometime transfer the data from one computer system to another system. However, sometimes user accidentally erased their important data from the Storage devices. Users have to know how data reco…
A Bare Metal Image backup allows for the restore of an entire system to a similar or dissimilar hardware. They are highly useful for migrations and disaster recovery. Bare Metal Image backups support Full and Incremental backups. Differential backup…
To efficiently enable the rotation of USB drives for backups, storage pools need to be created. This way no matter which USB drive is installed, the backups will successfully write without any administrative intervention. Multiple USB devices need t…
This tutorial will walk an individual through setting the global and backup job media overwrite and protection periods in Backup Exec 2012. Log onto the Backup Exec Central Administration Server. Examine the services. If all or most of them are stop…

746 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

14 Experts available now in Live!

Get 1:1 Help Now