• Status: Solved

SAN performance problem - RIS copies within this drive do not match

Hello,
I'm running an HP DL360 G4 w/ Windows 2003 Server R2 x64.  It is connected to an HP MSA1000 SAN with redundant controllers and 2/8 switches.  It has a 6-drive RAID 5 w/ hot spare.  Recently, there has been an intermittent performance issue where access to the file shares on this server hangs for 2 minutes or so.  I ran the HP ADU last night, and the only error I found was that MSA controller 2 is reporting:
SLOT 2 (ID 65536) MSA1000 Array Controller ERROR REPORT:

   SCSI Port 1 Drive ID 0 RIS copies within this drive do not match
   SCSI Port 1 Drive ID 1 RIS copies within this drive do not match
   SCSI Port 1 Drive ID 2 RIS copies within this drive do not match
   SCSI Port 1 Drive ID 3 RIS copies within this drive do not match
   SCSI Port 1 Drive ID 4 RIS copies within this drive do not match
   SCSI Port 1 Drive ID 5 RIS copies within this drive do not match

I can post the full diagnostic report if necessary, but it is quite long.  There are no other red lights or problem indicators that I can see.  The ACU reports everything as fine.  Is this RIS error related to the performance problem?  Thank you in advance.
--David
Asked by: capitaljpn
 
andyalder Commented:
A RIS error will not affect performance, but I wouldn't like to reboot the storage with mismatched RIS. RIS is the RAID Information Sector; it tells the controller how the disks are laid out, so the controller normally only reads it when you reboot or add a disk. You can post the log here or send it to HP support; they have a software tool that scans it and looks for errors.

Were these disks originally in a server and then moved? I'm wondering how it works at all with mis-matched RIS.
 
capitaljpn (Author) Commented:
I sent the log to HP, and they said the mis-matched RIS error is nothing to worry about and can be safely ignored.  They suggested upgrading the SmartArray controller to the latest firmware, but I don't think that will help as the SmartArray is not connected to the MSA1000.  Unfortunately, I'm at a loss.  The problem still happens intermittently; however, I can't find any errors or indication of a problem.  I disabled virus scanning and auditing in hopes of clearing it up, but it still occurs.
 
andyalder Commented:
How do you know the MSA1000 is the problem rather than the LAN or something else? One easy way is to monitor disk queue length against disk bytes per second: if you see the queue going up but bytes per second not changing, then the SAN has stalled.
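
The queue-versus-throughput check can be sketched in a few lines of Python. This is only an illustration, not a tool mentioned in this thread: it assumes you have already collected samples at a regular interval (for example from the Performance Monitor PhysicalDisk counters "Avg. Disk Queue Length" and "Disk Bytes/sec"), and the function name `find_stalls` and both thresholds are made up for the example.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (disk queue length, disk bytes per second)


def find_stalls(samples: List[Sample],
                queue_rise: float = 2.0,
                flat_tolerance: float = 0.10) -> List[int]:
    """Return indices of samples that look like a storage stall.

    A sample is flagged when the queue length grew by at least
    `queue_rise` since the previous sample while throughput grew by
    no more than `flat_tolerance` (as a fraction) or dropped.
    Both thresholds are assumptions; tune them for your own load.
    """
    stalls = []
    for i in range(1, len(samples)):
        prev_queue, prev_bytes = samples[i - 1]
        queue, bytes_per_sec = samples[i]
        queue_growing = (queue - prev_queue) >= queue_rise
        throughput_flat = bytes_per_sec <= prev_bytes * (1 + flat_tolerance)
        if queue_growing and throughput_flat:
            stalls.append(i)
    return stalls


if __name__ == "__main__":
    # Hypothetical readings, not taken from the server in this thread.
    readings = [(1, 50e6), (2, 52e6), (8, 51e6), (20, 10e6), (3, 60e6)]
    print(find_stalls(readings))
```

If the queue rises and throughput rises with it, the disks are simply busy; the flagged samples are the ones where requests pile up without any extra data moving.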

There are problems with the latest storport drivers, so you need not only the latest firmware but also the latest drivers for the Smart Array. Although it's not connected to the SAN, it may have an effect. Apply the latest ProLiant Support Pack to get all the latest fixes.
 
capitaljpn (Author) Commented:
I narrowed down the problem this past weekend.  I applied the latest firmware and PSP, but the problem continued.  I manually chose the MSA1000 controller, but there was no change.  Finally, I manually chose an HBA path, and that made a huge difference.  HBA1 is fast, but HBA2 is slow.  So I'm thinking it's either the card itself or the 2/8 SAN switch.  I will try plugging both HBAs into the same 2/8 switch tonight and monitor performance on both.  If both are fast, then it's got to be the 2nd 2/8 switch.
 
capitaljpn (Author) Commented:
Yeah, it looks like one of the HP 2/8 switches is causing the performance problem.  It's weird, though, because there are no errors reported.
 
capitaljpn (Author) Commented:
Unfortunately, the performance problem continues.  HP is coming this weekend to swap out parts in the SAN one by one.  I'll close this question...
 
andyalder Commented:
Just thought of something: I had a switch that kept renegotiating speed, so I had to fix it at 2Gb rather than leave it on auto. You could see this from the speed lights changing on it, though.
 
andyalder Commented:
Delete by all means but I would like to know the outcome from HP's visit next weekend.
 
capitaljpn (Author) Commented:
Sure.  They're coming the day after tomorrow, so I can post their findings.  My guess is that it's the backplane of slot 1.  When I use the slot 2 path, performance is fine; however, when I use slot 1, performance is terrible.  I've tested the HBAs, fiber, and both 2/8 switches.  That leaves the backplane and controller.  I hard-set the preferred controller, and they both checked out (as long as I was using the path through slot 2).  The problem seems to follow the slot, so I'm thinking it's the backplane.  It's gonna be a long Saturday if that's the case.  The backplane doesn't look like an easy part to replace.
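
The "problem follows the slot" reasoning is a process of elimination, and it can be written down as a small Python sketch. Everything here is illustrative: the function name `suspects`, the component labels, and the test results are hypothetical and are not measurements from this SAN. The idea is that a component is a suspect only if it was in every slow path and in no fast path.

```python
from typing import Iterable, List, Set, Tuple

Observation = Tuple[Iterable[str], bool]  # (components in the path, was it slow?)


def suspects(observations: List[Observation]) -> Set[str]:
    """Return components present in every slow path and absent from every fast path."""
    slow_paths = [set(parts) for parts, slow in observations if slow]
    fast_paths = [set(parts) for parts, slow in observations if not slow]
    if not slow_paths:
        return set()
    in_every_slow_path = set.intersection(*slow_paths)
    # Anything that also appeared in a fast path is cleared.
    cleared = set().union(*fast_paths)
    return in_every_slow_path - cleared


if __name__ == "__main__":
    # Hypothetical test matrix: each entry is one manually selected path.
    tests = [
        ({"hba1", "switchA", "slot1"}, True),
        ({"hba2", "switchA", "slot1"}, True),
        ({"hba1", "switchB", "slot2"}, False),
        ({"hba2", "switchA", "slot2"}, False),
    ]
    print(suspects(tests))
```

A sketch like this only narrows the fault down to whatever labels you test with; it cannot tell the backplane apart from a part that sits in the same slot unless those are swapped and tested separately.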
 
capitaljpn (Author) Commented:
HP engineers came and fixed the problem.  The MSA controller in slot 1 had a bad cache memory card, so they just replaced the entire controller.  Now performance is fine through slot 1.  This proved to be difficult to troubleshoot because the logs didn't reveal this issue.  They had to come, and I showed them the performance problem.

Thank you for your help.
Question has a verified solution.