Solved

iSCSI Initiator does not work well with NetApp

Posted on 2011-10-10
Last Modified: 2013-01-13
I'm using two PCs, PC1 and PC2, for an MS SQL Server 2005 failover cluster. Both have the same hardware configuration and the following installed:
 - Windows Server 2008 SP2
 - iSCSI driver 6.0.6002.18005 (hotfix KB2522766 already applied)
 - NetApp 6080 for disk sharing
 - IIS 7 as the web server
The problem I'm facing is that the iSCSI initiator resets the target on its own and disconnects the NetApp disks.
Event ID 129
The description for Event ID 129 from source iScsiPrt cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
\Device\RaidPort1
the message resource is present but the message is not found in the string/message table
Event ID 39
Initiator sent a task management command to reset the target. The target name is given in the dump data.
Event ID 49
Target failed to respond in time to a Task Management request
Event ID 9
Target did not respond in time for a SCSI request. The CDB is given in the dump data.

Do you have any ideas for these errors?
Thanks for your help.
Question by:rvc-it
21 Comments
 
LVL 8

Expert Comment

by:barrykfl
ID: 36946451
Is there any firmware update for the NetApp disks? Sometimes that fixes iSCSI connection problems.
 
LVL 11

Expert Comment

by:Paul S
ID: 36946792
Did you contact NetApp support? A firmware upgrade, if available, is always good. If you run a constant ping against the iSCSI IP of the NetApp device, is it always available?
 

Author Comment

by:rvc-it
ID: 36946893
Thanks for the feedback and hints. However, we're using the latest version of the NetApp ONTAP OS (NetApp Release 8.0.2 7-Mode). Our iSCSI IP is always available, because this NAS also has other LUNs that work stably with Linux boxes. Anyway, I'll contact NetApp support to see whether they have any ideas about this problem. In the meantime, if you see other possible causes, please tell me. Thanks!

 
LVL 22

Expert Comment

by:robocat
ID: 36948646

The Netapp system log might provide a clue.

You can find this in the NetApp system manager->filername->configuration->system tools->syslog

Look for any iSCSI related errors.
 
LVL 1

Expert Comment

by:DougD
ID: 36953591
You've mentioned the iscsi driver version, but what iscsi initiator software version are you running?
Output of "iscsicli version"?

Also, have you set your registry values for disk timeout?
 

Author Comment

by:rvc-it
ID: 36953615
Hi DougD,

We are using iSCSI driver version 6.0.6002.18005. We also did not set the registry values for disk timeout. Can you guide me on how to do it?
 
LVL 1

Expert Comment

by:DougD
ID: 36953863
So you are using the OS built-in drivers for iSCSI.
If you have access to NetApp NOW, consider KB ID: 3011459, as it discusses increasing disk timeout values.
Otherwise, set the registry value (you may have to create it; default is 10):
[HKLM\SYSTEM\CurrentControlSet\Services\disk]

"TimeOutValue"=190

We use 190 to allow for cluster takeover/giveback; I don't think you've mentioned filer clustering.
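
As a rough sketch of the registry change described above, the snippet below generates the text of a .reg file that sets TimeOutValue as a REG_DWORD. The function name build_disk_timeout_reg is hypothetical, and the value 190 is simply the one suggested in this comment; check the NetApp KB before relying on it.

```python
def build_disk_timeout_reg(timeout_seconds: int = 190) -> str:
    """Return .reg file text that sets the disk TimeOutValue (a DWORD, in seconds)."""
    if not 0 < timeout_seconds <= 0xFFFFFFFF:
        raise ValueError("timeout must fit in a positive 32-bit DWORD")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        r"[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\disk]",
        # .reg files write DWORD values as 8 hexadecimal digits (190 -> 000000be)
        f'"TimeOutValue"=dword:{timeout_seconds:08x}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Print the text; save it as a .reg file and import it on each cluster node,
    # then reboot so the disk driver picks up the new value.
    print(build_disk_timeout_reg(190))
```

The script only produces text; it does not touch the registry itself, so it can be reviewed before anything is imported.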
 

Author Comment

by:rvc-it
ID: 36966164
I tried changing the registry value as you advised, but the problem still happens. Do you have any other ideas?
 
LVL 1

Expert Comment

by:DougD
ID: 36966426
And you have rebooted to pick up the new registry value?
 

Author Comment

by:rvc-it
ID: 36966434
Yes, I did, but it had no effect.
 
LVL 1

Expert Comment

by:DougD
ID: 36966464
What are your network settings between host and filer? Are you running jumbo frames?
You should check that your interface connection settings match the filer's and that any switch in between supports those settings.
If you have jumbo frames set at your host and filer, check that you can actually ping up to that frame size from end to end.
For example, if you are set to 9000 MTU, try:
ping -l <MTU_size> <IP_address>
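
One caveat on the ping test above: on Windows, -l sets the ICMP payload size, and without the Don't Fragment flag (-f) an oversized packet is simply fragmented, so the ping can succeed without proving anything about the path MTU. The sketch below works out the payload that fits a given MTU, assuming a 20-byte IPv4 header with no options and an 8-byte ICMP header. The helper names are hypothetical; the address in the example is the filer IP that appears later in this thread.

```python
from typing import List

IPV4_HEADER_BYTES = 20  # assumes an IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in a single unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload < 0:
        raise ValueError("MTU is too small to carry an ICMP echo request")
    return payload


def windows_ping_command(mtu: int, address: str, count: int = 4) -> List[str]:
    """Build a Windows ping command that sets Don't Fragment (-f) and the payload size (-l)."""
    return ["ping", "-f", "-l", str(icmp_payload_for_mtu(mtu)), "-n", str(count), address]


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(" ".join(windows_ping_command(mtu, "192.168.1.152")))
```

If the 9000-MTU command fails while the 1500-MTU one succeeds, some hop on the path is not passing jumbo frames.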
 
 

Author Comment

by:rvc-it
ID: 36966495
I set default jumbo frames (1500) for host and filer. I can ping from host to filer.
PS: I deployed the MS SQL Server 2005 cluster on two nodes, NODE1 and NODE2. NODE1 is active (it holds all LUNs) and NODE2 is passive. Sometimes NODE1 throws an error on the iSCSI port and transfers all LUNs to NODE2.
 
LVL 1

Expert Comment

by:DougD
ID: 36966583
Firstly, 1500 MTU is not Jumbo frames; by definition jumbo frames are >1500.
Secondly, I think you need to get some confidence in your network connectivity between host and filer.  So let's get back to basics.

BTW, what's your topology, is there a switch between host and filer?
Are you running 100Mb or 1Gb? (N.B. you can only have jumbo frames over GbE.)
How about troubleshooting and confirming that there are no network issues below iSCSI first? So start with:
- Physical layer (interface errors at each end): check Event Viewer for any network errors other than iSCSI. And if you have access to the filer, run "ifstat <if_name>" and look at the output.
- Ping tests (extended) end to end, up to the maximum MTU size, to see whether there are any drops and, if so, whether they occur at the same time as your iSCSI timeouts.

Like I said, get some confidence in the network between host and filer. You haven't mentioned whether this affects both Windows hosts or not.
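
To check whether ping drops line up with the iSCSI timeouts, something like the following could be used once both sets of timestamps have been collected (for example from a logged ping run and from the iScsiPrt events in Event Viewer). This is only a sketch: the function name, the 30-second window, and the sample timestamps are all made up for illustration.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def correlate_drops_with_timeouts(
    drops: List[datetime],
    timeouts: List[datetime],
    window: timedelta = timedelta(seconds=30),
) -> List[Tuple[datetime, datetime]]:
    """Return (drop, timeout) pairs whose timestamps lie within `window` of each other."""
    return [(d, t) for t in timeouts for d in drops if abs(t - d) <= window]


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the thread.
    drops = [datetime(2011, 10, 10, 12, 0, 5), datetime(2011, 10, 10, 14, 30, 0)]
    timeouts = [datetime(2011, 10, 10, 12, 0, 20), datetime(2011, 10, 10, 16, 0, 0)]
    for drop, timeout in correlate_drops_with_timeouts(drops, timeouts):
        print(f"ping drop at {drop} is close to iSCSI timeout at {timeout}")
```

An empty result over a period that includes several iSCSI resets would suggest looking above the network layer; matching pairs would point back at the network path.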


 
LVL 22

Expert Comment

by:robocat
ID: 36973401

Have you checked the netapp system log yet ?
 

Author Comment

by:rvc-it
ID: 36977280
Hi DougD, robocat,

Thank you for your support.

I'm sorry, I forgot to tell you our network topology:

NAS (iSCSI) ---copper 3G port channel--- Cisco 4510 -------fiber----- Cisco 4506 --copper1G-- servers

The NAS also serves LUNs to other Linux boxes, and it works fine for them.

We have premium support for our NetApp. I asked them to analyze the system logs, but they have not found anything unusual.
 
LVL 1

Expert Comment

by:DougD
ID: 36977315
Ok so that's the physical description, cool.
When you say that your Linux servers work fine, all that says to me is that the configuration between your Linux host and the filer is all good.  And, that the network configuration of the filer is all good.

So, on the filer interface for your windows host, run:
ifconfig <if_name>
ifstat <if_name>
And for the Windows host, collect:
Speed, duplex and frame size.

Then compare the results to ensure they match.
Then try running ping tests as I said earlier right up to the MTU size and observe the behaviour of your network.
 

Author Comment

by:rvc-it
ID: 36977465
Here is the output of the "ifconfig" and "ifstat" commands run on the filer:

nas-02> ifconfig vif-prod
vif-prod: flags=0x22f4c863<UP,BROADCAST,RUNNING,MULTICAST,TCPCKSUM> mtu 1500
        inet 192.168.1.152 netmask 0xffffff00 broadcast 192.168.1.255
        partner vif-prod (not in use)
        ether 02:a0:08:0f:61:fc (Enabled interface groups)

nas-02> ifstat vif-prod
-- interface  vif-prod  (32 days, 16 hours, 24 minutes, 14 seconds) --
RECEIVE
 Total frames:      177g | Frames/second:   12680  | Total bytes:       157t
 Bytes/second:    15997k | Multi/broadcast: 22745k
TRANSMIT
 Total frames:      188g | Frames/second:    8393  | Total bytes:       192t
 Bytes/second:     3469k | Multi/broadcast:  2380k

On the Windows host, Speed & Duplex is set to "Auto" and the frame size is the default.
I tried pinging from the Windows host to the filer with sizes 1500, 9000 and the maximum 65500. Everything is OK; the filer replies within 1-2 sec.
 

Author Comment

by:rvc-it
ID: 37022012
Everyone, do you have any solutions for this problem?
Thanks for your help.
 

Author Comment

by:rvc-it
ID: 37135884
After changing the LUN Protocol Type to Windows 2008 and upgrading Windows Server 2008 to R2, everything worked fine. However, I'm not sure it's stable; I'll report the final result to everybody in one week.
 

Accepted Solution

by:
rvc-it earned 0 total points
ID: 37186962
The problem still happens. Does anyone have any ideas? Is there a Microsoft hotfix that fixes it?
 

Author Closing Comment

by:rvc-it
ID: 38771649
I want to close this question because I have no further solutions, and our team now uses another way to provide the data connection between the NetApp and the ESX server.