VMware 5: Effect of having unreachable iSCSI server locations

We are migrating from Dell ESX (VMware 4) to Cisco UCS ESXi (VMware 5) and have been running into some horrendous results when trying to do, say, a Storage vMotion on UCS, or even just powering up a server on UCS that was unregistered from the old vCenter and is now being brought into the new one. I wondered if an expert could help me understand this piece.

Each ESXi host in UCS has iSCSI initiator server locations in the Static Discovery tab. See attached. But it looks to me like the consultant put in both the IP addresses that should be reachable from the UCS/ESXi environment and the addresses that should only be accessible from the older ESX VMware 4 environment. Is it possible this is what is causing problems when trying to access that datastore? Under no circumstances should the UCS be able to reach any of the four 172.16.0.x addresses; it should only ever have access to the two 172.16.200.x addresses. Could it be that the more the UCS demands of iSCSI, the more failures there will be because of the wrong addresses in there? Or would the ESXi hosts automagically know that they could not reach the 172.16.0.x addresses and not try to access those IPs?

Also, there is configuration in the Dynamic Discovery tab, which actually has the correct addresses. Which takes precedence, Static Discovery or Dynamic? I think I might have found an issue, but perhaps it's just beginner-itis seeing a red herring.
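For reference, the same Static and Dynamic Discovery entries can be inspected, and stale ones removed, from the ESXi shell. A minimal sketch, assuming the ESXi 5.x esxcli namespaces; the adapter name vmhba33 and the portal address/IQN below are placeholders:

    # Find the software iSCSI adapter name
    esxcli iscsi adapter list

    # List Static Discovery targets (the entries in the Static Discovery tab)
    esxcli iscsi adapter discovery statictarget list

    # List Dynamic Discovery (Send Targets) entries
    esxcli iscsi adapter discovery sendtarget list

    # Remove a stale static entry (adapter, address, and IQN are placeholders)
    esxcli iscsi adapter discovery statictarget remove --adapter=vmhba33 --address=172.16.0.10:3260 --name=iqn.1992-08.com.netapp:sn.example

After removing an entry, a rescan (Storage Adapters > Rescan All, or esxcli storage core adapter rescan --all) picks up the change.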
Asked by amigan_99 (Network Engineer)

amigan_99 (Network Engineer, Author) commented:
Never figured this out. We're working around it by building all critical servers anew in UCS, on a separate volume on a different head. Eventually we will shut down all the Dell hosts and bring everything into UCS, and that will get rid of the contention. What a nightmare this migration was.
 
amigan_99 (Network Engineer, Author) commented:
Some more info: the Dell ESX VMware 4 environment is attached via GigE to a NetApp FAS3140, and the Cisco UCS VMware 5 environment is attached via 10GigE to the same NetApp FAS3140 pair, but on different interfaces with different addresses. Both environments access the same datastore/volume, with the expectation of being able to Storage vMotion from the old environment to the new. But the results have not been good at all.
 
Danny McDaniel (Clinical Systems Analyst) commented:
I've never tried what you're doing, so I can only speculate that the UCS is seeing the arrays as different storage or is confused about which path to use.  There are no attachments to your posts... did you forget them?

Can you get into tech support mode (i.e., the console) and do a vmkping to both addresses and to the vMotion IP of the other host(s)?

If you can't do a cold migration, then vMotion doesn't have much of a chance. And if you disable the 10GigE connections, what does that do to the environment?
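A sketch of those checks from tech support mode; the portal addresses below are taken from the question, and the vMotion address is a placeholder:

    # Test each iSCSI portal the host is configured to use
    vmkping 172.16.200.10
    vmkping 172.16.200.11

    # Test the vMotion IP of the other host (placeholder address)
    vmkping 10.1.1.21

    # On later ESXi 5.x builds, -I pins the outgoing vmkernel interface:
    # vmkping -I vmk1 172.16.200.10

If a portal that appears in the discovery tabs doesn't answer vmkping, the host will still periodically retry it, which is typically where long rescans and login timeouts come from.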
 
amigan_99 (Network Engineer, Author) commented:
Oops - must not have completed the file upload.  

If the 10Gig links were down, then I believe there would be no access to the NetApp at all; the 10Gig links are the paths to the 172.17.200.x addresses. I mis-typed in the original question: the old block via GigE is 172.17.0.x and the new block via 10GigE is 172.17.200.x.

Haven't tried the console mode yet.
UCS-Config3.jpg
 
amigan_99 (Network Engineer, Author) commented:
This is the best we could find.
 
amigan_99 (Network Engineer, Author) commented:
Follow-up: this is the REAL answer to this question. It was hard fought. I hope someone else finds it useful, because it was hell to live through.


http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007427
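For posterity, since KB links move: that article covers disabling the VAAI thin-provisioning space reclamation (UNMAP) primitive in ESXi 5.0. A sketch of the commands it describes; verify against the KB itself before applying:

    # Check the current setting (1 = UNMAP on delete enabled)
    esxcli system settings advanced list --option /VMFS3/EnableBlockDelete

    # Disable UNMAP issued when files are deleted from a VMFS volume
    esxcli system settings advanced set --int-value 0 --option /VMFS3/EnableBlockDelete

That would explain why deletes (including the cleanup phase at the end of a Storage vMotion) were the trigger.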
 
Danny McDaniel (Clinical Systems Analyst) commented:
Wow, your symptoms don't even fit the KB... Did you have to work through the problem with support?
 
amigan_99 (Network Engineer, Author) commented:
Actually, it was NetApp's folks who identified the issue, as we were pointing the bony finger of blame in their direction. Once we disabled the garbage collection, everything that was broken suddenly worked. I am still breathing a sigh of relief five days later. :-)
 
amigan_99 (Network Engineer, Author) commented:
Another piece of the puzzle came about when I found that it wasn't just vMotion that caused mayhem: simply deleting a VM from the VMFS-3 datastores caused the same symptoms. The vMotion, I think, worked until it got to the part where it needed to delete itself from the old system. Then badness.