Solved

Intermittent network connection

Posted on 2009-05-16
Last Modified: 2012-06-27
I have the following network setup:

4 hosts (Debian, Ubuntu, 2x FreeBSD) on a Gbit Ethernet switch, uplinked to ->
a 24-port Fast Ethernet switch with a variety of heterogeneous hosts in a local LAN.

All of the hosts are in the same class C network and use the same gateway and DNS server.

I recently added an ESXi server with 2 Gbit Ethernet cards to the network. One card is configured as a dedicated management interface and connected to the Fast Ethernet switch with an IP in the local network. The other card, used by the VM networks, is connected to the Gbit Ethernet switch. (I have also tried it the other way around.)

The problem: the management network is intermittently unreachable from all but ONE machine, a FreeBSD machine on the same switch. Really weird. I can log into any other host on the same switch and sometimes have a good connection, sometimes get a "no route to host" when I try to ping the management interface. But the connection from the one FreeBSD box on the same switch is rock stable.

I have, of course, tried replacing cables and every possible switch <-> host combination.

Any ideas on what is going on or how to troubleshoot this?

Thanks!
Question by:alpha-lemming
5 Comments
 
LVL 10

Accepted Solution

by:
lanboyo earned 750 total points
ID: 24402896
Is the ESXi server's gigabit interface trunked? Make sure the management VLAN is removed from the gigabit link on the switch side.

Is it just the ESXi interface that becomes unreachable, or a whole VLAN used for management of things that include the ESXi?


Off the cuff, I notice that the default ARP cache timeout for FreeBSD is 20 minutes, while it has a maximum of 10 minutes in Windows. A possibility is that something occurs that prevents the ESXi box from responding to an ARP who-has request, or that prevents the rest of the network from hearing the responses. Or, for some reason the ESXi has decided its management interface is better on the other interface and the IP needs to change MAC addresses, and somehow the more robust code on the BSD device is able to notice this and adapt. Perhaps it sees gratuitous ARPs better.

Anyway...

When the problem is not occurring, go to a Windows box and do an

arp -a

Find and note the MAC address that corresponds to the IP address of the ESXi. This is an HP printer on my home network, for instance.

  192.168.1.7           00-17-08-87-44-84     dynamic

Its MAC address is 00-17-08-87-44-84. Check on the BSD device with the command "arp -an"; the n is to not do DNS reverse lookups, which usually speeds things up. The response looks a little different (the exact layout varies by OS; the sample below is in the Linux style):

? (192.168.1.7) at 00:17:08:87:44:84 [ether] on eth0

But although the : and - separators are different, it is the same MAC.

First, make sure the MACs are the same. Second, do the same thing when the problem is occurring, from a workstation that has the problem and from the working BSD box.

Do they both have ARP entries? Do they match? Do they match the previous address?
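If you end up comparing a lot of entries by eye, a small script can do the separator juggling for you. This is only a rough sketch in Python: the helper names normalize_mac and same_mac are made up for illustration, and it assumes the addresses are written as six hex octets separated by either : or -, as in the samples above.

```python
import re

# One or two hex digits; some arp implementations drop leading zeros.
_OCTET = re.compile(r"[0-9A-Fa-f]{1,2}")


def normalize_mac(mac):
    """Return the MAC as lowercase, zero-padded, colon-separated octets.

    Accepts both the Windows style (00-17-08-87-44-84) and the
    Unix style (00:17:08:87:44:84 or 0:17:8:87:44:84).
    """
    parts = re.split(r"[:-]", mac.strip())
    if len(parts) != 6 or not all(_OCTET.fullmatch(p) for p in parts):
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(f"{int(p, 16):02x}" for p in parts)


def same_mac(a, b):
    """True if two MAC strings name the same address, whatever the separators."""
    return normalize_mac(a) == normalize_mac(b)


if __name__ == "__main__":
    # The Windows and Unix renderings of the printer's address from above.
    print(same_mac("00-17-08-87-44-84", "00:17:08:87:44:84"))
```

Paste in the address you noted while things were working and the one you see during an outage; if same_mac says they differ, something else is answering for that IP.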

Also, the "no route to host" error usually means that an ARP request for the target got no response: either the local router could not resolve it and sent that error back, or, for a target on the same subnet, the sending host itself could not. Where is the local router? Is the BSD box or any of the other boxes dual-homed? Is the management VLAN in the same class C? How do the boxes try to connect? So many questions.
 
LVL 10

Expert Comment

by:lanboyo
ID: 24423798
Any updates?
 

Author Comment

by:alpha-lemming
ID: 24436595
Sorry for the delay, had to go out of town.

No, the hosts that cannot connect do not have or get a MAC address for the management NIC.

arp <host> spits out the IP address, then "no entry" for the MAC...

The management interface is not in a VLAN, and failover/load balancing with the other NIC is turned off.

It's just weird that this one BSD box has a rock-solid connection while all the others are flaky.

 
LVL 2

Assisted Solution

by:ENCOSE
ENCOSE earned 750 total points
ID: 24490237
Sounds like a speed/duplex mismatch...
Try checking EVERY device port and switch port to make sure they are all on Auto/Auto.

A common misconception is that one side can be manually set with the other on Auto/Auto, which does not work properly: the auto side cannot negotiate with a hard-set port and typically falls back to half duplex.


Josh Kwok, MCSE, CCNP
ENCOSE
 

Author Comment

by:alpha-lemming
ID: 24533771
All the NICs are in autonegotiate mode.
I found the culprit, although I don't know the exact cause yet.
Shutting down one of the other hosts, which is running VMware Server, makes everything work right. Maybe I had duplicate MACs or something...
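For anyone wanting to check the duplicate-MAC guess, here is a rough Python sketch that scans saved "arp -an" output for a MAC that shows up under more than one IP. The function name macs_with_several_ips and the sample addresses (apart from the printer line quoted earlier in the thread) are made up for illustration, and the regular expression assumes the "? (ip) at mac ..." layout shown above. A hit is only a hint, not proof: one NIC can legitimately carry several IPs.

```python
import re
from collections import defaultdict

# Matches "(192.168.1.7) at 00:17:08:87:44:84"; lines without a
# resolved MAC (e.g. "<incomplete>") simply do not match.
ARP_LINE = re.compile(
    r"\((?P<ip>\d{1,3}(?:\.\d{1,3}){3})\)\s+at\s+"
    r"(?P<mac>[0-9A-Fa-f]{1,2}(?::[0-9A-Fa-f]{1,2}){5})"
)


def macs_with_several_ips(arp_output):
    """Map each MAC seen under more than one IP to the sorted list of those IPs."""
    ips_by_mac = defaultdict(set)
    for line in arp_output.splitlines():
        match = ARP_LINE.search(line)
        if match is None:
            continue
        # Zero-pad and lowercase so 0:C:29 and 00:0c:29 compare equal.
        mac = ":".join(f"{int(o, 16):02x}" for o in match.group("mac").split(":"))
        ips_by_mac[mac].add(match.group("ip"))
    return {mac: sorted(ips) for mac, ips in ips_by_mac.items() if len(ips) > 1}


if __name__ == "__main__":
    # Hypothetical table; only the first line comes from the thread.
    sample = """\
? (192.168.1.7) at 00:17:08:87:44:84 [ether] on eth0
? (192.168.1.20) at 00:0c:29:aa:bb:cc [ether] on eth0
? (192.168.1.21) at 00:0C:29:AA:BB:CC [ether] on eth0
? (192.168.1.30) at <incomplete> on eth0
"""
    for mac, ips in macs_with_several_ips(sample).items():
        print(mac, "is claimed by", ", ".join(ips))
```

Running it against output captured on the working BSD box and on one of the flaky hosts would show whether two hosts ever answer with the same MAC.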