Loss of connectivity to LINUX Server

rbtt
Hi,

I am running LINUX 4.4, but for some reason, whenever I perform a backup/restore on the server (or any heavy network activity), network connectivity to the server is lost.

The server is up and functional, but it can't be reached over the network, nor can it ping its own gateway.

Every time it happens, the following messages appear in /var/log/messages.

Jul 19 18:24:54 ttecdevnfs1 gdm(pam_unix)[5105]: session opened for user root by (uid=0)
Jul 19 18:24:56 ttecdevnfs1 gconfd (root-14583): starting (version 2.8.1), pid 14583 user 'root'
Jul 19 18:24:56 ttecdevnfs1 gconfd (root-14583): Resolved address "xml:readonly:/etc/gconf/gconf.xml.mandatory" to a read-only configuration source at position 0
Jul 19 18:24:56 ttecdevnfs1 gconfd (root-14583): Resolved address "xml:readwrite:/root/.gconf" to a writable configuration source at position 1
Jul 19 18:24:56 ttecdevnfs1 gconfd (root-14583): Resolved address "xml:readonly:/etc/gconf/gconf.xml.defaults" to a read-only configuration source at position 2
Jul 19 18:25:04 ttecdevnfs1 gconfd (root-14583): Resolved address "xml:readwrite:/root/.gconf" to a writable configuration source at position 0


Thanks

Top Expert 2009

Commented:
Have you checked whether your Ethernet card is working properly?
Did you try changing your Ethernet card?

Yesterday I was transferring only 500 MB of data from one PC to another via a pluggable hub. A few minutes after I started the transfer, the hub's link turned off, because it was unable to cope with the heavy activity.

Commented:
Nothing in the kernel log?

Author

Commented:
Just another note: this is an IBM HS20 blade server, in a blade chassis.

Where do I check the kernel log?

Commented:
/var/log/kern.log
Top Expert 2009

Commented:
The kernel log would be in /var/log/messages.
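As a rough sketch of how to pull the network-related lines out of that file: the helper name net_log_lines and the search pattern are illustrative assumptions, not something from this thread, and the pattern may need adjusting for the NIC driver in use. Running dmesg shows the kernel ring buffer directly and can be searched the same way.

```shell
#!/bin/sh
# Print log lines that mention a bond device, an ethN interface,
# a link up/down event, or a transmit watchdog timeout.
# Usage: net_log_lines [logfile]   (defaults to /var/log/messages)
# NOTE: the pattern below is a guess at typical kernel messages.
net_log_lines() {
    logfile="${1:-/var/log/messages}"
    grep -iE 'bond[0-9]|eth[0-9]|link (is )?(up|down)|NETDEV WATCHDOG' "$logfile"
}
```

For example, net_log_lines /var/log/messages | tail -50 would show the most recent matches around the time of an outage.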


When you said you can't ping the server:

Have you checked whether network activity on the server itself is OK?

I mean, can you ping outside from the server?

Author

Commented:
That's correct: I can't connect to the server remotely, and from the server I can't ping any outside device. The server is unable to ping its gateway.

Commented:
Can you show us the output of ifconfig, please.
Top Expert 2009

Commented:
Three things:

1. When this problem happened, was the Ethernet port in working condition? (Was the link light blinking?)
2. Run ifconfig eth0 and check whether you see any errors in the output (assuming eth0 is the Ethernet card).
3. Try restarting the network and see whether you can ping from the server (service network restart).

Author

Commented:
As requested.

[root@ttecdevnfs1 ~]# ifconfig
bond0     Link encap:Ethernet  HWaddr 00:11:25:4A:0B:E4  
          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:3575905 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10979777 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:617418085 (588.8 MiB)  TX bytes:15354462065 (14.2 GiB)

bond0.112 Link encap:Ethernet  HWaddr 00:11:25:4A:0B:E4  
          inet addr:10.33.12.80  Bcast:10.33.12.255  Mask:255.255.255.0
          inet6 addr: fe80::211:25ff:fe4a:be4/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:3402560 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10976406 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:528414472 (503.9 MiB)  TX bytes:15266389098 (14.2 GiB)

eth0      Link encap:Ethernet  HWaddr 00:11:25:4A:0B:E4  
          inet6 addr: fe80::211:25ff:fe4a:be4/64 Scope:Link
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:3487055 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10979775 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:609679089 (581.4 MiB)  TX bytes:15354461917 (14.2 GiB)
          Interrupt:209

eth1      Link encap:Ethernet  HWaddr 00:11:25:4A:0B:E4  
          inet6 addr: fe80::211:25ff:fe4a:be4/64 Scope:Link
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:88850 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:7738996 (7.3 MiB)  TX bytes:148 (148.0 b)
          Interrupt:217

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:18172 errors:0 dropped:0 overruns:0 frame:0
          TX packets:18172 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:9980513 (9.5 MiB)  TX bytes:9980513 (9.5 MiB)

Author

Commented:
Once the network service is restarted, the connection to the server is re-established.

This was done with a service network restart.

Commented:
You have 2 network cards, maybe you could try to use eth1 and see if the problem occurs again.
Top Expert 2009

Commented:
In the output you gave, which one is the faulty one? Also, did you capture this output while you were having the problem?

Please identify which Ethernet card is the faulty one, and also answer those three things I asked.

Author

Commented:
This is a bonded interface, bond0.112, as shown above.

Commented:
Hmmm ... same MAC address, maybe a card with 2 Ethernet ports.

Top Expert 2009

Commented:
OK, so it's not a physical Ethernet card issue; it's the bonded one, which I guess is called aliasing, isn't it?

What about eth0, is that working fine? I ask because the bond was created on top of eth0.


You said restarting the network fixed the issue.

Two questions here:

1. Is eth0 working fine?
2. Is only bond0.112 giving problems?

Will you be able to find that out?
Top Expert 2009

Commented:
The reason I am asking: bonding depends on the physical card, in your case eth0 (because data passes through eth0 to the bond).

So if there is any problem with eth0, the bond might have problems too.

Will you be able to reproduce the same problem using eth0 directly, without bonding?

It might shed some light on which one is causing the trouble.
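
One way to see which slave the bond is actually using, without tearing the bond down, is the bonding driver's status file under /proc/net/bonding/. The sketch below assumes the bond device is named bond0, as in the ifconfig output earlier in this thread; the helper name bond_summary is made up for illustration, and the exact fields in the file vary with the driver version and bonding mode.

```shell
#!/bin/sh
# Summarise a bonding status file: the currently active slave, plus
# each slave's MII status and link failure count.
# Usage: bond_summary [statusfile]   (defaults to /proc/net/bonding/bond0)
bond_summary() {
    statusfile="${1:-/proc/net/bonding/bond0}"
    awk -F': ' '
        /^Currently Active Slave/      { print "active=" $2 }
        /^Slave Interface/             { slave = $2 }
        # The bond-level "MII Status" line comes before any slave section; skip it.
        /^MII Status/ && slave != ""   { print slave " mii=" $2 }
        /^Link Failure Count/          { print slave " failures=" $2 }
    ' "$statusfile"
}
```

A non-zero failure count on eth0, or an active slave that changes around the time of the outage, would point at the link rather than at the bond configuration.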

Author

Commented:
ifconfig output when there is no connection to the server:
[root@ttecdevnfs1 ~]# ifconfig
bond0     Link encap:Ethernet  HWaddr 00:11:25:4A:0B:E4
          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:10220658 errors:0 dropped:0 overruns:0 frame:0
          TX packets:20383412 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1692313447 (1.5 GiB)  TX bytes:25180799745 (23.4 GiB)

bond0.112 Link encap:Ethernet  HWaddr 00:11:25:4A:0B:E4
          inet addr:10.33.12.80  Bcast:10.33.12.255  Mask:255.255.255.0
          inet6 addr: fe80::211:25ff:fe4a:be4/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:10031269 errors:0 dropped:0 overruns:0 frame:0
          TX packets:20379530 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1456101778 (1.3 GiB)  TX bytes:25017445950 (23.2 GiB)

eth0      Link encap:Ethernet  HWaddr 00:11:25:4A:0B:E4
          inet6 addr: fe80::211:25ff:fe4a:be4/64 Scope:Link
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:10124310 errors:0 dropped:0 overruns:0 frame:0
          TX packets:20383410 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1683933966 (1.5 GiB)  TX bytes:25180799597 (23.4 GiB)
          Interrupt:209

eth1      Link encap:Ethernet  HWaddr 00:11:25:4A:0B:E4
          inet6 addr: fe80::211:25ff:fe4a:be4/64 Scope:Link
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:96348 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:8379481 (7.9 MiB)  TX bytes:148 (148.0 b)
          Interrupt:217

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:18419 errors:0 dropped:0 overruns:0 frame:0
          TX packets:18419 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:10012955 (9.5 MiB)  TX bytes:10012955 (9.5 MiB)
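
In both ifconfig snapshots in this thread, every errors and dropped counter reads 0, and eth1 shows only 2 transmitted packets. To make that kind of comparison easier, the counters can be reduced to one line per interface and direction. This is only a sketch that assumes the older ifconfig output format shown above; the function name iface_counters is hypothetical.

```shell
#!/bin/sh
# Read old-style ifconfig output on stdin and print one line per
# interface and direction with the packet, error and drop counters.
iface_counters() {
    awk '
        /^[^ \t]/ { iface = $1 }    # a line starting in column 1 names the interface
        /RX packets:/ || /TX packets:/ {
            split($2, p, ":"); split($3, e, ":"); split($4, d, ":")
            print iface, $1, "packets=" p[2], "errors=" e[2], "dropped=" d[2]
        }
    '
}
```

Saving ifconfig | iface_counters to one file while the server is healthy and to another during the outage, then running diff on the two, shows which counters moved.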
Top Expert 2009

Commented:
Did this problem start recently?

Are you sure there are no errors in /var/log/messages regarding eth0 or bonding?


OK, I will try to get some more info with a few commands.

What is the result of these?

 ethtool eth0
 ethtool bond0.112

(Run them while the server has lost Ethernet connectivity.)

mii-tool eth0
mii-tool bond0.112

As I said, it must be a hardware problem; it stops under high traffic load.



ethtool -S eth0

netstat -i


Let's see whether we can find some unusual output.

Run all of these while the server is not working.
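
Since the server is unreachable over the network during the outage, the commands above have to be run from the local console. A small wrapper that saves all of their output to one file may help; the sketch below is an assumption about how to arrange that, not a tested procedure for this server. The function name collect_net_diag, the report filename and the default output directory are all made up for illustration.

```shell
#!/bin/sh
# Run each diagnostic and save all output to one timestamped file.
# Usage: collect_net_diag [outdir] [interface...]
# Defaults: outdir=/tmp, interfaces "eth0 bond0.112" (the ones named in this thread).
collect_net_diag() {
    outdir="${1:-/tmp}"
    [ "$#" -gt 0 ] && shift
    [ "$#" -gt 0 ] || set -- eth0 bond0.112
    report="$outdir/netdiag-$(date +%Y%m%d-%H%M%S).txt"
    {
        for ifc in "$@"; do
            for cmd in "ethtool $ifc" "ethtool -S $ifc" "mii-tool $ifc"; do
                echo "### $cmd"
                # $cmd is deliberately unquoted so it splits into command + arguments
                $cmd 2>&1 || echo "(failed or not installed)"
            done
        done
        echo "### netstat -i"
        netstat -i 2>&1 || echo "(failed or not installed)"
        if [ -r /proc/net/bonding/bond0 ]; then
            echo "### /proc/net/bonding/bond0"
            cat /proc/net/bonding/bond0
        fi
    } > "$report"
    echo "$report"    # print the path of the report that was written
}
```

Calling collect_net_diag /root once during an outage and once after service network restart gives two reports that can be compared side by side.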

Author

Commented:
I will try to get the information once the server goes down again...
