Do I have a bad NIC?

I have the following in /var/log/syslog:

Apr 13 23:40:46 mail kernel: [294549.983670] e1000e 0000:00:19.0 eth1: Detected Hardware Unit Hang:
Apr 13 23:40:46 mail kernel: [294549.983670]   TDH                  <f6>
Apr 13 23:40:46 mail kernel: [294549.983670]   TDT                  <fc>
Apr 13 23:40:46 mail kernel: [294549.983670]   next_to_use          <fc>
Apr 13 23:40:46 mail kernel: [294549.983670]   next_to_clean        <f6>
Apr 13 23:40:46 mail kernel: [294549.983670] buffer_info[next_to_clean]:
Apr 13 23:40:46 mail kernel: [294549.983670]   time_stamp           <111829299>
Apr 13 23:40:46 mail kernel: [294549.983670]   next_to_watch        <f6>
Apr 13 23:40:46 mail kernel: [294549.983670]   jiffies              <11182b12c>
Apr 13 23:40:46 mail kernel: [294549.983670]   next_to_watch.status <0>
Apr 13 23:40:46 mail kernel: [294549.983670] MAC Status             <80083>
Apr 13 23:40:46 mail kernel: [294549.983670] PHY Status             <796d>
Apr 13 23:40:46 mail kernel: [294549.983670] PHY 1000BASE-T Status  <3800>
Apr 13 23:40:46 mail kernel: [294549.983670] PHY Extended Status    <3000>
Apr 13 23:40:46 mail kernel: [294549.983670] PCI Status             <10>
Apr 13 23:40:46 mail kernel: [294549.987388] e1000e 0000:00:19.0 eth1: Reset adapter unexpectedly

And this in /var/log/messages:

Apr 13 23:40:50 mail kernel: [294553.863254] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx

Does this all mean I have a bad NIC? Could it be a port on the upstream switches?

ASKER CERTIFIED SOLUTION

Member_2_406981

Mark

ASKER

My eth1 is also forwarded to eth0, like the other person reporting the same problem. I never noticed this problem until we started using this server routinely about 2 months ago (actually, I didn't really notice it until this week).

I'll try disabling tso after hours and see what happens.

Mark

ASKER

Trying `ethtool -K eth1 tso off`. The last "hiccup" was at 23:15 on 4/14, and I set tso off at 23:34. I think I'll try one feature at a time instead of the whole `ethtool -K eth1 gso off gro off tso off`. It makes me a bit nervous to shut all of this off at once. The referenced link didn't really say what these features actually do (the man page doesn't shed much light: what is "tx vlan acceleration", or "ntuple filtering"?) or why turning them off works. Possibly it was just guesswork on the part of the blogger. Another respondent reported luck with `ethtool -G eth0 rx 2048`, but again with little insight, and probably just guessing.
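For context on what the switches do: with TSO (TCP segmentation offload) on, the kernel hands the NIC one large TCP buffer and the NIC cuts it into MTU-sized segments; with it off, that work stays on the CPU. GSO is the software counterpart on the send side, and GRO merges incoming packets before they reach the stack. None of this explains why the e1000e transmit unit hangs, only what is being switched off. When testing one feature at a time, it helps to confirm the current state first. The sketch below is a hypothetical helper (`offload_state` is not part of ethtool) that pulls one feature's state out of `ethtool -k` output; it assumes the feature names ethtool prints, such as `tcp-segmentation-offload`.

```shell
#!/bin/sh
# Hypothetical helper: print the state ("on", "off", possibly followed by
# "[fixed]") of one offload feature, reading `ethtool -k` output on stdin.
# Exits non-zero if the feature name is not present in the input.
offload_state() {
    # $1 = feature name as printed by ethtool, e.g. tcp-segmentation-offload
    awk -F': ' -v feat="$1" '
        { name = $1; sub(/^[ \t]+/, "", name) }   # strip indentation of sub-features
        name == feat { print $2; found = 1; exit }
        END { if (!found) exit 1 }
    '
}

IFACE="${IFACE:-eth1}"   # interface name taken from the thread; override as needed

# Only query the hardware when ethtool and the interface actually exist.
if command -v ethtool >/dev/null 2>&1 && [ -d "/sys/class/net/$IFACE" ]; then
    for feat in tcp-segmentation-offload \
                generic-segmentation-offload \
                generic-receive-offload; do
        printf '%s: %s\n' "$feat" "$(ethtool -k "$IFACE" | offload_state "$feat")"
    done
fi
```

Running it before and after each `ethtool -K` change gives a record of which feature was toggled when a hang next appears. Note that settings made with `ethtool -K` do not survive a reboot unless they are reapplied from a startup script.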

We shall see ...

Mark

ASKER

Well, it's been more than 24 hours since I did the `ethtool -K eth1 tso off`, and I haven't had a single "eth1: Reset adapter unexpectedly" error in syslog, nor a single "eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx" in /var/log/messages. What is new, though, are these:

Apr 14 23:15:37 mail kernel: [379577.173736] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Apr 15 00:21:40 mail kernel: [383546.156569] device eth1 entered promiscuous mode
Apr 15 00:55:47 mail kernel: [385596.344128] device eth1 left promiscuous mode
Apr 15 13:21:55 mail kernel: [430436.382364] device eth1 entered promiscuous mode
Apr 15 13:26:27 mail kernel: [430709.031522] device eth1 left promiscuous mode
Apr 15 19:09:39 mail kernel: [451334.133223] device eth1 entered promiscuous mode
Apr 15 19:15:03 mail kernel: [451658.610595] device eth1 left promiscuous mode
Apr 15 19:15:10 mail kernel: [451665.564584] device eth1 entered promiscuous mode
Apr 15 19:17:08 mail kernel: [451783.851678] device eth1 left promiscuous mode
Apr 15 19:17:13 mail kernel: [451788.630887] device eth1 entered promiscuous mode
Apr 15 22:56:08 mail kernel: [464944.401261] device eth1 left promiscuous mode
Apr 16 00:53:38 mail kernel: [472006.574204] device eth1 entered promiscuous mode
Apr 16 00:54:22 mail kernel: [472049.842007] device eth1 left promiscuous mode

Which, as you can see compared to the 1st entry, started about an hour after I did the ethtool command.

I guess I'll live with this as it seems like the lesser of the evils ... maybe. Can you give me any insight into what this "promiscuous" mode is and whether it's a bad thing or benign?

Likewise, do you have any idea what turning tso off actually does? I hate to just monkey-type "solutions" without knowing why (but I will if I must!)

Member_2_406981

Promiscuous mode is a mode in which the network card receives all packets on the wire, not just those addressed to its own MAC address.

This can be good or bad, depending on the cause. It is typically triggered by network analysis or sniffer tools such as Wireshark or tcpdump.

But it could also be caused by trojans or rootkits that try to record network traffic.

It could also be a function the driver uses to achieve special network functionality.

So you should investigate further to be sure it's not a bad thing.
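As a starting point for that investigation, here is a minimal sketch that reports whether an interface is in promiscuous mode right now. It assumes a Linux system where `/sys/class/net/<iface>/flags` exposes the interface flags in hex and where `IFF_PROMISC` is bit `0x100`; `promisc_state` is a hypothetical helper name, and `eth1` is simply the interface from this thread.

```shell
#!/bin/sh
# Hypothetical helper: decode the IFF_PROMISC bit (0x100) from an
# interface flags value such as the one in /sys/class/net/<iface>/flags.
promisc_state() {
    # $1 = flags value in hex, e.g. 0x1103
    if [ $(( $1 & 0x100 )) -ne 0 ]; then
        echo promiscuous
    else
        echo normal
    fi
}

IFACE="${IFACE:-eth1}"   # interface name taken from the thread; override as needed

# Only read the flags file when the interface exists on this machine.
if [ -r "/sys/class/net/$IFACE/flags" ]; then
    echo "$IFACE: $(promisc_state "$(cat "/sys/class/net/$IFACE/flags")")"
fi
```

If this reports `promiscuous` at a time when nobody is knowingly running a capture, the next step is to find out which process holds a packet socket open. On most Linux systems `/proc/net/packet` lists open packet sockets and `ip -d link show eth1` prints a promiscuity counter, but treat both as things to verify on your own distribution.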

Mark

ASKER

OK, I'll post a separate question on this.