Roger Alcindor

asked on

TCP Retransmission

I am investigating a problem that arose after replacing a PC with a new Dell PC. The operating system is Windows XP SP3.
The application software was running on the old system without any problems. The application communicates with several Agilent Digital voltmeters using the SCPI protocol over Ethernet (TCP port 5025).  After the hardware was changed, the same application started to encounter frequent timeouts (as timed in the application).
I installed Wireshark on the PC and captured the Ethernet port traffic with a capture filter of
host 10.41.3.123 (and the other IP addresses of the voltmeters). The PC IP address is 10.41.8.98 and the digital voltmeter IP address is 10.41.3.123.
I notice that there are several TCP Retransmissions from the PC according to Wireshark and would like to gain second opinions as to the probable cause.
Since Wireshark shows no intervening packets between the retries, I conclude that the retries are being generated by the PC's network card driver (or even the network card itself?). Wireshark indicates a header checksum error on all the packets sent by the PC, but I assume that this is because the checksum is generated by the network card firmware or the Windows driver and is not available to Wireshark. The network card driver cannot be configured to stop discarding packets with checksum errors.
My conclusion is that the issue is due to the PC network card since the problem was not evident when the original PC hardware was being used.
I attach a Wireshark capture log and ask for comments to either confirm or correct my conclusions.
Range3.docx
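For anyone wanting to reproduce the timeouts outside the application, here is a minimal sketch that times a single SCPI query over TCP port 5025. It assumes the instrument answers the standard `*IDN?` query with a newline-terminated line. The function name `timed_scpi_query` and the 5-second timeout are illustrative assumptions, not taken from the application.

```python
import socket
import time


def timed_scpi_query(host, command="*IDN?", port=5025, timeout=5.0):
    """Send one SCPI command and return (reply, elapsed_seconds).

    Raises a socket timeout error if the connect or the reply takes
    longer than `timeout` seconds (an assumed value, adjust as needed).
    """
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # SCPI commands over a raw socket are terminated with a newline.
        sock.sendall(command.encode("ascii") + b"\n")
        reply = b""
        # Read until the instrument terminates its answer with a newline.
        while not reply.endswith(b"\n"):
            chunk = sock.recv(4096)
            if not chunk:  # connection closed by the peer
                break
            reply += chunk
    return reply.decode("ascii").strip(), time.monotonic() - start


# Example call (voltmeter address taken from the question):
# reply, elapsed = timed_scpi_query("10.41.3.123")
```

Running this in a loop while Wireshark is capturing should show whether slow replies line up with the retransmissions.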
ASKER CERTIFIED SOLUTION
giltjr
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
Roger Alcindor

ASKER

What determines the period that elapses before a retransmission occurs when no ACK is received? I notice that the time between retries increases with successive retries.
I will be performing the checks suggested by Drashiel over the next 2 days, starting tomorrow afternoon. I'm not sure if I can capture at any other point on the network, as there may be no mirror port on the Ethernet switch. It is doubtful that I could get a port configured as one, since this is a factory production environment and reconfiguring the switch would pose a potential risk.
Thanks for your suggestions, I will get back to you soon.
This is part of the TCP stack. By design, TCP increases the amount of time between successive retransmissions.

The assumption is that congestion along the path is causing the loss, so it waits a little longer each time.
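As a rough illustration of that back-off, here is a minimal sketch assuming the common behaviour of doubling the retransmission timeout (RTO) after each unacknowledged attempt, up to a cap. The starting value, the number of retries, and the 60-second cap are assumptions for the example; the real starting RTO is derived by the stack from the measured round-trip time.

```python
def retransmission_schedule(initial_rto, retries, max_rto=60.0):
    """Return the wait (in seconds) before each successive retransmission.

    Each wait is double the previous one, capped at `max_rto`.
    All three parameters are illustrative assumptions.
    """
    waits = []
    rto = initial_rto
    for _ in range(retries):
        waits.append(rto)
        rto = min(rto * 2, max_rto)  # exponential back-off with a ceiling
    return waits


# Waits before retransmissions 1..5, starting from an assumed 1-second RTO.
print(retransmission_schedule(1.0, 5))
```

Comparing the gaps between retransmissions in the capture against a doubling pattern like this is a quick way to confirm they come from the TCP stack's normal timer rather than from the NIC.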

Is the switch a managed switch?  

Can you see what the switch thinks the speed and duplex is for that port?

If the switch port is fixed at a specific speed and duplex, then the PC should be set to the same fixed settings.
You can also check the Advanced tab on your network card's Configuration and see if there's a checksum offload option there. If it's enabled, try disabling it and see if that stops the checksum errors on the outbound packets in Wireshark.
I'm pretty sure the option to use/ignore checksums in Wireshark's TCP protocol Preferences only affects incoming packets (reassembly will not be attempted if you tell it to use the checksums and there is a bad one).
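For reference, here is a minimal sketch of the standard Internet checksum (one's-complement sum of 16-bit words) that is used for the IPv4 header checksum. When checksum offload is enabled, outbound packets are captured before the NIC fills the field in, so a check like this fails on the captured bytes even though the packet on the wire is fine. The function names are illustrative.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def header_checksum_ok(header: bytes) -> bool:
    """A header that includes a correct checksum field sums to zero."""
    return internet_checksum(header) == 0
```

To compute the value for a header, zero its checksum field (bytes 10 and 11 of an IPv4 header) and pass the header to `internet_checksum`. To verify a captured header, pass it unchanged to `header_checksum_ok`.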
In my absence, someone disabled the on-board NIC and fitted a USB Ethernet adapter, which seems to have fixed the issue: there have been no retransmissions or timeouts in the past 36 hours. I didn't get the opportunity to do any of the checks that you suggested, so we still don't know what the root cause was. The on-board NIC was an Intel 82579LM, which has been discussed on various web sites; it seems there have been issues with Windows drivers having to be installed in the correct order.
Since the machine is in a production environment, it is unlikely that we will be able to revert to the on-board NIC and try to establish the cause. There are no free PCI slots on the PC motherboard.
Thanks for your help,

Roger