Linux Socket Connection Problem

We are running Red Hat Enterprise Linux 5.7, fully updated and patched. We have a C application that makes a socket connection to a display device, writes data, and then closes the connection. We do this every few seconds.

The problem we are seeing occurs about 36 hours after rebooting the server: we find that we are unable to make socket connections to the display device, and we get "Error Code 110. Connection Timed Out". Once we start seeing this problem it lasts for roughly 5 hours, then goes away and everything is OK again.

While we are seeing the issue, the application that writes to our display devices is not the only application affected. Any other application that needs to make a socket connection will also be unable to connect. However, an application that already had a socket open can communicate without any problems. It is only new connections that are failing.

We have a case open with Red Hat and have not been able to come up with any solutions, or even identify the source of the problem. Anyone have any ideas?
Duncan Roe (Software Developer) commented:
What I can see:
Client connects to server; client sends 52 bytes; server sends 0 bytes; client closes connection (all in a fraction of a second).
56406 is the same.
56407 initiates a connection, but instead of the client sending an ACK in response to the server's SYN, it sends a reject (RST).
56408 is the same as 56407.
All the same up to at least 56418

The client seems to send reject frames when the server starts sending extra options in its responding SYN. When the server sends options [mss 170], the connection goes ahead. When it sends (initially) options [mss 170,TS val 554278049 ecr 607256576,nop,nop], the response is rejected. The TS val increases slowly from there and the ecr value increases more quickly.

I don't know what these options are, but I suspect they cause the trouble. When it starts working again, do the options stop?
The timed-out message seems to be a bit of a furphy (a red herring): all transactions happen in a fraction of a second.
noci (Software Engineer) commented:
Can you do the following after, say, one hour of running after a fresh restart, after several hours of running, and during the stress period?

What does netstat -ant show during each period? (Both nodes, please.)
Is it possible to make a tcpdump of such a session?

noci (Software Engineer) commented:
Oh, also please include a schematic diagram of the connection between those nodes (including routers & switches).
No need to supply configs, and addresses may be obfuscated, but they need to be distinguishable (same for the tcpdump).

Duncan Roe (Software Developer) commented:
Error 110 is simply "Connection timed out" (from /usr/include/asm-generic/errno.h). That would indicate to me a problem at the other end, except you say that other programs attempting to make connections (to what?) are also affected. A hardware problem also seems unlikely, since you say other programs can use existing sockets.
Something you could perhaps try during failure mode is a connection to localhost. That would use a different network interface (lo instead of eth0 or whatever); if you get problems there too, then I would suspect the OS.
As noci already suggested, tcpdump during failure mode could be very helpful.
noci (Software Engineer) commented:
@duncan, it might also be a problem with routers/firewalls in between, because they more or less need to track traffic. So it may well not be an endpoint problem as such.

All depending on the network design.
Are you sure that the socket is being properly closed? If the average interval were close to 2 seconds, in about 36 hours the application would make roughly 65,000 connections, and port exhaustion would occur if those ports were left dangling. Since some ports are in use for other activities, the problem could occur even with an interval somewhat longer than 2 seconds.
Duncan Roe (Software Developer) commented:
Use lsof -i (as root) to see whether there are thousands of open sockets, as suggested by noci.
NPPCDNV (Author) commented:
First, thanks for all of the comments. I think there are several things here to work with.
First I will start with the tcpdump.
noci: How long should I let the dump run? Also, you mention "both nodes"; the other node is an LED display that shows information to a production operator, so there is no way to capture tcpdump or netstat output from that node. We could use a packet capture, but I hope we don't have to get to that.
We are using a virtualized server as our test bed. It exhibits the same behavior as the production system. We rebooted it last night at 22:00, so we should see the issue occur between 10:00 and 11:00 tomorrow. I will make another tcpdump at that point and then post the results.
I can supply a network diagram if you like, but it is fairly simple: in the test system the Linux server is connected to a Layer 2/3 Cisco core switch. There is another Cisco switch uplinked via GBIC to this core switch; the uplink is a trunk that carries all VLANs. The LED display is connected to that switch. Both the LED display and the Linux server are on the same VLAN; traffic never travels over Layer 3, so there is no router involved. iptables is disabled on all servers, and there is no other firewall running.
The idea of the socket not properly closing is interesting. We tried looking at this early on, but maybe we missed something or weren't looking in the right place. I will try some of the ideas that have been posted and then get back with the tcpdump tomorrow. We can go from there. Let me know if you have any more questions.
noci (Software Engineer) commented:
Just a few sessions should be sufficient. Especially the failing start can be telling.
Please capture all TCP traffic as well as ICMP. It is better to filter on host only.
OK, if the remote (display) node is that basic then that's tough: no netstat from it (that would have been easier).
(A tcpdump is still possible by using a copy/monitor port on a managed switch, or by using a hub.)

OK, that is a fairly simple network; most probably the problem is only on the endpoints.
The netstat output would mostly answer that, as in the normal case closed ports should disappear after a few seconds.
If they are not closed, a huge list of open ports should be visible. (Stopping the program would cause a mass close of those ports, without a reboot.)
What is the output of ulimit -a for the account the program runs under? (Get the ulimit -a immediately before running your program.)

Can you also verify that all equipment has been configured correctly?
I mean, simple hardware is often 10 Mbps half duplex; autonegotiation might work out to 10 Mbps full duplex.
A full-duplex interface connected to a half-duplex interface doesn't work well; it only works with little data, and slowly.
It can be recognized on the full-duplex side because there are a lot of runts (short frames) & CRC errors.
It is best not to use autonegotiation on this type of network.
NPPCDNV (Author) commented:
Here are the first set of files, before and during the error state. I have lsof, tcpdump and netstat -ant output. When reading the tcpdump, WCSDEV1 is the server and SPR1 is the LED display device. The files are labeled either B4 or during: B4 meaning before the error state, and during meaning while the error state is occurring. All files were created from a telnet session on the server having the error state, while logged on as root. lsofB4.txt lsofduring.txt netstatB4.txt netstatduring.txt tcpdumpB4.txt tcpdumpduring.txt
noci (Software Engineer) commented:
There is no backlog of not-yet-closed sockets (about 24 when running normally, and none when failing).
Also, there is no list of unclosed sockets (netstat & lsof).
I do see you are using telnet. Please try to work securely: do not use telnet, use ssh.
If you come from Windows, PuTTY is your friend...
(Telnet transmits username & password etc. unencrypted over the internet....)

OK, then the failures: they come from WCSDEV1... (which resets the connection...)

11:54:52.437805 IP (tos 0x0, ttl  64, id 34890, offset 0, flags [DF], proto: TCP (6), length: 60)
                         WCSDEV1.35696 > spr1.tmport: S, cksum 0xa13b (correct),
                         3671212722:3671212722(0) win 5840 <mss 1460,sackOK,timestamp 135514872 0,nop,wscale 7>
11:54:52.439963 IP (tos 0x0, ttl  60, id 18375, offset 0, flags [none], proto: TCP (6), length: 56)
                         spr1.tmport > WCSDEV1.35696: S, cksum 0xb5c8 (correct), 154890240:154890240(0) ack
                         3671212723 win 680 <mss 170,timestamp 186294112 3405250560,nop,nop>
11:54:52.440028 IP (tos 0x0, ttl  64, id 0, offset 0, flags [DF], proto: TCP (6), length: 40)
                         WCSDEV1.35696 > spr1.tmport: R, cksum 0xf2f6 (correct), 3671212723:3671212723(0) win 0

This shows that the connection is started (SYN), there is an answer from spr1.tmport (SYN/ACK), and then a reset is returned... That should have been an ACK...

During the failure, does the kernel mention anything? Resource-depletion related? Invalid data returned during connect?
(dmesg might give some hints.)

So now why is the reset returned.....
noci (Software Engineer) commented:
(Btw, is there a reason to close the link every time? The data effectively transferred (53 bytes) is less than the extra overhead of each open & close.)
NPPCDNV (Author) commented:
I am wondering if the window 0 is the clue. What are the reasons for the server to set the window to 0?
What are the ways to look at kernel messages and resource depletion?
As for invalid data returned during the connection, the programmer has already looked into that and it doesn't appear to be the issue. I think it is something server-side.
noci (Software Engineer) commented:
Just sending the reset is enough to kill the connection (there is no need for a window anymore).
The SYN + SYN/ACK packets look the same as during success.

It's the client signaling the reset with window 0.
The display server sends the SYN/ACK.
NPPCDNV (Author) commented:
The window 0 signal is coming from WCSDEV1, correct?
Duncan Roe (Software Developer) commented:
On the client, ls -ltr /var/log and then tail -100 the most recently modified of messages, debug and syslog. With luck you will see something regarding why the server is refusing connections
noci (Software Engineer) commented:
@duncan_roe, hence the request of dmesg output during trouble.
(That dumps the kernel log buffers).
Duncan Roe (Software Developer) commented:
Yes, they are coming from WCSDEV1. If you less tcpdumpduring.txt and enter /WCSDEV1|R, they will light up nicely.
noci (Software Engineer) commented:
What is the kernel version you are using, btw?
Duncan Roe (Software Developer) commented:
dmesg would certainly be good to have now that we know we need it from WCSDEV1
Duncan Roe (Software Developer) commented:
@NPPCDNV, can you post the output from dmesg on WCSDEV1 now? Only if it hasn't been rebooted since the problem occurred - otherwise the wanted data is lost.
noci (Software Engineer) commented:
This is the code executed (in kernel 2.6.39):
/usr/src/linux/net/ipv4/tcp_input.c. This routine handles the answer that is received during the SYN-SENT state
(SYN is sent, ACK not yet received)...

static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
                                         struct tcphdr *th, unsigned len)

tcpdump is a bit sparse in its output.
It would be nice to have either a raw packet dump (pcap) or full Wireshark packet dissection output of the failing packets.
NPPCDNV (Author) commented:
I can get the raw packet dump if needed. We saw something in dmesg yesterday related to Bluetooth that shouldn't have been running; we uninstalled it and are testing again to see if it made a difference. For now, here is the dmesg output that was captured during the error state yesterday. If we have not resolved the issue, the next error state should occur on Friday at 11:00. dmesgduring.txt
Also, to answer an earlier question from noci: the socket connection is closed each time by the client because of the LED display. If messages are being sent to the LED display consistently then all is well, but if there is a long pause between messages then the LED display goes into a "dormant" mode, the socket connection gets hung, and we are left with a dead socket. We wrote the client so that it closes the socket after writing each message; that way we avoid the dead sockets.
NPPCDNV (Author) commented:
In regard to dmesgduring.txt: can anyone shed light on the entries at the very end that show eth0 entering promiscuous mode?
noci (Software Engineer) commented:
OK, sometimes small equipment has almost brain-dead software...

The promiscuous mode entries: do they occur when you watch with tcpdump/wireshark?
Or is there some other tool that needs it?

The eth0: link up message near the bottom: was the cable unplugged, or was there some other event?

You may get a timestamp for it in /var/log/kernel.log (or whichever file contains kernel messages;
see the kern.* entries in syslog.conf).
NPPCDNV (Author) commented:
I will setup a packet sniff on the client port to see if the promiscuous mode entries show up before or during the error state. This is the first time I have seen them.

I cannot explain the eth0: link up message. The cable was not unplugged during this time. I do not know what other event could have caused it.

On the test system we did not have kernel logging set to write to a file; it was set to display on the server console, but that was commented out. I have changed it to write to a file, and we will be able to utilize it during the next error state.
Duncan Roe (Software Developer) commented:
promiscuous mode is likely tcpdump being run. If you are sure it wasn't that, investigate further
NPPCDNV (Author) commented:
I have been working with this issue to gather more information. I still have not found the root cause. This is what I know now.
The promiscuous mode statements in dmesg are from the tcpdump command.
I am running kernel 2.6.18-274.el5.
The kernel logging is working now, but no kernel messages are logged in kernel.log just before, during, or following the error state.
I have packet captures from the switch ports of the Linux server and the LED display from 2 different instances of the error state. These do not reveal any more about what might be the cause of the issue. They are available if anyone wants to look at them. I tried to upload them in .pcap form but the site would not allow it. I can save them to .txt and upload them if anyone would like.
At this point I am wondering if anyone knows how I could get more detailed logging related to networking and TCP/IP from the server. I would like to see if there is something different there when we see the issue.
Not sure really where else to go from here.
Duncan Roe (Software Developer) commented:
If you rename your pcap file, say xxx.pcap, to xxx.pcap.append.txt, does EE then accept it? I'll take a look if you can post them.
NPPCDNV (Author) commented:
Here they are. I filtered them down to just the relevant conversation plus 2 conversations before the error state. display-client.PCAP.append.txt display-Server.PCAP.append.txt
I can provide more if you want to see them.
noci (Software Engineer) commented:

Your timestamps are probably the issue:

From RFC 1323, TCP Extensions for High Performance (Jacobson, Braden & Borman, May 1992):

      TCP Timestamps Option (TSopt):

         Kind: 8

         Length: 10 bytes

          +-------+-------+---------------------+---------------------+
          |Kind=8 |  10   |   TS Value (TSval)  |TS Echo Reply (TSecr)|
          +-------+-------+---------------------+---------------------+
              1       1              4                     4

         The Timestamps option carries two four-byte timestamp fields.
         The Timestamp Value field (TSval) contains the current value of
         the timestamp clock of the TCP sending the option.

         The Timestamp Echo Reply field (TSecr) is only valid if the ACK
         bit is set in the TCP header; if it is valid, it echos a times-
         tamp value that was sent by the remote TCP in the TSval field
         of a Timestamps option.  When TSecr is not valid, its value
         must be zero.  The TSecr value will generally be from the most
         recent Timestamp option that was received; however, there are
         exceptions that are explained below.

         A TCP may send the Timestamps option (TSopt) in an initial
         <SYN> segment (i.e., segment containing a SYN bit and no ACK
         bit), and may send a TSopt in other segments only if it re-
         ceived a TSopt in the initial <SYN> segment for the connection.

They are part of the standard...
Apparently the display cannot handle the timestamp correctly. (If the returned timestamp mismatches, the connection should be disconnected.)

The accepted connections all have the TS options removed on the SYN-ACK.
The returned value doesn't match; it should, as in the example below
(the TSecr should correspond to the previously sent TSval).

See below: (valid exchange)
14:43:28.667560 IP > S 2889084690:2889084690(0) win 14600 <mss 1460,sackOK,timestamp 404103862 0,nop,wscale 7>
14:43:28.668222 IP > S 241446172:241446172(0) ack 2889084691 win 14480 <mss 1460,sackOK,timestamp 2666628395 404103862,nop,wscale 6>
14:43:28.668328 IP > . ack 1 win 115 <nop,nop,timestamp 404103863 2666628395>

In your first capture, record 21 has TSval=134226994, but the TSecr in record 22 is 607256576: a mismatch.

You can disable timestamping on Linux with:
echo 0 >/proc/sys/net/ipv4/tcp_timestamps

(It still is strange that the display starts timestamping after such a time).

You can make this persistent via sysctl in /etc/sysctl.conf:

     net.ipv4.tcp_timestamps = 0

NPPCDNV (Author) commented:
We still have not found the root cause of, or a solution to, the problem we are seeing. The solutions that were provided by the Experts helped to eliminate some of the variables and to focus our troubleshooting efforts. We are now working with the programmer of the application to do further testing. I don't know whether or not to add this to the knowledge base, as we don't have a final solution; I am going to leave that option unchecked and let EE decide. If there are further questions on this after we work with the application developer, I will start a new topic. Thanks for the help, noci and duncan_roe.
Question has a verified solution.
