Julian Hansen asked:

TCP/IP Socket - connection close results in data lost

We have a custom built micro PC that is connecting to home base using WiFi. We are using the Zentri WiFi module which comes with its own TCP stack. Connection to the server is done using HTTP on port 8087. The micro PC also includes a GSM module that is used to connect to the same server when not in WiFi range.

Server: Linux / Apache / PHP

Request to server is a simple POST request - data in - PHP script processes and sends a response.

Everything works fine up until the response. When connecting over WiFi we don't see the return. POST data is received by the PHP script and entered into the database - so that side is fine. The return buffer however is not "seen" by the WiFi module. All other means to connect to the server work as expected (Browser, C/C++ test program, GSM module) - no issues.

We suspected it might be a timing issue so we added a 1 second sleep() in the PHP script immediately after output of the return buffer - this works - the Zentri module "sees" the response.

It seems that what is happening is that Apache is sending the buffer and then issuing a close() - and the client is seeing the close and ignoring the buffer.

My question is: is there something we are not doing in our code (some option that is not being set, or is being set wrongly) that would cause this and that other systems might be more lenient about? Or is the most likely scenario a bad TCP/IP implementation?
Tags: TCP/IP, C++, Apache Web Server, C


8/22/2022 - Mon
noci

Your assessment about the close is correct.
Have you thought of using flush() to force the output out before quitting?
http://php.net/manual/en/function.flush.php
ASKER
Julian Hansen

Thanks,

Will look at flush but was interested to know what it is on the WiFi side that is causing this when it does not happen with any other client. We don't need flush / sleep for the GSM module or a browser connection so there is something specific to this TCP/IP stack that is not handling the disconnect properly. I would like to determine if this is an issue with their stack or our use of it.
noci

@BigRat: Eh, a read from the CGI stdin interface?
This is about the output data sent back to Apache that is getting lost.
In plain CGI there is no more reading after the data is received from Apache. A flush() of the data seems more appropriate. If the module quits, Apache may close the socket.
If the socket gets closed (by Apache), the TCP stack MAY discard all untransmitted data.

A flush() would ensure the data is pushed out the door.
ASKER
Julian Hansen

"This is about the output data sent back to Apache that is getting lost."
Not sure if the above is a typo, but the response FROM Apache is being lost unless a delay is enforced between output (in the script) and termination of the script.
Consider a standard REST service on a server - you call it with a GET / POST / PUT and it returns data.
In this case we are sending data from the Zentri module to the service with a PUT.
Script reads data (successfully) and outputs a summary record. This is not being detected on the client (Zentri). We see the socket close but there is nothing in the data buffer. The module uses a mechanism of raising a bit when data is ready to read - this never happens (without the delay) - it is as if the close supersedes the pending data and ignores it.

I believe this is a problem with the Zentri module - but wanted to check that there was not something obvious we might be missing.

With respect to flush() - this may work (have not tried it yet) but it is putting the solution in the hands of the server. There are millions of examples of this working perfectly every day - every browser does exactly the same thing - no problems - just this module chokes on the close.

Solving this on the server does not really answer the question as we need to know why this particular client is not behaving as it should.
BigRat

@noci: I have never ever known Apache to have problems with sending data in "Connection: close" mode. The idea of adding a delay before sending a close may work, but I see no reason why one would have to do it. The HTTP protocol invented by Tim Berners-Lee always had a connection close after every HTTP request, and if TCP/IP simply threw away data in a write/close sequence the internet would never have started.
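The write/close point can be checked locally. The following is a minimal sketch, assuming Python and a loopback socket rather than Apache or the Zentri module; `serve_once`, `read_until_eof` and `run_demo` are names made up for this illustration. The server writes and closes at once, the client is deliberately slow, and a client that reads until end-of-stream still gets every byte.

```python
import socket
import threading
import time

# Illustrative payload only; not captured from the real server.
PAYLOAD = b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nOK"

def serve_once(listener):
    """Accept one connection, write the payload, close immediately."""
    conn, _ = listener.accept()
    conn.sendall(PAYLOAD)   # hand the data to the kernel
    conn.close()            # no delay between write and close

def read_until_eof(sock):
    """Keep reading until recv() returns b"", which signals the peer's close."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)

def run_demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()
    client = socket.create_connection(("127.0.0.1", port))
    time.sleep(0.2)                      # by now the server has already closed
    data = read_until_eof(client)
    client.close()
    server.join()
    listener.close()
    return data

if __name__ == "__main__":
    print(run_demo())
```

The close only shows up to the reader after the buffered bytes have been delivered, which is the behaviour the browser, the GSM module and the C/C++ test program all appear to rely on.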

It seems that the client interface is asynchronous and the client must poll for data. Is the close event similar or is it a forced procedure call? Would it then be possible to check on close whether the data bit is set?

I would pass the problem onto Zentri, because from their web site I would not expect to have to mess around with server side delays. They seem to be an experienced company.
noci

flush() is not putting everything in the hands of the server; it forces all writes to be completed before flush() returns. (The standard behaviour of a write is to put everything into a buffer, such as the network stack or disk cache, and return immediately.)

So a flush() before a close should ensure the data is under way before flush() returns.
(The flush needs to be done on the client side.)
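As a rough illustration of buffered writes, here is a sketch assuming Python's own user-space buffering on a local socket pair. It is an analogy only and does not reproduce how PHP's flush() interacts with Apache; `demo_flush` is a made-up name.

```python
import socket

def demo_flush():
    """Show that a buffered write is not visible to the peer until flushed."""
    writer_sock, reader_sock = socket.socketpair()
    reader_sock.setblocking(False)

    # Buffered file object on top of the socket (4096-byte user-space buffer).
    out = writer_sock.makefile("wb", buffering=4096)
    out.write(b"summary record")         # stays in the user-space buffer

    try:
        reader_sock.recv(1024)
        seen_before_flush = True
    except BlockingIOError:
        seen_before_flush = False        # nothing has reached the peer yet

    out.flush()                          # push the buffer into the socket
    reader_sock.setblocking(True)
    data = reader_sock.recv(1024)

    out.close()
    writer_sock.close()
    reader_sock.close()
    return seen_before_flush, data

if __name__ == "__main__":
    print(demo_flush())
```

Note that closing the buffered writer also flushes it in this sketch, so an orderly close on its own would not drop the bytes here either.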
ASKER
Julian Hansen

Flush is not the answer - it does not answer the core question which is why this only fails on this client. The script is fine, the server is vanilla apache - neither of those can change - and they should not need to as they are working fine with every other client we have tested.

@BigRat
"Would it then be possible to check on close whether the data bit is set?"
Answer: yes we do check and the bit is NOT set. We detect the close but effectively what the client library is saying is that there was no data - or it is not recognising there is data. This has been tested fairly extensively with repeated results.

@sarabande - file / memory output - not sure how that solves the problem. We have a standard PHP script (that works) on a standard Apache server; if there were a problem there, we would not be the first to find it. We have no control over what happens after the script terminates - Apache handles that - all we can do is delay termination, which appears to solve the problem in that the client sees the data before the close. This is not a solution for us, as we need to understand why this is happening so we don't end up in a situation in the future where this comes back to bite us because of some other issue in the library that requires a work-around.

From what I am hearing here the implementation in terms of what we are trying to do is standard and correct - the most likely culprit is the Zentri library?
BigRat

You could first try WireShark or Fiddler to see EXACTLY what comes down the wire.

The problem is most probably non-compliance with the HTTP standard in the library.
One thing you might try, if at all possible, is to output a Content-Length header. I presume the PHP script outputs a content-type. Normally Apache will add a content-length header OR chunk the transfer. If you add a header Apache won't chunk it. This might be difficult since it involves buffering all the data in order to calculate the length.

Another trick would be, again if at all possible, to downgrade the protocol to, say, HTTP/1.0 or even better HTTP/0.9. In this way chunked transfers are out and the protocol MUST close the connection every time. That might just trigger the correct behavior.

Another trick might be to do the opposite. Continue to use HTTP/1.1 but add a Keep-Alive header on the client side. Apache won't close the connection, and you can close the connection after receiving the data by simply dropping the socket.
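A minimal sketch of the keep-alive variant, assuming Python on the client side purely for illustration (the real client is the module's firmware). `build_post` and `read_response` are hypothetical helpers, and the host and path in the usage lines are placeholders. The request asks for a persistent connection via `Connection: keep-alive`, and the reader stops after Content-Length bytes instead of waiting for the server's close.

```python
def build_post(host, path, body, version="1.1", keep_alive=True):
    """Build a raw HTTP POST request with an explicit Content-Length."""
    lines = [
        f"POST {path} HTTP/{version}",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body)}",
        "Connection: " + ("keep-alive" if keep_alive else "close"),
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii") + body

def read_response(recv):
    """Read one response using Content-Length; recv(n) returns bytes or b""."""
    buf = b""
    while b"\r\n\r\n" not in buf:
        chunk = recv(4096)
        if not chunk:
            raise ConnectionError("closed before headers were complete")
        buf += chunk
    head, _, body = buf.partition(b"\r\n\r\n")

    headers = {}
    for line in head.split(b"\r\n")[1:]:     # skip the status line
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip()

    # Assumes the server sends Content-Length (no chunked encoding).
    length = int(headers[b"content-length"])
    while len(body) < length:
        chunk = recv(4096)
        if not chunk:
            raise ConnectionError("closed before body was complete")
        body += chunk
    return head.split(b"\r\n")[0], body[:length]

if __name__ == "__main__":
    print(build_post("example.invalid", "/in.php", b"a=1"))
    parts = iter([b"HTTP/1.1 200 OK\r\nContent-Le", b"ngth: 5\r\n\r\nhel", b"lo"])
    print(read_response(lambda n: next(parts, b"")))
```

With a real socket, `recv` would be the socket's own `recv` method, and the client would drop the connection itself once `read_response` returns.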
ASKER
Julian Hansen

@BigRat - we have already been through this process while building the script for other clients - everything is good - Content-Length is set correctly and data is being received and processed by clients.

I am going to go with the conclusion that the Zentri library is not up to scratch - we are in discussion with their support but wanted to make sure we were not missing something obvious.

Thanks to all for the responses - will close shortly.
ASKER
Julian Hansen

The solution was to drop the polling interval. Not obvious from the point of the experts responding but their comments helped us to think about the problem in a different way and get to the answer.
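The thread does not say how the module's polling works internally, so the following is only a toy model of one way a long polling interval could produce the symptom described. Everything in it (`FakeModule`, the two handlers, the timings) is hypothetical and is not the Zentri API. Data arrives at 10 ms and the close at 12 ms; a handler that checks for the close first loses the data when it polls every 100 ms, but not when it polls every 5 ms.

```python
from dataclasses import dataclass, field

@dataclass
class FakeModule:
    """Hypothetical stand-in for a polled network module (not a real API)."""
    rx_buffer: bytearray = field(default_factory=bytearray)
    closed: bool = False

    def on_wire(self, kind, payload=b""):
        if kind == "data":
            self.rx_buffer += payload
        elif kind == "close":
            self.closed = True

def poll_close_first(module):
    """Treats a close as final and never looks at the buffer afterwards."""
    if module.closed:
        return None
    data = bytes(module.rx_buffer)
    module.rx_buffer.clear()
    return data

def poll_drain_first(module):
    """Drains any buffered bytes before acting on the close."""
    if module.rx_buffer:
        data = bytes(module.rx_buffer)
        module.rx_buffer.clear()
        return data
    return None if module.closed else b""

def simulate(handler, interval_ms, events, end_ms=2000):
    """Poll every interval_ms; events are (time_ms, kind, payload) tuples."""
    module = FakeModule()
    pending = sorted(events)
    received = b""
    t = 0
    while t <= end_ms:
        # Deliver everything that happened on the wire up to this poll.
        while pending and pending[0][0] <= t:
            _, kind, payload = pending.pop(0)
            module.on_wire(kind, payload)
        result = handler(module)
        if result is None:          # handler decided the connection is over
            break
        received += result
        t += interval_ms
    return received

EVENTS = [(10, "data", b"summary"), (12, "close", b"")]

if __name__ == "__main__":
    print(simulate(poll_close_first, 100, EVENTS))
    print(simulate(poll_close_first, 5, EVENTS))
    print(simulate(poll_drain_first, 100, EVENTS))
```

In the same model, pushing the close out to 1010 ms (the effect of the 1 second sleep() in the script) also lets the close-first handler see the data at a 100 ms interval, which is consistent with the behaviour reported in the question.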