truesaer

asked on

WSAEINVAL error received when using select() with sockets

I'm using the select() function to prevent a blocking socket from hanging on recv() in the case of a dropped UDP packet.  The problem is that whenever I specify a timeout in my timeval structure that is shorter than the time needed for data to be ready on my socket, instead of getting a return value of zero (in which case I can pause or loop or something), I get SOCKET_ERROR, and WSAGetLastError() gives me 10022, which corresponds to WSAEINVAL.  This should only occur if my timeout is an invalid value (I can't find anywhere that documents what is or is not valid), or if the three socket set structures are NULL.

If I set the timeout value high, all my operations complete successfully.  If I set it very low, it immediately fails with WSAEINVAL.  I initially thought that perhaps there was a minimum value, so I started setting it lower and lower to find the minimum, but I found at 15000 microseconds it wouldn't fail consistently.  In other words, I try to send 5 consecutive packets and sometimes it would fail on the fourth, sometimes the second, sometimes the first, sometimes not at all.  This is how I concluded that if it times out it is returning the error instead of 0.

So I'm not really sure what I'm doing wrong.  My ultimate goal is to determine how long I should reasonably wait for a response (by experimenting with the timeout value).  Then I can set a timeout value where I can resend the request packet if I don't receive a response in a reasonable amount of time.

Also, according to MSDN, a timeval of {0, 0} should cause select to block until a response is received.  But I still get the WSAEINVAL error.  Setting this parameter to NULL in the select() call works correctly and it blocks until a response is received.

Here is a code snippet:

send(*theSocket, (char *)request, msgSize, 0);


     //Wait for a response
     unsigned count = 0;
     FD_ZERO(&fdread);
     FD_SET(*theSocket, &fdread);
     fdTime.tv_sec = 0;
     fdTime.tv_usec = 10000;
     while (count < 100000)
     {
          if ((fdRet = select(0,&fdread,NULL,NULL,/*NULL*/&fdTime)) == SOCKET_ERROR)
          {
               if (WSAGetLastError() != 10022)          //this error means select timed out
               {
                    printf("Socket Error %d on select().",WSAGetLastError());
                    exit(1);
               }
          }
         
          printf("fdRet: %d\n",fdRet);

          if (fdRet == 1)
          {
               printf("response ready after %u select calls, with wait times of %ld microseconds.\n",count,fdTime.tv_usec);
               break;
          }

          count++;
     }

     responseSize = recv(*theSocket, (char *)response, 200, 0);
DanRollins

I've never worked with the timeval so this is a shot in the dark:  

   15000ms is really 15 seconds and 0 ms

Does it make any difference if you set it that way?

still checking for other leads...
-- Dan
Oops, I see it's microseconds.

There are 1000 microseconds in a millisecond
and 1,000,000 microseconds in a second

so a value of 15000us is 15ms... which *might* be nearing the resolution of some internal timer (but I don't know why that would be the case).  Still checking...
-- Dan
Another possibility:  select() might be updating your fdTime variable (it will in some implementations).  But in your loop, you are not resetting it to {0,10000}.  Give it a try.

-- Dan
grue

According to my documentation (VS .NET), you'll get a WSAEINVAL (10022) under these circumstances:

"The time-out value is not valid, or all three descriptor parameters were NULL."

I tried your code on a socket accepted from a listening server socket.  I did not encounter the problem you were encountering when calling select().  

When it times out properly, select() should return 0 (meaning no file descriptors have been selected), not an error.

Setting the timeval to { 0, 0 }, or even negative values, did not produce an error, so I'm not sure what MSFT means by a "not valid" timeout.  It didn't block indefinitely either (it returned immediately for zero or negative times).

The only way I was able to produce a 10022 was by passing NULL for all three fdsets... Are you sure you're looking at the right section of your code for the error?  

Regarding Dan's comments about reducing the microseconds, it seems to work (for me) either way -- setting the timeout to a large microsecond value is equivalent to setting it to a small second value.

Anyway, this is what I was running:
 
SOCKET acceptedSocket = (SOCKET) userArg; // from a function

fd_set fdread;
struct timeval fdTime;

while (true)
  {

  FD_ZERO(&fdread);
  FD_SET(acceptedSocket, &fdread);

  fdTime.tv_sec = 0;
  fdTime.tv_usec = 10000000;

  int fdRet;
  if ((fdRet = select(0,&fdread,NULL,NULL,&fdTime)) == SOCKET_ERROR)
    {
    printf("Socket Error %d on select().",WSAGetLastError());
    }

   // ...
   }
ASKER CERTIFIED SOLUTION
grue

Just a follow-up: you don't need to reinitialize timeval for every call to select(), per the following documentation from Visual Studio .NET's help files (which also contradicts what you said about initializing timeval to {0,0}; maybe you need a new MSDN):

When select returns, the contents of the TIMEVAL structure are not altered. If TIMEVAL is initialized to {0, 0}, select will return immediately; this is used to poll the state of the selected sockets. If select returns immediately, then the select call is considered nonblocking and the standard assumptions for nonblocking calls apply. For example, the blocking hook will not be called, and Windows Sockets will not yield.


ASKER

I think you're right, Grue, about the problem being that I don't call FD_SET again inside my loop.  I thought this was occurring on the first call of select(), but it must have been happening on a subsequent call.  I'll test this to confirm and let you know later today or perhaps tomorrow.