Avatar of Djof
Djof


OnReceive() not called

I'm writing a client with CSocket. Everything works fine until, for no apparent reason, OnReceive() stops being triggered. (It can take a while, 30 minutes or more.) You would think the connection had failed or something like that, but sending still works.

Is there a way I can debug that, or has anyone heard of something similar? Anyone?

I just upgraded to VC 7, and I didn't notice this before, when I was using VC 6.

I set this at easy for now, tell me if it's not enough.
Avatar of williamcampbell
williamcampbell



  Who is doing the sending? Have they run out of memory or crashed?
Avatar of Djof
Djof

ASKER

OK, here is how my app works, in detail.

1) Create the socket
2) Connect the socket
3) Handshake with the server (sending some characters with CSocket::Send and receiving the answer with CSocket::Receive)
4) Create a CSocketFile and two CArchives, one loading, one storing.

From this point, I use two objects of a custom CObject-derived class to manage the serialisation of the transmissions. OnReceive triggers a function in my document to serialise the reception object, while user actions trigger the serialisation and sending of outgoing data.

It works very well until, as I said, for no apparent reason, OnReceive simply isn't called anymore. Sending still works at that point. Nothing crashes and the connection still works, and it can stay that way for at least another half an hour. (At which point I stopped testing because I wanted to recompile. But the point is that it doesn't affect the connection.)

Thanks.

  As a test you could set up a timer and, if you haven't received anything for, say, 2 minutes, close the socket and reconnect.

  If you still don't receive after the 'reset' then you know it's probably your client. Worst case you have a workaround for now.
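
The timer idea above could be sketched roughly as follows. This is a minimal, portable sketch rather than MFC code: `ReceiveWatchdog`, `NoteReceive` and `IsStalled` are hypothetical names, and it assumes the application calls `NoteReceive()` from OnReceive and polls `IsStalled()` from a periodic timer.

```cpp
#include <chrono>

// Hypothetical helper (not part of MFC): remembers when data last arrived
// and reports when the connection has been silent for too long.
class ReceiveWatchdog {
public:
    using Clock = std::chrono::steady_clock;

    explicit ReceiveWatchdog(std::chrono::seconds limit,
                             Clock::time_point now = Clock::now())
        : m_limit(limit), m_last(now) {}

    // Call from OnReceive() every time data arrives.
    void NoteReceive(Clock::time_point now = Clock::now()) { m_last = now; }

    // Call from a periodic timer; true means "close the socket and reconnect".
    bool IsStalled(Clock::time_point now = Clock::now()) const {
        return now - m_last >= m_limit;
    }

private:
    std::chrono::seconds m_limit;
    Clock::time_point m_last;
};
```

With a limit of `std::chrono::seconds(120)` this matches the 2 minutes suggested above. The time points are passed in explicitly so the logic can be exercised without waiting.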
Avatar of Djof

ASKER

I know it's the client. The server wasn't coded by me, and I know that it works.

So I think I can say it's a problem with the socket. If I ever catch my client stalling again, I will double-check that with a packet sniffer.
Avatar of Djof

ASKER

Yup, data is sent from the server.

Increased value to 100 points.

 Do you have a memory leak? Check using Task Manager. Are you out of disk space?

 Next step is to eliminate code until the problem goes away.

 Can I take a look at these

 'From this point, I use two object of a custom CObject derived class to manage the serialisation of the transmissions.'

 
Avatar of Djof

ASKER

Here is the outgoing serialisation of them. Remember this code works. The connection does not fail, the server still sends everything, I can still send, etc, etc. It's only the OnReceive() locking randomly.

================================================
From the inside of the serialization function, without all the TRACE() macros:

   if (ar.IsStoring())
   {
      // Sending code
   }
   else
   {
      WORD w;
      DWORD dw;

      // Receive transaction header    
      ar >> w;
      m_nIsReply = ntohs(w);
      ar >> w;
      m_nId = ntohs(w);
      ar >> dw;
      m_nTaskId = ntohl(dw);
      ar >> dw;
      m_nIsError = ntohl(dw);
      ar >> dw;
      ar >> dw;
      m_nDataSize = ntohl(dw);
      ar >> w;
      m_nObjectNb = ntohs(w);

      // If the transaction is a reply, and Id is 0, check if we can get the original Id
      if (0 == m_nId && 1 == m_nIsReply)
         m_nId = m_pDoc->GetTaskId(m_nTaskId);

      // Object stuff
      int nObjId, nObjSize;
      int nNumber;
      CString sString;
      char* pBuff = NULL;
      ObjectUser usr;

      // Receive objects
      for (int iObj = 1; iObj <= m_nObjectNb; iObj++)
      {
         ar >> w;
         nObjId = ntohs(w);
         ar >> w;
         nObjSize = ntohs(w);

         switch (ObjType(nObjId))
         {
         case 'N':
            if (2 >= nObjSize)
            {
               ar >> w;
               nNumber = ntohs(w);
            }
            else
            {
               ar >> dw;
               nNumber = ntohl(dw);
            }
            AddObj(nObjId, nNumber, true);
            break;
         case 'C':
            pBuff = new char[nObjSize + 1];   //Allocate
            ZeroMemory(pBuff, nObjSize + 1);     // Zero it
            ar.Read(pBuff, nObjSize);   // Receive it
            sString = pBuff;
            delete [] pBuff;   // Deallocate it
            pBuff = NULL;   // NULL the pointer
            AddObj(nObjId, sString, false, true);
            sString.Empty();
            break;
         case 'U':
            short nSocket, nIcon, nColor, nUserSize;
            ar >> w;
            nSocket = ntohs(w);
            ar >> w;
            nIcon = ntohs(w);
            ar >> w;
            nColor = ntohs(w);
            ar >> w;
            nUserSize = ntohs(w);
            pBuff = new char[nUserSize + 1];   // Allocate
            ZeroMemory(pBuff, nUserSize + 1);   // Zero it
            ar.Read(pBuff, nUserSize);   // Receive it
            sString = pBuff;
            delete [] pBuff;   // Deallocate it
            pBuff = NULL;   // NULL the pointer
                   
            m_pDoc->m_nSocket = nSocket;   // For self user
                   
            usr.Socket = nSocket;
            usr.Icon = nIcon;
            usr.Color = nColor;
            usr.User = sString;
            usr.IsUserList = true;

            m_pDoc->DisplayUsr(usr);
            sString.Empty();
            break;
         default:
            BYTE b;
            for (int i = 0; i < nObjSize; i++)
            {
               ar >> b;
               // Trace unsupported objects
            }
         }
      }
   }
}
           
  Nothing pops out unless in the default case nObjSize is something huge.

  In case U do you reuse that Socket handle?

  Since it worked in VC6 and you are now using VC7 is your suspicion starting to tend that way?
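
On the point about nObjSize being something huge: one way to rule that out is to check the declared size against the data actually available before reading the payload. The sketch below is portable C++ over a plain byte buffer, not the CArchive code from the thread. `ObjectRecord`, `ReadBE16` and `ParseObjectRecord` are hypothetical names, and the record layout (16-bit id, 16-bit size, then the payload, in network byte order) is taken from the receive loop posted above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one object record: a 16-bit id, a 16-bit size,
// then `size` payload bytes, all in network (big-endian) byte order.
struct ObjectRecord {
    std::uint16_t id;
    std::uint16_t size;
    std::size_t payloadOffset;  // index of the first payload byte in the buffer
};

// Reads a big-endian 16-bit value; the caller guarantees pos + 2 <= buf.size().
inline std::uint16_t ReadBE16(const std::vector<std::uint8_t>& buf, std::size_t pos) {
    return static_cast<std::uint16_t>((buf[pos] << 8) | buf[pos + 1]);
}

// Parses the record starting at `pos`. Returns false, leaving `out` untouched,
// if the 4-byte record header is truncated or the declared size runs past the
// end of the buffer.
inline bool ParseObjectRecord(const std::vector<std::uint8_t>& buf, std::size_t pos,
                              ObjectRecord& out) {
    if (pos > buf.size() || buf.size() - pos < 4) return false;
    const std::uint16_t id = ReadBE16(buf, pos);
    const std::uint16_t size = ReadBE16(buf, pos + 2);
    if (buf.size() - (pos + 4) < size) return false;
    out = ObjectRecord{id, size, pos + 4};
    return true;
}
```

A rejected record here would point at a framing problem (the reader has lost its place in the stream) rather than at the socket notifications.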

 
Avatar of Djof

ASKER

The socket is member of my document, if the connection dies, the document is closed (MDI).

And to be honest, I've got no idea where to put my suspicions.

Thanks for still helping.

 Here's the MFC code

case FD_READ:
{
    DWORD nBytes;
    if (!pSocket->IOCtl(FIONREAD, &nBytes))
        nErrorCode = WSAGetLastError();
    if (nBytes != 0 || nErrorCode != 0)
        pSocket->OnReceive(nErrorCode);
}
break;

 So you will get no Receive if there are no bytes in the packet (I think you checked this and the server is sending)

 or

 The IOCtl call fails, which could mean your net card is bogus or the driver has a bug!

 Have you tried a different Net card?

 

Avatar of Djof

ASKER

Upgraded my drivers; I'll see if it still happens.
Avatar of Djof

ASKER

I had high hopes, but after an hour and a half, with 3 connections, one got stuck. (See, it's totally random.) I'm going to have someone else test it though. I'm trying to get desperate about this.
Avatar of Djof

ASKER

Sorry, I meant "I'm getting desperate about this".

 We had a problem like this and it took months to solve. Eventually we changed the net card to solve the problem; it occurred in high-traffic situations, where a connection just closed for no reason.

 I suggest trying 2 or 3 cards (looks like you're trying that)
Avatar of Djof

ASKER

I barely have another one, and it's really cheap. I will try though.


 As a test, write a small program that just receives data over and over, and a server-side program that keeps pumping data. Even make it multi-port. Let it run overnight. If it doesn't fail then you will be closer to narrowing down the culprit.
Avatar of Djof

ASKER

Did that.

The server was sending a short containing a number incremented from 0 to 255, then starting over (each one triggering OnReceive). It looped 65535 times without a problem. That's 16 776 960 calls. I think it's a pretty good sample. ;)

That means the problem is with my client. ugh.
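
For a soak test like the one described, the receiving side can also verify that no value went missing, instead of only counting calls. A minimal sketch, assuming the counter runs 0..255 and wraps as described. `SequenceChecker` is a hypothetical name and none of this comes from the original test program.

```cpp
#include <cstdint>

// Hypothetical receiver-side check for a soak test in which the server sends
// a counter that runs 0..255 and then wraps back to 0.
class SequenceChecker {
public:
    // Feed each received value; returns false if it is not the expected one.
    bool Accept(std::uint16_t value) {
        const bool ok = (value == m_expected);
        // Resynchronise on the received value so one gap is reported only once.
        m_expected = static_cast<std::uint16_t>((value + 1) % 256);
        if (ok) ++m_received; else ++m_gaps;
        return ok;
    }

    unsigned long Received() const { return m_received; }
    unsigned long Gaps() const { return m_gaps; }

private:
    std::uint16_t m_expected = 0;
    unsigned long m_received = 0;
    unsigned long m_gaps = 0;
};
```

This only detects lost or reordered values. A complete stall, where OnReceive stops firing, still needs a timeout of some kind to be noticed.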


 Ok next remove all the serializing code.
Avatar of Djof

ASKER

I can only remove the reception code, because I need to log in to the server. I'll try that now though.
Avatar of Djof

ASKER

Um, forgot that won't work either, the server won't send anything more unless I've received everything.

In sockcore.cpp there is this:

if ((nReady == 1) || (nErrorCode != 0))
     pSocket->OnReceive(nErrorCode);

I'll put a break point there next time I catch the problem.

Thanks for still helping, by the way...

 No Prob have a good weekend .. (drink heavily)
Avatar of Djof

ASKER

Mm, can't do that since the server won't send anything if I don't receive it, and it will close the connection after a while anyway.

Okay, the CAsyncSocket::DoCallBack function, from which you copied and pasted a part earlier, isn't called at all, and thus OnReceive isn't either. I'll try to go up the ladder that way.
Avatar of Djof

ASKER

Funny how the server will continue to send data when OnReceive and DoCallBack aren't called, but not when I just don't accept to receive anything.

 When you stop receiving put a breakpoint in the function

 CSocket::PumpMessages

 and see where it leads.

 
Avatar of Djof

ASKER

Nothing goes there. :(
Avatar of Djof

ASKER

If the connection is closed remotely, OnClose gets called though.
Avatar of Djof

ASKER

Okay. How come CArchive::IsBufferEmpty() NEVER reports a non-empty buffer? Even if I check at the beginning of OnReceive.

The reception loop is based on the fact that IsBufferEmpty should be false if there is still something to receive. OnReceive shouldn't be called unless there is some NEW data in the buffer, so you have to empty the Archive first. But how do you know when there is still something if IsBufferEmpty is of no help?

 Have you tried CArchive::Flush?
Avatar of Djof

ASKER

Isn't Flush only for storing archives?

Here is my OnReceive.

void CXSocket::OnReceive(int nErrorCode)
{
     TRACE("IsBufferFull\t%d\n", !m_pDoc->m_pArchiveIn->IsBufferEmpty());
     
     BOOL b;

     do
     {
          b = m_pDoc->Get();
          TRACE("IsBufferFull\t%d\n", b);
     }
     while(b);    
}

The Get function returns !IsBufferEmpty(). Both traces show that IsBufferEmpty is always true, thus the buffer is always empty, even though it should be false at least on the first trace. (Since there is something in it I'm about to get.)

Then I'm pretty sure the locking happens because the !IsBufferEmpty() returned by Get() doesn't make the loop restart, even when it should, because again, it doesn't report that there is something in the archive.

grr.


  For the sake of experimentation (and sanity) how about using a CFile instead of CArchive. You only have 5 items to store right?

  So change to a CFile and see if the problem (which we think is related to CArchive) goes away.
Avatar of Djof

ASKER

I'm using the CArchives with a CSocketFile. How would I use a CFile to write to a socket?

  Use CAsyncSocket and CFile

  assuming you have 5 DWORD variables

  DWORD val[5];

  CAsyncSocket cs;
  cs.Create ();   // a CAsyncSocket has to be created before Connect
  cs.Connect ( ... );
  cs.Receive ( val, sizeof(val) );

  CFile cf;
  cf.Open ( "SocketData", CFile::modeCreate | CFile::modeWrite );

  //val[0] = 1;
  //val[1] = 2;
  //val[2] = 3;
  //val[3] = 4;
  //val[4] = 5;

  cf.Write ( val, sizeof(val) );
  cf.Close ();

  cf.Open ( "SocketData", CFile::modeRead );
  DWORD _val[5];
  cf.Read ( _val, sizeof(_val) );
  cf.Close ();

  cs.Send ( _val, sizeof(_val) );

 This might change depending on the data you are sending back and forth

 
Avatar of Djof

ASKER

I'd rather fix the archive implementation, as everyone says it's the best approach. I've also successfully used direct Send and Receive on the socket before.

Now, as you suggested, you CAN use Flush on loading archives, however, as I understand it, it simply empties the buffer.

if (IsLoading())
{
     // unget the characters in the buffer, seek back unused amount
     if (m_lpBufMax != m_lpBufCur)
          m_pFile->Seek(-(int(m_lpBufMax - m_lpBufCur)), CFile::current);
     m_lpBufCur = m_lpBufMax;    // empty
}

Loading archives start buffering at m_lpBufMax, and the IsBufferEmpty function returns the result of (m_lpBufCur == m_lpBufMax), so I'm having trouble figuring out why that doesn't work.
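
One possible reading of the "always empty" traces can be sketched with a toy model. It assumes, and this is an assumption rather than something confirmed in the thread, that a loading archive only pulls bytes from the socket when a read finds its own buffer exhausted. Under that assumption the buffer is still empty on entry to OnReceive, because the data is waiting in the socket and not yet in the archive. `ToyLoadArchive` is a hypothetical class and is not MFC source.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Toy model, NOT MFC source: a loading buffer that is refilled from a
// "socket" queue only when a read finds the buffer empty.
class ToyLoadArchive {
public:
    explicit ToyLoadArchive(std::deque<unsigned char>& socket, std::size_t bufSize = 16)
        : m_socket(socket), m_bufSize(bufSize) {}

    // Plays the role of the (m_lpBufCur == m_lpBufMax) test quoted above.
    bool IsBufferEmpty() const { return m_cur == m_buf.size(); }

    // Reads one byte, refilling from the socket when the buffer is exhausted.
    // Returns false if neither the buffer nor the socket has any data.
    bool ReadByte(unsigned char& out) {
        if (IsBufferEmpty() && !Fill()) return false;
        out = m_buf[m_cur++];
        return true;
    }

private:
    // Moves up to m_bufSize pending bytes from the socket into the buffer.
    bool Fill() {
        m_buf.clear();
        m_cur = 0;
        while (!m_socket.empty() && m_buf.size() < m_bufSize) {
            m_buf.push_back(m_socket.front());
            m_socket.pop_front();
        }
        return !m_buf.empty();
    }

    std::deque<unsigned char>& m_socket;
    std::size_t m_bufSize;
    std::vector<unsigned char> m_buf;
    std::size_t m_cur = 0;
};
```

In this model IsBufferEmpty() is true before the first read even though three bytes may be pending in the socket, and it becomes true again as soon as a read consumes exactly what the last fill fetched.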
Avatar of Djof

ASKER

I've been using an external buffer so I could trace it. This revealed that only strings are stored in the buffer. Why? Most probably because I read them with CArchive::Read(...). It makes sense that this is what causes the problem, because I've been using this only since I switched to VC++ .NET, and that's when I started having reception troubles.

There is nothing in the description of the function on MSDN that says there should be a problem using it for network purposes, but I'll change that and see if it fixes my problem.

  Would be nice to fix this one.
Avatar of Djof

ASKER

Wasn't that. Both MSDN CSocket sample programs (CHATTER and its server) also have IsBufferEmpty() always returning true.

I'm totally lost and desperate.
ASKER CERTIFIED SOLUTION
Avatar of williamcampbell
williamcampbell
Avatar of Djof

ASKER

I'll work on that, I still have to use all my network code though, because I need to stick to the protocol.

On another note, after my friend did some testing with me, we found another interesting piece of information about that bug, even if it doesn't help me figure out where it happens in my code. The connection I use for testing is a 5 Mbit/s cable, and his is a 56 kbit/s modem connecting at around 35-40 kbit/s due to poor phone line quality. We've found out that he is affected a lot more than me. Most of the time, he gets the bug within 10 minutes, while it can take hours for it to happen to me. That would confirm that it has to do with the buffer, I think. The lag of his connection would make it more likely that multiple transmissions get into the buffer at once, or something like that.
Avatar of Djof

ASKER

I finally found out I wasn't alone with this problem.

"I am using the NDK under VC7 on an XP machine to do a client/server program. My problem is, that after awhile, my Client stops receiving messages from the server."

The NDK is a group of client-server classes based on CSocket/CArchive. Check the posts at the bottom of the page:
http://www.codetools.com/internet/ndk.asp?df=100&forumid=1156&select=516242&msg=516242#xx516242xx
Avatar of Djof

ASKER

Fixed.

Quote from Microsoft Knowledge Base Article - 185728
(http://support.microsoft.com/default.aspx?scid=kb;en-us;185728)
----------

In Windows Sockets, you should not make multiple recv calls within an FD_READ notification unless you are willing to disable FD_READ notifications prior to calling recv. However, CSocket and CAsyncSocket make no provision for doing so. Therefore, you should make only one Receive call per OnReceive function. Under high data transmission rate, if you make more than one Receive call in the OnReceive function, the application might lose FD_READ, have fake FD_READ, or have no FD_READ (hanging).

You can use CSocket with CArchive and CSocketFile to directly receive and send MFC CObject-derived objects. However, under high data transmission rates, you should not use CSocket with CArchive and CSocketFile within the OnReceive function because they might internally generate multiple Receive calls.
----------

So a simple fix is to disable FD_READ notifications for the duration of the OnReceive callback.

void CMySocket::OnReceive(int nErrorCode)
{
      // Remove read notification
      VERIFY(AsyncSelect(/*FD_READ | */FD_WRITE | FD_OOB | FD_ACCEPT | FD_CONNECT | FD_CLOSE));

      // ... Receive using CArchive & CSocketFile

      // Set notifications to default
      VERIFY(AsyncSelect());
}

Note that if you disconnect and delete the socket during your reception code, make sure you do not call AsyncSelect(), or you will assert.
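
The disable/restore pair, together with the caveat about a deleted socket, could also be wrapped in a small scope guard so the restore cannot be forgotten on an early return. This is only a sketch: `ReadNotifyGuard`, `Dismiss`, `FakeSocket` and the `kFd*` constants are hypothetical names. The constants are local copies of the Winsock FD_* bit values so the snippet compiles without the Windows headers, and `FakeSocket` stands in for a CAsyncSocket-derived class with an `AsyncSelect(long)` member.

```cpp
#include <vector>

// Local copies of the Winsock event bits, renamed so they cannot clash with
// the real FD_* macros. The values match those in the Winsock headers.
enum : long {
    kFdRead = 0x01, kFdWrite = 0x02, kFdOob = 0x04,
    kFdAccept = 0x08, kFdConnect = 0x10, kFdClose = 0x20,
    kFdAll = kFdRead | kFdWrite | kFdOob | kFdAccept | kFdConnect | kFdClose
};

// Hypothetical RAII helper: masks FD_READ on construction and restores the
// full event mask on destruction, unless Dismiss() was called. Unlike the
// VERIFY() calls above, it ignores the return value of AsyncSelect.
template <class Socket>
class ReadNotifyGuard {
public:
    explicit ReadNotifyGuard(Socket& s) : m_socket(&s) {
        m_socket->AsyncSelect(kFdAll & ~kFdRead);
    }
    ~ReadNotifyGuard() {
        if (m_socket) m_socket->AsyncSelect(kFdAll);
    }
    // Call this if the socket was closed or deleted inside the handler, so
    // the destructor does not touch it again.
    void Dismiss() { m_socket = nullptr; }

    ReadNotifyGuard(const ReadNotifyGuard&) = delete;
    ReadNotifyGuard& operator=(const ReadNotifyGuard&) = delete;

private:
    Socket* m_socket;
};

// Test double standing in for the socket class; it only records the masks.
struct FakeSocket {
    std::vector<long> masks;
    bool AsyncSelect(long mask) { masks.push_back(mask); return true; }
};
```

Inside OnReceive this would be used as `ReadNotifyGuard<CMySocket> guard(*this);` at the top of the function, with `guard.Dismiss()` on any path that closes or deletes the socket.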

williamcampbell, thanks for the support. Just post another comment, and I will give you points for your help.
Avatar of Djof

ASKER

Looks like I've lost you, so I'll accept your last comment