Solved

Socket Performance

Posted on 2002-04-09
Last Modified: 2013-11-20
Hi Guys,

I have written a Socket class, basically a wrapper class for the WSASocket functions. I used this class to replace Named Pipe communication over a network, because the named pipes did not route well over the network.

Sockets (obviously) do not have the same problem. However, when the client and server are on the same machine, I get a throughput of about 39MB/s (megabytes) using the sockets; with Named Pipes I was able to achieve about 400MB/s on the local machine.
Furthermore, the processor hits 100% usage with the sockets at the 39MB/s transfer rate (processor usage was about 90% with Named Pipes).

I would like to know whether this "phenomenon" is expected, or should I have been able to achieve either a higher throughput or a lower processor usage?

Regards
OD
Question by:OD
13 Comments
 
LVL 32

Accepted Solution

by:
jhance earned 50 total points
I think you are confusing network protocols with network services.  Named Pipes are a network SERVICE that is implemented on top of some underlying network transport (i.e. network protocol).  Named pipes, however, break down to a shared memory scenario on a local machine, so it's not surprising that they would be much faster than a sockets implementation, which ALWAYS goes through the TCP/IP protocol stack whether the link is intra-workstation or inter-workstation.

What I don't understand is why you get 100% CPU utilization when using sockets.  Unless you have a GigaBit network path a network card simply cannot generate the throughput to saturate your CPU.  I suspect you've programmed this in a very inefficient way, perhaps using POLLING or something.  It might be helpful to see your code here...
 

Author Comment

by:OD
jhance:

When testing the socket class I wrote a small server and client that ran on the same machine. (I may be wrong, but I believe that the network card is skipped in this scenario.)
Anyway, I want to test the maximum throughput that the socket class can achieve, so I write as fast as possible on one side and receive the sent data as fast as possible on the other side.

This eventually translates to a throughput of about 39MB/s, with the processor usage at a constant 100% for as long as the test was conducted.

The throughput seemed a bit low for running on one machine, what do you think?

(I can post the code but it is actually a whole lot of code, because I implemented a layer above the socket to automatically reassemble messages that were split into multiple packets by the network)
 
LVL 32

Expert Comment

by:jhance
The network card is skipped (in the case where you are using the loopback address, 127.0.0.1), but if you are using the network card's IP you end up going through the network card's drivers.  But in EITHER CASE you use the TCP/IP protocol stack.

39MB/s (is that BYTES or BITS???)

Again, so much depends on how you have coded this.  I still suspect you are doing things poorly and this is what is causing the 100% CPU utilization.

Show some critical parts of your code!
 

Author Comment

by:OD
I am doing a lookup during the connect to get the IP address of the specified server, which then resolves to 127.0.0.1. So at least the network card drivers are skipped.

The throughput is 39 megabytes per second.

There are three classes applicable here:
CTcpIpSocket, which is the base class from which CServerSocket and CDataSocket are derived.

Here is the code:
#define TCPIPSOCKET_EVENTMESSAGE                        (WM_USER + 100)

#define TCPIPSOCKET_PREAMBLE                              3142587
#define TCPIPSOCKET_POSTAMBLE                              7852413

#define TCPIPSOCKET_PACKETSIZE                        131072

#define TCPIPSOCKET_LOOPTIME                              50

#define TCPIPSOCKET_CONNECTWAIT                        3333
#define TCPIPSOCKET_DISCONNECTWAIT                  2222

#define TCPIPSOCKET_PRIORITY                              THREAD_PRIORITY_BELOW_NORMAL


bool CTcpIpSocket::Open(unsigned short nPort, int nBufferSize, bool bDatagram)
{
      Close();

      m_bDatagram = bDatagram;

      int nType = (bDatagram) ? SOCK_DGRAM : SOCK_STREAM;
      int nProtocol = (bDatagram) ? IPPROTO_UDP : IPPROTO_TCP;

      m_hSocket = WSASocket(AF_INET, nType, nProtocol, NULL, 0, WSA_FLAG_OVERLAPPED);

      if (m_hSocket != INVALID_SOCKET)
      {
            m_nPort = nPort;

            if(SelectEvents())
            {
                  SetOptions(nBufferSize);
                  OnOpen();

                  return true;
            }
      }

      TRACE("\nSOCKET ERROR : %d", WSAGetLastError());

      return false;
}

void CTcpIpSocket::SetOptions(int nBufferSize)
{
      BOOL bEnable = TRUE, bDisable = FALSE;

      setsockopt(m_hSocket, SOL_SOCKET, SO_BROADCAST, (char*)&bEnable, sizeof(BOOL));
      setsockopt(m_hSocket, SOL_SOCKET, SO_DEBUG, (char*)&bDisable, sizeof(BOOL));
      setsockopt(m_hSocket, SOL_SOCKET, SO_DONTLINGER, (char*)&bEnable, sizeof(BOOL));
      setsockopt(m_hSocket, SOL_SOCKET, SO_KEEPALIVE, (char*)&bEnable, sizeof(BOOL));

      int nLength = sizeof(int),
             nSndRcvBuffer = min(TCPIPSOCKET_PACKETSIZE, nBufferSize);

      setsockopt(m_hSocket, SOL_SOCKET, SO_RCVBUF, (char*)&nSndRcvBuffer, nLength);
      setsockopt(m_hSocket, SOL_SOCKET, SO_SNDBUF, (char*)&nSndRcvBuffer, nLength);

      getsockopt(m_hSocket, SOL_SOCKET, SO_SNDBUF, (char*)&nSndRcvBuffer, &nLength);

      m_lMaxMessageSize = nSndRcvBuffer;
}

void CTcpIpSocket::ProcessEvents()
{
      WSANETWORKEVENTS sEvents = {0};
      if (!WSAEnumNetworkEvents(m_hSocket, m_hEvents, &sEvents))
      {
            if (sEvents.iErrorCode[FD_READ_BIT])
                  ProcessErrors(sEvents.iErrorCode[FD_READ_BIT]);

            if (sEvents.iErrorCode[FD_WRITE_BIT])
                  ProcessErrors(sEvents.iErrorCode[FD_WRITE_BIT]);

            if (!sEvents.lNetworkEvents)
                  return;

            // FD_READ | FD_WRITE | FD_OOB | FD_ACCEPT | FD_CONNECT | FD_CLOSE
            int nEvents = sEvents.lNetworkEvents;

            if (nEvents & FD_READ)
            {
                  OnRead();
            }

            if (nEvents & FD_WRITE)
            {
                  m_bConnected = true;
                  OnWrite();
            }

            if (nEvents & FD_OOB)
            {
                  TRACE("\nSOCKET OOB EVENT");
                  OnOob();
            }

            if (nEvents & FD_ACCEPT)
            {
                  TRACE("\nSOCKET ACCEPT EVENT");
                  OnAccept();
            }

            if (nEvents & FD_CONNECT)
            {
                  TRACE("\nSOCKET CONNECT EVENT");
                  m_bConnected = true;
                  OnConnect();
            }

            if (nEvents & FD_CLOSE)
            {
                  TRACE("\nSOCKET CLOSE EVENT");
                  OnClose();
            }

            return;
      }
      
      ProcessErrors();
}

Now for a server Socket:
bool CServerSocket::Listen()
{
      SOCKADDR_IN sAddress = {0};
      sAddress.sin_family = AF_INET;
      sAddress.sin_port = htons(m_nPort);
      sAddress.sin_addr.S_un.S_addr = ADDR_ANY;

      if (!bind(m_hSocket, (SOCKADDR*)&sAddress, sizeof(sAddress)))
      {
            if (!listen(m_hSocket, SOMAXCONN))
                  return true;
      }

      TRACE("\nSOCKET ERROR : %d", WSAGetLastError());

      return false;
}

void CServerSocket::OnOpen()
{
      m_bCloseEventThread = true;
      WaitForEventThread();
      m_pEventThread = NULL;

      Listen();
}

void CServerSocket::OnAccept()
{
      if (m_bDatagram)
            return;

      Purge();
      
      SOCKADDR_IN sAddress = {0};
      int nLength = sizeof(sAddress);
      
      SOCKET hClient = accept(m_hSocket, (SOCKADDR*)&sAddress, &nLength);

      if (hClient != INVALID_SOCKET)
      {
            CDataSocket *pSocket = new CDataSocket;

            if (pSocket)
            {
                  pSocket->Attach(hClient);
                  pSocket->SetEventWnd(m_hEventWnd);

                  m_cConnections.Add(pSocket);

                  return;
            }

            closesocket(hClient);
      }

      TRACE("\nSOCKET ERROR : %d", WSAGetLastError());
}

UINT CServerSocket::EventThread(LPVOID pParam)
{
      CServerSocket *pSocket = (CServerSocket*)pParam;

      pSocket->m_bCloseEventThread = false;

      for (;;)
      {
            if (pSocket->m_bCloseEventThread)
                  break;

            if (!pSocket->ProcessEvents())
                  Sleep(TCPIPSOCKET_LOOPTIME);
      }

      return 0;
}

bool CServerSocket::ProcessEvents()
{
      if (m_hSocket==INVALID_SOCKET)
            return false;

      bool bProcessed = false;

      if (EventsOccured())
      {
            CTcpIpSocket::ProcessEvents();
            bProcessed = true;
      }

      int nConnectionCount = GetClientCount();

      if (nConnectionCount > 0)
            Purge();

      for (int nI=1; nI<=nConnectionCount; nI++)
      {
            CDataSocket *pSocket = GetClient(nI);
            if (pSocket && pSocket->EventsOccured())
            {      
                  pSocket->ProcessEvents();
                  bProcessed = true;
            }
      }

      return bProcessed;
}

Now for a client socket:
bool CDataSocket::Connect(CString strIpAddress, int nPort)
{
      m_bConnected = false;

      SOCKADDR_IN sAddress = {0};
      sAddress.sin_family = AF_INET;
      sAddress.sin_port = htons(nPort);

      if (strIpAddress.Find('.') == -1)
      {
            // Try to translate this address to a ip address
            HOSTENT *pHost = gethostbyname(strIpAddress);
            if (pHost)
                  strIpAddress.Format("%u.%u.%u.%u", (unsigned char)pHost->h_addr_list[0][0],
                                                                     (unsigned char)pHost->h_addr_list[0][1],
                                                                     (unsigned char)pHost->h_addr_list[0][2],
                                                                     (unsigned char)pHost->h_addr_list[0][3]);
      }
      sAddress.sin_addr.S_un.S_addr = inet_addr(strIpAddress);

      if (!connect(m_hSocket, (SOCKADDR*)&sAddress, sizeof(sAddress)))
            return true;

      int nError = WSAGetLastError();
      if (nError == WSAEWOULDBLOCK)
      {
            return WaitForConnect();
      } else {
            ProcessErrors(nError);
      }

      return false;
}

void CDataSocket::OnRead()
{
      unsigned long ulBytes = 0;

      ioctlsocket(m_hSocket, FIONREAD, &ulBytes);

      if (ulBytes > 0)
      {
            Receive(ulBytes);
      }
}

bool CDataSocket::Write(void *pData, long nSize, bool bAbortOnBlock)
{
      if (m_hSocket == INVALID_SOCKET)
            return false;

      return WriteFragmented(pData, nSize, bAbortOnBlock);
}

bool CDataSocket::WriteFragmented(void *pData, long lMessageSize, bool bAbortOnBlock)
{
      int nError = 0;
      bool bSuccess = true;

      /*
            Write the header and the first data packet
      */
      unsigned char *pWriteBuffer = new unsigned char[m_lHeaderSize + lMessageSize + 1];
      if (pWriteBuffer)
      {
            m_sHeader.lMessageSize = lMessageSize;

            memcpy(pWriteBuffer, &m_sHeader, m_lHeaderSize);
            memcpy(&pWriteBuffer[m_lHeaderSize], pData, lMessageSize);

            int nBytesWritten = 0,
                   nBytesWrittenTotal = 0,
                   nTotalMessageSize = lMessageSize + m_lHeaderSize,
                   nBytesToWrite = min(m_lMaxMessageSize, nTotalMessageSize);

            do
            {
                  nBytesWritten = send(m_hSocket, (char*)&pWriteBuffer[nBytesWrittenTotal], nBytesToWrite, 0);

                  if (nBytesWritten == SOCKET_ERROR)
                  {
                        nError = WSAGetLastError();

                        if (nError == WSAEWOULDBLOCK)
                        {
                              /*
                                    Only abort if nothing has been written to the socket. Once anything
                                    of the message has been written, complete the write process,
                                    even if it will block
                              */
                              if (bAbortOnBlock && nBytesWrittenTotal==0)
                              {
                                    TRACE("\nDATASOCKET WOULD BLOCK");
                                    break;
                              }

                              Sleep(TCPIPSOCKET_LOOPTIME);

                              continue;
                        } else {
                              /*
                                    Now we have a problem, the header and some data is sent,
                                    but we can't write to the socket, WHAT NOW?
                              */
                              ProcessErrors(nError);

                              bSuccess = false;
                              break;
                        }
                  } else {
                        nBytesWrittenTotal += nBytesWritten;
                        nBytesToWrite = min(m_lMaxMessageSize, nTotalMessageSize-nBytesWrittenTotal);
                  }
            } while (nBytesWrittenTotal < nTotalMessageSize);

            delete [] pWriteBuffer;
      }

      return bSuccess;
}

bool CDataSocket::Receive(unsigned long ulBytes)
{
      bool bSuccess = false;

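      // NOTE: function-local statics are shared by every CDataSocket instance,
      // so this parsing state is only safe while a single connection is active.
      static long lMessageBytes = 0, lMessagePosition = 0;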

      unsigned char *pData = new unsigned char[ulBytes+1];
      if (pData)
      {
            int nReceived = recv(m_hSocket, (char*)pData, ulBytes, 0);

            if (nReceived==SOCKET_ERROR)
            {
                  TRACE("\nSOCKET ERROR : %d", WSAGetLastError());                  
            } else {
                  if (nReceived > 0)
                  {
                        if (AppendBuffer(pData, nReceived))
                        {
                              if (m_lMessageSize < 1 || lMessagePosition < 1)
                              {
                                    lMessagePosition = 0;

                                    long lHeaderPos = FindHeader();
                                    if (lHeaderPos > 0)
                                    {
                                          sMessageHeader *pHeader = (sMessageHeader*)&m_pDataBuffer[lHeaderPos-1];
                                          m_lMessageSize = pHeader->lMessageSize;

                                          if (m_lMessageSize > 0)
                                                lMessagePosition = (lHeaderPos-1) + m_lHeaderSize;
                                    }
                              }
                        }

                        while (m_lDataSize-lMessagePosition>=m_lMessageSize && lMessagePosition>0)
                        {
                              SaveMessage(lMessagePosition);
                              TrimBuffer(lMessagePosition+m_lMessageSize);

                              if (m_hEventWnd)
                                    PostMessage(m_hEventWnd, TCPIPSOCKET_EVENTMESSAGE, 0, (LPARAM)this);

                              long lHeaderPos = FindHeader();
                              if (lHeaderPos > 0)
                              {
                                    sMessageHeader *pHeader = (sMessageHeader*)&m_pDataBuffer[lHeaderPos-1];
                                    m_lMessageSize = pHeader->lMessageSize;

                                    if (m_lMessageSize > 0)
                                          lMessagePosition = (lHeaderPos-1) + m_lHeaderSize;
                              } else {
                                    lMessagePosition = 0;
                                    m_lMessageSize = 0;
                              }
                        }

                        bSuccess = true;
                  }
            }
      }

      delete [] pData;   // deleting a null pointer is a no-op, so no check is needed
      
      if (!bSuccess)
            recv(m_hSocket, NULL, 0, 0);
      
      return bSuccess;
}

You will notice that the WriteFragmented function implements the layer that I mentioned previously. It sends a header that contains the actual size of the data to follow. It then sends the data in chunks of up to 128KB. At the receiving end the data is then reassembled.
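
For readers who want the idea without the Winsock plumbing, here is a minimal sketch of such a length-prefixed framing layer in portable C++. It is not the code from the classes above: the names `FrameMessage` and `FrameReassembler` are hypothetical, the header is reduced to a bare 4-byte size, and the preamble/postamble markers are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: prefix a payload with a 4-byte length header.
// The header is copied in host byte order, which is only safe when both
// peers share the same endianness (true when both run on one machine).
std::vector<std::uint8_t> FrameMessage(const std::string& payload)
{
    assert(payload.size() <= UINT32_MAX);
    const std::uint32_t size = static_cast<std::uint32_t>(payload.size());

    std::vector<std::uint8_t> frame(sizeof(size) + payload.size());
    std::memcpy(frame.data(), &size, sizeof(size));
    if (!payload.empty())
        std::memcpy(frame.data() + sizeof(size), payload.data(), payload.size());
    return frame;
}

// Hypothetical receiver side: accumulates whatever each recv() call
// returned and hands back every message that is complete so far.
class FrameReassembler
{
public:
    std::vector<std::string> Append(const std::uint8_t* data, std::size_t count)
    {
        m_buffer.insert(m_buffer.end(), data, data + count);

        std::vector<std::string> messages;
        std::size_t pos = 0;
        for (;;)
        {
            std::uint32_t size = 0;
            if (m_buffer.size() - pos < sizeof(size))
                break;                          // header not complete yet
            std::memcpy(&size, m_buffer.data() + pos, sizeof(size));
            if (m_buffer.size() - pos - sizeof(size) < size)
                break;                          // payload not complete yet

            const char* begin =
                reinterpret_cast<const char*>(m_buffer.data() + pos + sizeof(size));
            messages.emplace_back(begin, size);
            pos += sizeof(size) + size;
        }

        // Drop the bytes that were consumed; keep any partial message.
        m_buffer.erase(m_buffer.begin(),
                       m_buffer.begin() + static_cast<std::ptrdiff_t>(pos));
        return messages;
    }

private:
    std::vector<std::uint8_t> m_buffer;         // received but not yet consumed
};
```

The reassembler keeps its state in a member rather than in a static variable, so each connection can own its own instance. Feeding it a stream in arbitrary chunk sizes should yield the original messages in order.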
 

Author Comment

by:OD
I have removed the whole layer to test its performance penalty, and the throughput improved to 41MB/s. So for what I am trying to achieve with the layer, the 2MB/s penalty is acceptable.

OD

 
LVL 32

Expert Comment

by:jhance
Well, I'm not sure what performance is theoretically possible via TCP/IP without a physical network connection, but 39MB/s may be approaching it.  That's more than 300 Mbit/s and is far faster than it needs to be to support either 10BaseT or 100BaseT network connections.  I suppose it's slow for Gigabit Ethernet networks, but there may be other factors involved here.

Have you looked into things like MTU and window sizes?  I think you are pressing the limits of what TCP/IP is capable of, and in these cases tuning may be required to extract the full potential of the system.
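
The unit arithmetic behind these figures can be checked with a small sketch. The function names are hypothetical, and decimal units are assumed throughout (1 MB = 1,000,000 bytes, 1 Mbit = 1,000,000 bits).

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Throughput in decimal megabytes per second for a measured transfer.
double ThroughputMBps(std::uint64_t bytes, std::chrono::duration<double> elapsed)
{
    assert(elapsed.count() > 0.0);
    return static_cast<double>(bytes) / 1e6 / elapsed.count();
}

// Megabytes per second to megabits per second (8 bits per byte).
double MBpsToMbitps(double megabytesPerSecond)
{
    return megabytesPerSecond * 8.0;
}

// Megabits per second to megabytes per second.
double MbitpsToMBps(double megabitsPerSecond)
{
    return megabitsPerSecond / 8.0;
}
```

With these definitions, 39 MB/s corresponds to 312 Mbit/s, and a 1000 Mbit/s link corresponds to at most 125 MB/s before any protocol overhead.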
 
LVL 10

Expert Comment

by:makerp
I have always noticed that when using sockets where both client and server are on the same machine, the CPU always goes flat out during transfers. This is because both sides of the protocol are executing on the same machine. Also, I think that from an OS point of view there is probably a special case for when client and server are on the same machine, regardless of the IP address used to establish comms, this special case being bypassing the network card completely.

I agree with jhance on the performance thing; TCP is a heavy protocol. If you tried using UDP then you would probably see a dramatic speed-up, although you would have to deal with packet loss, ordering, etc.

Paul
 

Author Comment

by:OD
The question now is how the hell we are going to achieve 1 gigabit performance from any socket class if the limit is reached somewhere in the 40MB/s vicinity vs. the theoretical 125MB/s of gigabit?
 
LVL 32

Expert Comment

by:jhance
That was not your question....

Have you tried a GigaBit LAN card?  I would hope that it would come with drivers which are capable of 1Gb/s performance...

You gave no indication that you were using such a device.
 

Expert Comment

by:ahmadrazakhan
Throughput of a TCP connection is decreased by the WSAEWOULDBLOCK error, which often occurs because of small buffers. Avoiding this error can enhance data throughput. Say you have to send 4 megabytes of data: make your send and receive buffers 10 times bigger, 40MB, which can be set using setsockopt with SO_SNDBUF and SO_RCVBUF. See the results, then conduct the same test using the default buffers and compare the difference.
Also, if you have to send data on the same machine you can bypass the TCP stack; this may be possible with Winsock Direct. For more info about it you have to consult Microsoft.
 

Author Comment

by:OD
jhance:
Sorry about the confusion, my question was from a theoretical point of view...

ahmadrazakhan:
I have tried to gather as much information regarding this topic as possible. It actually seems that the send and receive buffer sizes should be set according to the round trip times for packets on a network (typically the times reported by PING).

In my initial scenario I set the send and receive buffers to 25 * the expected message size. Using the round trip calculation I came to the conclusion that my send/receive buffers should be in the region of about 128KB on a 100 Mbit/s LAN.

I implemented the new smaller buffers and it did make a marginal improvement in the throughput. I am now just above 40MB/s throughput including all my overhead.
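
The round trip calculation referred to here is usually called the bandwidth-delay product: link speed multiplied by round trip time gives the number of bytes that can be in flight at once, which is a common starting point for buffer sizes. A minimal sketch follows. The function name is hypothetical, and the 10 ms round trip time used in the example afterwards is an assumption chosen for illustration, not a figure from this thread.

```cpp
#include <cassert>
#include <cstdint>

// Bandwidth-delay product: how many bytes fit "on the wire" during one
// round trip on a link of the given speed.
std::uint64_t BandwidthDelayProductBytes(std::uint64_t bitsPerSecond,
                                         std::uint64_t roundTripMicroseconds)
{
    // Guard against overflow of the intermediate product.
    assert(roundTripMicroseconds == 0 ||
           bitsPerSecond <= UINT64_MAX / roundTripMicroseconds);

    // bits/s * microseconds = bits * 1e6; divide by 8 for bytes, 1e6 for seconds.
    return bitsPerSecond * roundTripMicroseconds / 8 / 1000000;
}
```

For a 100 Mbit/s link and the assumed 10 ms round trip, `BandwidthDelayProductBytes(100000000, 10000)` gives 125,000 bytes, which is in the same region as the 128KB (131,072 byte) buffers mentioned above.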