Solved

Getting errno=4 (interrupted system call), and an EPIPE signal after the code runs for a few hours

Posted on 2004-09-12
Last Modified: 2013-12-26
Can someone please help me with a few things?

1.   When this code runs, I send data from the client to this server code.  I successfully send and get back data. But as soon as the signal/alarm fires once (right after read_request), I get errno=4, interrupted system call; then when I try to write on the socket again I get "bad file number" on the socket.     If I get rid of the signal/alarm and go into an infinite loop after read_request (thus never returning to main), the writes work.  Is it because I am closing the socket when I return? I've tried commenting that out and get the same issue.   Is the errno=4, interrupted system call, happening because of the signal/alarm interrupt?

I'm trying to control the checkforchanges function by using the timer. When I receive any new request on the socket, I want to turn the timer off; that is why you see update_timer(0) and then update_timer(1) when I go into read_request again.

2. When I run this code for a few hours, I end up receiving an EPIPE signal.  In my code I close the socket if I get this and wait for another request.  What causes this signal to occur, and should I handle it differently than by closing the socket?   When this occurs, I can tell the client to resume, and it takes off again, but that's not too robust for something that needs to run 24/7.   I think I will write code on the client so that if I don't get back data for x seconds, I will re-request the data.  I have a piece of data that changes every second to be used as a watchdog, so it should always be changing.

3. How do I know that the client socket has closed?   Should I look for errors on the "write" statements to determine this?  I want to stop looking for and sending updates and go into wait mode if I close the client.

Lots of questions, sorry for so long a post.

Bob








#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include "/wdpf/rel/ssw/shc/inc/SHC_err.h"
#include "/wdpf/rel/ssw/shc/inc/spd.h"
#include "pointServer.h"
#include <signal.h>
#include <unistd.h>


#define TRUE 1
#define FALSE 0
#define anaDB  .01
#define MAXPOINTS 8000
#define RECBUFFSIZE  400000
#define SENDBUFFSIZE 400000
#define MAXRECORDSENDS 300

/* signal stuff */
void update_timer(int);
void cleanup();

#define SCANTIME  5


char debug_msg = FALSE;
int PORT;

void processRead(void);

typedef struct datarec {
       char pn[9];
    int rt;
    unsigned short digital_val;
    unsigned short old_digital_val;
      unsigned short digital_stat;
      unsigned short old_digital_stat;
      unsigned short gp_val;
      unsigned short old_gp_val;
      unsigned short gp_stat;
      unsigned short old_gp_stat;
      unsigned short gp_force_stat;
      unsigned short old_gp_force_stat;
      float av;
    float old_av;      
    unsigned short as;
      unsigned short old_as;
    long sid; } DATAREC;



/* struct datarec *indatap; */
struct datarec indatap[MAXPOINTS];


int numpoints;

FILE *data_lun;
int access_type = RUNTIME;
static char *wdpf_pdir;
long gp_sid = 0;
char gp_bit_num = 0;
char ina_flag;
char ext_flag;
char outbuff[SENDBUFFSIZE];   /* need to dynamically do this! later*/
char inbuff[RECBUFFSIZE];
int file_desc;
int status;
int msgsock;
Point_entry point_info;
char network_name[8];
unsigned char byte_val;
int optlen;
long sockstat;

main(int argc, char *argv[])
{

int sock,clientLen,serverLen;
struct sockaddr_in serverAddr, clientAddr;
int i;
int on=1;   /* SO_REUSEADDR expects an int option value */
int tcpsendbuff, tcprecbuff;
char ch;

if(debug_msg)
printf("\nprogram started...");


/** get program command flags **/
/** first argument is port number **/
if(argc < 2) {
  printf("\nincorrect calling parameters;  rt portID <d>  <d> = debug mode");
  exit(0); }

/** see if want to go into debug mode **/
while(--argc >0)
 if(strchr(argv[argc],'d')) {
   debug_msg = TRUE;
   printf("\nentering debug message mode\n"); }

/** turn off update interupt **/
update_timer(0);


/** get port number from command line **/
PORT = strtol(argv[1],NULL,10);

/** set up WDPF stuff **/
wdpf_pdir = (char *) getenv("WDPF_PDIR");
if(wdpf_pdir == NULL) {
  printf("\nunable to get WDPF_PDIR environment variable");
  exit(1); }

status = SHC_open_memory();
if(status != SHC_OK) {
  printf("\nSHC_open_memory error -- status = %d",status);
  exit(1); }

file_desc = SPD_open_file(wdpf_pdir,&access_type);
if(file_desc < 0) {
  printf("\nSPD_open_file error -- %s status = %d",wdpf_pdir,file_desc);
  exit(1); }



/******************** set up socket interface with XP  ***********************/


/** setup stream socket **/
if((sock = socket(AF_INET,SOCK_STREAM,0)) < 0) {
    printf("\ncannot open stream socket");
    exit(1); }

if(debug_msg)
  printf("\ncreated sock, sock = %d..",sock);

/** set socket options **/
if(setsockopt(sock,SOL_SOCKET,SO_REUSEADDR,&on,sizeof(on)) < 0) {
    printf("\ncannot set SO_REUSEADDR socket option");
   /* exit(1);*/ }
   

tcpsendbuff = 64384;
tcprecbuff = 64384;


if(setsockopt(sock,SOL_SOCKET,SO_SNDBUF, (char *) &tcpsendbuff, sizeof(tcpsendbuff)) < 0) {
    printf("\ncannot set SO_SNDBUF socket option");
    exit(1); }
 
if(setsockopt(sock,SOL_SOCKET,SO_RCVBUF, (char *) &tcprecbuff, sizeof(tcprecbuff)) < 0) {
    printf("\ncannot set SO_RCVBUF socket option");
    exit(1); }

optlen =sizeof(tcpsendbuff);
if(getsockopt(sock,SOL_SOCKET,SO_SNDBUF, (char *) &tcpsendbuff, &optlen) < 0) {
    printf("\ncannot get SO_SNDBUF socket option");
    exit(1); }
if(debug_msg)
 printf("\nnew SO_SNDBUF len = %d",tcpsendbuff);


optlen =sizeof(tcprecbuff);
if(getsockopt(sock,SOL_SOCKET,SO_RCVBUF, (char *) &tcprecbuff, &optlen) < 0) {
    printf("\ncannot get SO_RCVBUF socket option");
    exit(1); }
if(debug_msg)
 printf("\nnew SO_RCVBUF len = %d",tcprecbuff);


serverAddr.sin_family = AF_INET;
serverAddr.sin_addr.s_addr = htonl(INADDR_ANY);
serverAddr.sin_port=htons(PORT);

if(bind(sock,(struct sockaddr *) &serverAddr,sizeof(serverAddr)) < 0) {
   printf("\ncannot bind stream socket error, port = %d",PORT);
   exit(1); }

if(debug_msg)
  printf("\nbind socket ok...");

serverLen = sizeof(serverAddr);

if(getsockname(sock,(struct sockaddr *) &serverAddr,&serverLen) <0) {
   printf("getting socket name");
   exit(1); }

if(debug_msg)
  printf("\ngetsocketname ok...");

listen(sock,1);

if(debug_msg)
  printf("\nlistening for requests on socket %d...",PORT);

/* turn off alarming up front  */
signal(SIGINT, cleanup);
signal(SIGPIPE, SIG_IGN);



for(;;) {

do {
  clientLen= sizeof(clientAddr);
  msgsock = accept(sock, (struct sockaddr *) &clientAddr, &clientLen);
if(debug_msg)  
printf("\nmsgsock=%d errno = %d errstr = %s",msgsock,errno,strerror(errno));
}while (msgsock <0 && (errno == EINTR || errno == EPIPE) );



/* a failed accept is fatal whether or not debug output is on */
if(msgsock < 0) {
 if(debug_msg)
    printf("\nsock accept error");
 exit (-1); }
 

if(debug_msg)
  printf("\naccepted socket...");
i=readn(msgsock,inbuff, sizeof(inbuff));

if(debug_msg) {
 printf("\nreceived %d bytes",i);
 printf("\nstring = %s",inbuff); }


read_request();
close(msgsock);


}/* end for loop */

if(debug_msg)
printf("\nexiting do loop errno = %s",strerror(errno));
sockstat = close(msgsock);
exit(0);



}

/*************************** end main **********************************/



/********************************************************************************************************/
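/** this function reads from the socket until the "^^" terminator arrives, **/
/** the peer closes the connection, or the buffer is full **/
int
readn(fd,ptr,nbytes)
register int fd;
register char *ptr;
register int nbytes;
{
int nleft,nread;
char *start = ptr;

nleft = nbytes - 1;    /* leave room for the terminating '\0' */

while(nleft > 0) {

nread = read(fd,ptr,nleft);

if(nread < 0) {
   if(errno == EINTR)
      continue;        /* interrupted by a signal (e.g. SIGALRM), retry */
   return(nread);
}
else if (nread == 0)
   break;              /* peer closed the connection */

nleft -= nread;
ptr   += nread;
*ptr ='\0';

if(debug_msg && ptr - start >= 2)
 printf("\nterminator = %s", ptr-2);

/* stop once the message terminator "^^" has arrived */
if(ptr - start >= 2 && !strncmp(ptr-2,"^^",2))
  break;

}

/* total number of bytes received */
return((int)(ptr - start));

}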


/********************************************************************************************************/
/* used to write "n" bytes */
int
writen(fd,ptr,nbytes)
register int fd;
register char *ptr;
register int nbytes;
{

int nleft, nwritten;

nleft = nbytes;

while(nleft >0) {
 
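nwritten = write(fd,ptr,nleft);

if(nwritten < 0 && errno == EINTR)
    continue;   /* interrupted by a signal before any data was written, retry */
if(nwritten <=0)
    return(nwritten);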
  nleft -= nwritten;
 ptr   += nwritten;
}

return(nbytes - nleft);
}
/********************************************************************************************************/




/********** read inputs points ******/
read_request(void)
{

int i,j;
char cmdType[3];
int cmdNum;
int status;
int count;
char strcount[5];
char field_name[3];   /* two characters plus the terminating '\0' */


strncpy(cmdType,inbuff+0,2);
cmdType[2]='\0';      /* terminate before atoi reads the string */
cmdNum = atoi(cmdType);



/* whole point records */
if(cmdNum==0) {


update_timer(0);
 
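strncpy(strcount,inbuff+2,4);
strcount[4]='\0';
count = atoi(strcount);
/* indatap[] is indexed from 1, so at most MAXPOINTS-1 points fit */
if(count < 0 || count > MAXPOINTS-1)
  count = (count < 0) ? 0 : MAXPOINTS-1;
numpoints = count;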


if(debug_msg)
 printf("\nrequested to retreive %d points",count);

/* allocate memory based on number of points found */
/*indatap = malloc(count * sizeof(datarec)); */
if(debug_msg)
  printf("\nallocated %d memory structures for points",count);


j=6;
for (i = 1;i <=count; i++)
 {
strncpy(indatap[i].pn, inbuff+j,8);
j=j+8;
indatap[i].pn[8]='\0';

/* need to add error checking here later*/
status = SPD_get_sid(&file_desc,indatap[i].pn,&indatap[i].sid,&gp_sid,&gp_bit_num,&ext_flag,&ina_flag);


status = SPD_get_point_info(&file_desc,&indatap[i].sid,&point_info,network_name);
indatap[i].rt = point_info.rec_type;

if(debug_msg)
  printf("\nrt = %d",indatap[i].rt);

if(debug_msg)
  printf("\npopulated dynamic structure for %s (index = %d) stat = %d",indatap[i].pn,i,status);
}

/* go and get whole point records if cmd type = 00*/

getpointrecords();

update_timer(1);

} /* end whole point record */


if(cmdNum== 2) {

update_timer(0);

strcpy(field_name,"CM");
byte_val = 14;

printf("\nnum points = %d",numpoints);
for(i=1;i<=numpoints;i++)  {
status = SHC_change_byte_attribute(&indatap[i].sid,&field_name, &byte_val);
if(debug_msg)
  printf("\nwrote ack bit for pn = %s, stat = %d", indatap[i].pn,status);
 }

update_timer(1);
} /* end write ack bit */



}


int getpointrecords()
{

int i;
int k;
unsigned char point_record[300];
char hh[3];   /* two hex digits plus the terminating '\0' */
char hdrStr[6];
int status;
int bytesSent;
long retMsgCount;
char tempbuff[300];
int sendCounter;

/* build ret msg string in format of [cmd][index]data^^  */  
outbuff[0]='\0';
hdrStr[0]='\0';
sendCounter = 0;

/* cmdtype = 00 for whole point record ret */
sprintf(hdrStr,"%2s","00");  
hdrStr[2]='\0';
strcpy(outbuff,hdrStr);

if(debug_msg)
printf("\nnumpoints = %d",numpoints);

for (i = 1;i <= numpoints; i++)

 {

   strcat(outbuff,"(");
   sprintf(hdrStr,"%04d",i);  
   hdrStr[4]='\0';
   strcat(outbuff,hdrStr);

   tempbuff[0]='\0';
   status = SHC_get_point_record(&indatap[i].sid, &point_record);
   if(debug_msg)
    printf("\nretrieved record for %s (index = %d)",indatap[i].pn,i);

    for(k=0;k<128;k++) {
     sprintf(hh,"%02x",point_record[k]);
     strcat(tempbuff,hh);}
   
   strcat(outbuff,tempbuff);
   strcat(outbuff,")");

   if(sendCounter >= MAXRECORDSENDS) {
       outbuff[strlen(outbuff)] ='\0';
       bytesSent=writen(msgsock,(char *)outbuff,strlen(outbuff));
      if(debug_msg  )
         printf("\ngoing to send back %d bytes",(int)strlen(outbuff));
          /*  printf("\nstring = \n%s",outbuff);  */
           outbuff[0]='\0';
         sendCounter = 0; }
   else
         sendCounter = sendCounter +1;
   
}  

/* tag ending message info */
    strcat(outbuff,"^^");
    retMsgCount = strlen(outbuff);
    outbuff[retMsgCount]='\0';

   /* send back the localbuff up to retMsgCount of len */
   if(debug_msg  ) {
     printf("\ngoing to send back %ld bytes",retMsgCount);
     /* printf("\nstring = \n%s",outbuff); */ }

   bytesSent=writen(msgsock,(char *)outbuff,retMsgCount);
   if(debug_msg)
     printf("\nwriten errno = %d, %s sent %d bytes over network",errno,strerror(errno),bytesSent);


init_struct();


}


checkforchanges()
{
int i;
char *modestr;
int status;
int bytesSent;
char tmpstr[20];


/* exceptions sent as CMD(PN_TOKEN=VALUE), ... */
/* clear out buffer */


/* don't let another timer hit if already in here */

/* while(1) {    */


outbuff[0]='\0';
strcpy(outbuff,"01");

for (i=1;i<=numpoints;i++)
  {

modestr="";

  switch(indatap[i].rt) {

  case RECORD_TYPE_AI:
  case RECORD_TYPE_AL:
  case RECORD_TYPE_AC:
  case RECORD_TYPE_AM:
  modestr="Analog";
  status = SHC_get_analog_val_stat(&indatap[i].sid,&indatap[i].av,&indatap[i].as);



if(indatap[i].old_as != indatap[i].as) {
      strcat(outbuff,"(");
      strcat(outbuff,indatap[i].pn);
      strcat(outbuff,"_AS=");
        sprintf(tmpstr,"%d",indatap[i].as);
        strcat(outbuff,tmpstr);
      strcat(outbuff,")");

 }


if(debug_msg &&  indatap[i].old_as != indatap[i].as)  
     printf("\nold as of %s = %d, new as val = %d",indatap[i].pn, indatap[i].old_as,indatap[i].as);

indatap[i].old_as = indatap[i].as;

  break;

  case RECORD_TYPE_DI:
  case RECORD_TYPE_DL:
  case RECORD_TYPE_DC:
  case RECORD_TYPE_DM:
  modestr="Digital";
  status = SHC_get_digital_val_stat(&indatap[i].sid,&gp_sid,&gp_bit_num,&indatap[i].digital_val,&indatap[i].digital_stat);


/* for testing only */
indatap[1].old_digital_stat = 99;

if(indatap[i].digital_stat != indatap[i].old_digital_stat && status == SHC_OK) {

      strcat(outbuff,"(");
       strcat(outbuff,indatap[i].pn);
      strcat(outbuff,"_DS=");
        sprintf(tmpstr,"%d",indatap[i].digital_stat & 0x01);
        strcat(outbuff,tmpstr);
      strcat(outbuff,")");

}


if(debug_msg && indatap[i].digital_stat != indatap[i].old_digital_stat)
          printf("\nold ds of %s = %d, new ds = %d",indatap[i].pn, indatap[i].old_digital_stat,indatap[i].digital_stat);

      
indatap[i].old_digital_stat = indatap[i].digital_stat;

 
break;


  break; }  /* end switch */


 }   /* next point, end for*/

 strcat(outbuff,"^^");
 outbuff[strlen(outbuff)]='\0';

/* only send if string is greater than the default 01^^ */
bytesSent = 0;
if(strlen(outbuff) > 4) {
 bytesSent=writen(msgsock,(char *)outbuff,strlen(outbuff));
   if(debug_msg)
     printf("\nwriten errno = %d, %s, sent %d bytes over network",errno,strerror(errno),bytesSent);
   }

/* errno is only meaningful when the write actually failed */
if(bytesSent < 0 && errno == EPIPE) {
if(debug_msg)  
  printf("\nreceived EPIPE error, closing socket, waiting...");
 close(msgsock);
 errno=0;
 return; }



/*  sleep(SCANTIME); */

/*     }     while true */

}


 
void update_timer(int signum)
{
if(signum == 0)
{ alarm(0);
 return;
}

if(debug_msg)
  printf("\nchecking for exceptions");
checkforchanges();
signal(SIGALRM,update_timer);
alarm(SCANTIME);  
}  

init_struct()
{

int i;

for(i=1;i<=numpoints;i++) {

        indatap[i].digital_val=0;
        indatap[i].old_digital_val=9;
      indatap[i].digital_stat=0;
      indatap[i].old_digital_stat=9;
      indatap[i].gp_val=0;
      indatap[i].old_gp_val=9;
      indatap[i].gp_stat=0;
      indatap[i].old_gp_stat=9;
      indatap[i].gp_force_stat=0;
      indatap[i].old_gp_force_stat=9;
      indatap[i].av=0.00;
        indatap[i].old_av=99999.00;      

}
}



void cleanup(int sig_num)
{
 signal(SIGINT,cleanup);
 printf("\nexit gracefully\n\n\n...");
 fflush(stdout);
 sockstat = close(msgsock);  
exit(0);
}
Question by:rkneal
3 Comments
 
LVL 51

Expert Comment

by:ahoffmann
ID: 12039689
signal(SIGINT,cleanup);
in your cleanup() function should be the last command

Also, why do you reinstall the signal handler when you exit anyway?
LVL 45

Accepted Solution

by:
sunnycoder earned 500 total points
ID: 12042425
First, I would recommend using the sigaction interface rather than signal.

>But, as soon as the signal/alarm fire off once (right after the read_request), i get a errno=4, interupted system call,
When you are reading from or writing to a socket (or blocked waiting for one of these events), a signal will abort the call and transfer control to the signal handler ... hence the error ... The remedy for this is

num = recv(..);

if ( num < 0 && (errno == EINTR || errno == EAGAIN) )
      go back to recv ... you can use a do while loop if you wish to avoid a goto
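
Written out as a do-while loop, the retry might look like the sketch below. read_retry is a made-up name, and the sketch assumes a blocking socket, so it retries on EINTR only. On a non-blocking socket, EAGAIN would need a wait on select() or poll() rather than an immediate retry.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Like read(), but restarts the call when a signal interrupts it.
   Hypothetical helper, not part of the posted code. */
ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;

    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);

    return n;   /* >0: bytes read, 0: peer closed, <0: real error */
}
```

The same pattern applies to accept(), recv() and write(): check for -1 with errno == EINTR and call again.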

> then when i try to write on the socket again i get "bad file number" on socket.
On SIGINT, you call cleanup, which closes the socket, and hence the error. You have identified that correctly
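
One more thing that may be worth ruling out, judging from the posted main(): read_request() re-arms the alarm with update_timer(1), and main() then calls close(msgsock) right away, so a later timer tick could end up writing to a descriptor that is already closed. A minimal sketch of closing a connection with the timer disarmed first, assuming a hypothetical helper close_connection() that is not in the original code:

```c
#include <unistd.h>

/* Hypothetical helper: stop the periodic timer, then close the client
   socket and mark it invalid so nothing writes to a stale descriptor. */
void close_connection(int *fd)
{
    alarm(0);            /* cancel any pending SIGALRM first */
    if (*fd >= 0) {
        close(*fd);
        *fd = -1;        /* later code can test for fd < 0 before writing */
    }
}
```

Called as close_connection(&msgsock), this would replace the bare close(msgsock) calls; whether the connection should be closed after every request at all depends on how the client is meant to receive the timed updates.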

> i've tried commenting that out and get same issue.
Check again ... This sounds highly improbable


>2. When i run this code for a few hours, i end up getting EPIPE recieved signal.  In my code i close socket if i get this and wait for
>another request.  What causes this signal to occur,
The other communicating party has closed its socket. In short, the connection is broken. It could be due to an application crash or exit on the other machine, or a fault in the communication link.

>and should i handle it different than closing socket.  
No ... you should close the socket

>When this occurs, i can tell the client to resume, and it takes off again, but that's not too robust for something that needs to run
>24/7.  
It is ... You cannot control all kinds of failures including communication link failure ... Robustness comes from being able to deal with such errors and not trying to prevent them :)

>I think i will write code on the client, that if i don't get back data for x amount of seconds, i will re-request data.  I have a piece of
>data that changes every second to be used as a watchdog, so it should always be changing.
If you are using TCP sockets (SOCK_STREAM), then data is not silently lost ... TCP detects lost or corrupted segments and retransmits them for you. I am not familiar with the nature of your application, so I really cannot comment on that. But if the server detects a change in its data and transmits it, then there should be no reason for the client to request it


>3. How do i know that the client socket has closed?   Should i look for errors on the "write" statements to determine this.  I want to
>stop looking and sending updates and go into wait mode if i close client.
EPIPE :) .. all socket communication will fail
LVL 4

Expert Comment

by:pankajtiwary
ID: 12042558
How do you know that the connection has been closed?

If you try to read from a connection that has been closed, read() returns 0 (or -1 on an error such as a connection reset), and if you try to write to a connection that no longer exists, you get an EPIPE error.
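
Both checks can be sketched roughly as follows. The function names are invented for illustration, and the sketch assumes SIGPIPE is ignored (as the posted code already does with signal(SIGPIPE, SIG_IGN)), so that a failed write returns -1 with errno set to EPIPE instead of killing the process.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Ignore SIGPIPE so that writing to a closed peer returns -1/EPIPE. */
void ignore_sigpipe(void)
{
    signal(SIGPIPE, SIG_IGN);
}

/* Returns 1 if read() reports end-of-file (peer closed), 0 if data
   arrived, -1 on any other error. Blocks until one of these happens. */
int peer_closed_on_read(int fd, char *buf, size_t len)
{
    ssize_t n;

    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);

    if (n == 0)
        return 1;
    return (n > 0) ? 0 : -1;
}

/* Returns 1 if write() fails because the peer is gone, 0 if the write
   was accepted, -1 on any other error. */
int peer_closed_on_write(int fd, const char *buf, size_t len)
{
    ssize_t n;

    do {
        n = write(fd, buf, len);
    } while (n < 0 && errno == EINTR);

    if (n >= 0)
        return 0;
    return (errno == EPIPE || errno == ECONNRESET) ? 1 : -1;
}
```

Note that on a TCP socket the first write after the peer has closed often still succeeds, because the data is only queued locally; the error tends to show up on a later write, so a return of 0 here does not prove the client is still there.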

When a signal arrives and you have not handled it properly, the default action happens, which may be closing the socket, and that's why you are getting the bad file descriptor.

Could not check the code since it is too big.