Solved

Copying stdin to stdout with select system call

Posted on 1998-09-19
26
802 Views
Last Modified: 2013-12-26
An exercise in Stevens' "Unix Network Programming" says: "Write a program that copies standard input to standard output using the select system call". I've made a start on this: the code below waits for a user to enter something on standard input, and then delivers the "something" to standard output: but surely more is involved in this exercise? Any advice will be greatly appreciated.

David King.

#include <stdio.h>
#include <sys/types.h>
#include <sys/time.h>



main(int argc, char *argv[])
{

fd_set fdvar;
FD_ZERO(&fdvar);
FD_SET(0, &fdvar);

if (select(1, &fdvar, (fd_set *)0, (fd_set *)0, NULL) == 1)
printf("Input about to be copied to output!\n");
}


0
Comment
Question by:sevrin
  • 14
  • 11
26 Comments
 
LVL 2

Expert Comment

by:bertvermeerbergen
ID: 1293290
To exploit all of the power of the 'select' call (which could be the goal of the exercise), I think you should use select both for reading stdin (meaning to wait for something to read) and for writing to stdout (meaning to wait for the output channel to be ready to accept output).
The way you are doing it will cause your  program to block in the output routine until it becomes ready.  Meanwhile, your input channel is 'blocked' because you are not going to select it until the write is done.  The whole purpose of 'select' is to avoid this kind of blocking or sequential handling of one input followed by one output call.
The stdin/stdout case is simple.  Your input routine will get all that is waiting in the input channel, and your output routine will (almost) never block or do partial writes.  However, replacing stdin and stdout by sockets and doing send/recv calls as I/O, the usefulness of 'select' becomes quickly apparent.  Typically, you will need to do multiple recv calls to get one complete 'message', handle the input and than do one or more send calls depending on the local and remote resources available at the time.  Imagine doing this on multiple input and output sockets at once in a single program, and you will start liking 'select'.
0
 
LVL 3

Expert Comment

by:dhm
ID: 1293291
Stevens may also have in mind using select() to construct a pipe between processes on different machines.  As bertvermeerbergen points out, you'd have to select() on both stdin and stdout to make sure you don't hang in either the read() *or* the write(), which would cause your whole pipe to break down.
0
 

Author Comment

by:sevrin
ID: 1293292
Thanks for these comments. Taking the simple option first - that is, not using sockets - are you suggesting one invocation of select or two? I actually tried changing the program to the following:

#include <stdio.h>
#include <sys/types.h>
#include <sys/time.h>

main(int argc, char *argv[])
    {

    fd_set fdvar;
    FD_ZERO(&fdvar);
    FD_SET(0, &fdvar);

    if (select(2, &fdvar, &fdvar, (fd_set *)0, NULL) == 2)
 
    printf("Input about to be copied to output!\n");
    }

But of course, this isn't the solution either: running the program just quickly gives me the shell prompt again, signifying that, yes, both STDIN and STDOUT *are* ready for I/O. And that's not really copying STDIN to STDOUT using select. So: could you suggest some vague program structure I should follow? That is, whether I really should write a program using (Unix domain) sockets, whether I should call select once or more than once, that sort of thing.

Thanks

David King.
0
 
LVL 2

Expert Comment

by:bertvermeerbergen
ID: 1293293
One of the problems is that the use of select in a simple case like  copying seems to lead to very complex programs for simple problems.  However, the use of sockets, pipes or 'normal' file descriptors isn't really important for the general structure of a program.  Local fd's tend to be more ready and deliver all the data that is ready to be read, or accept all the data you are writing, thereby hiding (but not excluding) situations that could (and will) occur with sockets.
What you could try is to:
1. Definitely use only a single select in your program.  Otherwise, the blocking problem will occur for more complex programs.
2. Use a seperate fd_set for every argument to select.  What you are doing is not correct: stdout is file descriptor '1', not '0'.  So create an fd_set for input descriptor checking and another for output descriptor checking, only setting the bits for the type of IO you are checking for a given file descriptor.
3. Use some sort of loop around the select.  You can handle all ready descriptors in an embedded loop after the select, scanning both input and output fd_sets for bits that are still on after the select (the return value is the sum of both).  Another approach is to handle only a single IO in the loop, the other ready descriptors will still be ready after the next select call (until you handle the IO that the file descriptor is ready for, or something breaks).  The fd_sets are changed by the select call, so they must be recreated every time in the loop.  You can use two extra fd_sets that you initialize once outside the loop and than copy these (memcpy) to the fd_sets that you are using as arguments just before calling select.  This is not very dynamic, and is not always a good idea for output file descriptors.  in general, an output descriptor should be checked only if you have something to write on it, or your program will loop around the select call, doing nothing useful.
4. You will need to buffer the data.  Depending on the requirements, this can be a simple character array, where you will accept one read IO on stdin and than 'unselect' stdin until stdout becomes ready AND you were able to write the complete buffer. Or better, implement some circular buffer so that you can select both read and write descriptors simultaneously.  When the buffer really fills up, hold the input (do not set the bit) until some writes have been done.  If the buffer is empty, do not select the output descriptor to avoid looping.
5. Have fun.
0
 

Author Comment

by:sevrin
ID: 1293294
This is helpful. Taking the steps one at a time: in step 2, I take it you mean something like this is required?


#include <stdio.h>
#include <sys/types.h>
#include <sys/time.h>

main(int argc, char *argv[])
    {

    fd_set fdvar;
    FD_ZERO(&fdvar);
    FD_SET(0, &fdvar);
    FD_SET(1, &ffddvar);      

    if (select(2, &fdvar, &ffddvar, (fd_set *)0, NULL) == 2)
 
    ...

    }

0
 
LVL 2

Expert Comment

by:bertvermeerbergen
ID: 1293295
Yes, and don't forget to declare and zero out the ffddvar (what's in a name, eh ?).
The if clause will have to go, and you need to test the return value to know how many fd's are ready in both the fd_sets.  What are you planning to do when the return is not == 2 ?  Do not suppose the select will not return until 2 fd's are ready, it will come back when at least one fd is ready in one or more fd_sets (or when timeout occurs, which won't when passing NULL).
Use something like the following structure:

set bits in fdReadSet;
set bits in fdWriteSet;

while (not done)
{
   memcpy fdReadSet to fdRead
   memcpy fdWriteSet to fdWrite

   nRdy = select(nHighBit, &fdRead, &fdWrite, 0, NULL);

   if (nRdy == 0)
   {
       timeout or interrupted (check errno value if yu want to handle this case)
   }

   for every bit set in fdWrite (lower that nHighbit)
   {
        do the output io (check result, could be partially ok)
        reset the bit in fdWriteSet if all available output is written
        nRdy--
   }

   for every bit set in fdRead (lower than nHighBit)
   {
        do the input io (check the result because I am paranoic)
        reset the bit in fdReadSet if the buffer for this fd is filling up
        (maybe) set a bit in the fdWriteSet to select the corresponding output channel for the data read
        nRdy--
   }

    assert (nRdy == 0)  (nice test for your program logic)

    calculate the 'done' criteria to stop the loop
}

This is only a general structure.  You can add also logic to handle the third possible fd_set, to capture exception conditions on the fd's.  This could be required to make your program robust and/or to calculate the 'done' criteria.
0
 

Author Comment

by:sevrin
ID: 1293296
OK, I understand this up to your while(not done) loop. That is, when you say

set bits in fdReadSet;
set bits in fdWriteSet;

using my terminology you are saying

FD_SET(0, &fdvar);
FD_SET(1, &ffddvar);


But - and I suspect Stevens has omitted a macro detail here - I don't know what's going on when you do the memcpy. (I suppose bcopy is equivalent, but as I'm using Linux I imagine both are acceptable.) What, exactly, is being copied to the second and third arguments of select? Also, what sort of condition is the while testing for?

As this question is turning out to be more complex than I expected, out of fairness to you I'm upping the number of points.

Best,

David.


0
 

Author Comment

by:sevrin
ID: 1293297
Adjusted points to 60
0
 
LVL 2

Expert Comment

by:bertvermeerbergen
ID: 1293298
As far as the 'not done' condition is concerned, this can be any boolean expression evaluating to false when your program logic indicates the end of processing.  For the simple copy stdin to stdout it can be a boolean that is set false when the read on stdin returns 0 bytes (meaning eof on input).  As you know by now, there is always a catch with the select thing.  Setting the boolean at eof on input will cause the last buffer read not to be written (the while loop will exit).  So, you will need to set a boolean to indicate eof on input and than stay in the loop until all of the buffer has been written to output.  Only than you should set the while condition to false.
The memcpy and bcopy should both work, but note that the order of the first two arguments (src and dest) is different.
An fd_set variable is (for example) implemented as a structure containing an array of long (32 bit) integers.  Every bit in this array  corresponds to an fd that is set/reset for checking by the select.  The FD_ macros makes life easier for you so you do not have to calculate the exact bit position in the array elements.  This is also a good thing for portability of your program to other platforms where a long might not be 32 bits long.  
The reason I memcpy the fd_set(s) is to build the sets once using FD_ZERO/FD_SET and use a 'work' fd_set variable that will be modified by the select() call.  Otherwise, you will have to rebuild your fd_set(s) every time round in the loop.  Again, for the stdin/stdout this is simple, but for other programs using a lot of files or sockets this can be more complex and time consuming.
The call
  memcpy(&fdRead, &fdReadSet, sizeof(fd_set));
will copy the structure bit for bit to the work variable.

0
 

Author Comment

by:sevrin
ID: 1293299
So when you write:


      The reason I memcpy the fd_set(s) is to build the sets once using FD_ZERO/FD_SET and use a 'work' fd_set
      variable that will be modified by the select() call.  Otherwise, you will have to rebuild your fd_set(s) every time
      round in the loop.

you're merely saying that the memcpy is a convenient alternative to FD_SET and FD_ZERO? But that because we don't want to clear the bits in  fdset every time we go through the loop, we're really *obliged* to do the memcpy? Please confirm that I'm understanding you correctly.

One more question before I have a go at actually writing the program: when you have the "for every bit set in fdWrite/fdRead" loops, I take it I'll have to use the FD_ISSET macro?

Thanks again.
0
 
LVL 2

Expert Comment

by:bertvermeerbergen
ID: 1293300
1: The memcpy issue:
Remark that you do not have to use the memcpy, and do not panic if you do not understand the following reasons why it can be useful.  Just go ahead implementing the stdin/stdout copier without it (rebuilding the sets using FD_ZERO and FD_set before the select call°.  Once you start refining your implementation, you will run into the problem of keeping 'state' (see below).  At that time, just remember the memcpy alternative exists and maybe reread the description below.

The memcpy is really only a convenient and more performant alternative.  Lets say we want to check fd 10 to 50 for input, and 51 to 99 for output.  In the while loop, you could do the following:
while (...)
{
    FD_ZERO(&fdRead);
    for (i = 10; i <= 50; i++)
    {
        FD_SET(i, &fdRead);
    }
    FD_ZERO(&fdWrite);
    for (i = 51; i <= 99; i++)
    {
        FD_SET(i, &fdWrite);
    }
    nRdy = select(nHighBit, &fdRead, &fdWrite, 0, NULL);
    ...
}

After the select call, all the bits between 10 and 99 will be reset except for the ones that are ready for input or output, respectively.  The next time through the loop, all the bits will be turned on again by executing the same code.

Now imagine that the bits you want to turn on is only a selection of the possible range.  For example, you want to read on the even descriptors, and once you read something on an fd, you want to write the buffer to fd+1 (the next odd one).  You cannot set all odd bits on every time through the loop, because most of the time they will all be ready for output, even if you have nothing in the buffer to write.  The select call will return instantly and although you can test to see if you really want to write, this is executing a 'busy loop', meaning you will be using almost 100% of your processing power excuting code to detect you have nothing to do really.  The select call is created to avoid such busy loops.
This is where the use of two sets comes in handy.  In one set (like the fdReadSet in my example), the program logic will set and reset the bits dynamically when the conditions change during execution.  For example, you will want to reset the bit of an fd when the corresponding buffer fills up, and set it again when part of the buffer has been written out to the corresponding fd+1.  The bits in the fdWriteSet for fd+1 will be set only when there has been a read on fd, and will be reset when the buffer has been written out completely.
By using the fdReadSet and fdWriteSet as 'memory' about the state of the different fd's during the execution of your program, and by using a copy of the state memory (the fdRead and fdWrite after the memcpy) that will be modified by the select call, you can avoid rebuilding the fdRead and fdWrite from scratch every time through the loop.  Any other solution will force you to keep the state of every fd in some variable or structure and implement the rebuilding of fdRead and fdWrite using conditional statements testing these states and setting/resetting the corresponding bits.
(Don't worry, it is easier done than said.)

2: FD_ISSET
A simple answer this time: yes.

Good luck.
Bert
0
 

Author Comment

by:sevrin
ID: 1293301
Thanks for this; what you say is very clear. I have just a couple of brief questions about FD_ISSET. (Incidentally, I'm upping the points again.) Specifically, Stevens only ever *mentions* FD_ISSET, he doesn't actually say how to use it (what it returns, etc.) So the following is my guess as to how to implement your "for every bit set in fdWrite(lower than nHighbit) do the output io....":

      if FD_ISSET(1, &fdwrite) = 0
      write(1, buff, nbytes);
      ...

I'm just guessing that FD_ISSET returns 0 in this circumstance; is this correct? Also, in your schematic why did you choose to test the write bits before the read bits?

Thanks,

David

P.S. As there are only two bits of interest, I've replaced your "for every" with an "if". Is this OK?
0
 
LVL 2

Expert Comment

by:bertvermeerbergen
ID: 1293302
1. FD_ISSET is a (#define) macro and evaluates to a boolean value.  That means, it 'returns' 0 for false or non-zero for true.  So you can simply test:
  if (FD_ISSET(1, &fdWrite))
    write(1, buff, nbytes);
or, more obscure in case of a boolean value:
  if (FD_ISSET(1, &fdWrite) != 0)
    write(1, buff, nbytes);

Never test for value 1 if you mean 'true'.  Anything non-zero is true, an you should never depend on the fact that a 'boolean' function or macro uses the value 1 for this.
 
2. Doing writes before reads can be (womewhat) important when your buffer is filling up to a point that you must suspend reading until part of the buffered data is written out.  If you do writes first, the probability of filling the buffer is lower.  However, do not suppose that you can get away with no buffering if you first read and than write in the same pass in the loop.  The write fd could be unready, or the write could be partial, etc...  A good buffering scheme and the use of select() goes together.

3. 'if' instead of a 'for' loop:
Yes, in this case you can just use the if's.

Remark:  in your example, you use '=' when you mean '==' to test for equality.  This is a logical error that is sometimes hard to find.  In this case, compilation will fail because you cannot assign the value 0 to FD_ISSET, but if the left side would have been a variable, this would be a valid syntax:  The value of the assignment would have been 0, and the if would always fail since 0 means false.  (Sorry if this was only a typo on your part and you now al this).
0
How to improve team productivity

Quip adds documents, spreadsheets, and tasklists to your Slack experience
- Elevate ideas to Quip docs
- Share Quip docs in Slack
- Get notified of changes to your docs
- Available on iOS/Android/Desktop/Web
- Online/Offline

 

Author Comment

by:sevrin
ID: 1293303
I've had a crack at writing the program; and while it compiles OK, all it does is print half a page (1024 bytes?) of garbage. Any suggestions?

Best,

David


#include <stdio.h>
#include <sys/types.h>
#include <sys/time.h>
#include <errno.h>

#define nHighBits      2
#define number            1024

int readn(int fd, char *ptr, int nbytes);
int writen(int fd, char *ptr, int nbytes);


main(int argc, char *argv[])
{

int nRdy, n;
int flag = 2;
char *buff;

fd_set fdRead, fdWrite, fdReadSet, fdWriteSet;
FD_ZERO(&fdReadSet);
FD_ZERO(&fdWriteSet);
FD_SET(0, &fdReadSet);
FD_SET(1, &fdWriteSet);

while(flag)
      {
      bcopy( (char *) &fdReadSet, (char *) &fdRead, sizeof(fdReadSet) );
      bcopy( (char *) &fdWriteSet, (char *) &fdWrite, sizeof(fdWriteSet));
      nRdy = select(nHighBits, &fdRead, &fdWrite, (fd_set *)0, NULL);
            if (nRdy == 0)
            exit(errno);
            if(FD_ISSET(1, &fdWrite) != 0)
                  {
                  writen(1, buff, number);
                  FD_SET(1, &fdWrite);
                  nRdy--;
                  }
            if(FD_ISSET(0, &fdRead) != 0)
                  {
                  readn(0, buff, number);
                  FD_SET(0, &fdRead);
                  nRdy--;
                  }
            flag = nRdy;      

       }
}


/*
 * Read n bytes from a descriptor
 */

int readn(int des, char *pointer, int numbytes)
{
      int nleft, nread;

      nleft = numbytes;
      while(nleft > 0)
            {
            nread = read(des, pointer, nleft);
            if(nread < 0)
                  return(nread);            /* error */
            else if(nread == 0)
                  break;                  /* EOF */
            
            nleft -= nread;
            pointer += nread;
            }
            return(numbytes - nleft);      /* return >= 0 */
}

int writen(int fd, char *ptr, int nbytes)
{
      int nleft, nwritten;
      nleft = nbytes;
      while(nleft > 0)
            {
            nwritten = write(fd, ptr, nleft);
            if (nwritten <= 0)
                  return(nwritten);      /* error */

            nleft -= nwritten;
            ptr   += nwritten;
            }
            return(nbytes - nleft);
}

0
 
LVL 2

Expert Comment

by:bertvermeerbergen
ID: 1293304
The 1024 bytes of garbage are simple:
1. You try to select fd 1 (stdout) unconditionaly (wheter you have something to write or not).  You could get away with this, IF you check before doing the actual write that there is something in the buffer.  But you do not check that, so you will in the first pass write 1024 bytes that never got read in, initialized or even allocated in your program (the 'buffer' pointer is only declared, but it isn't set to any static or dynamic allocated memory).  You where somewhat unlucky that the program didn't die on a segmentation fault.  This is always a clear indication that there is some runaway pointer in your code.
In a good implementation of the select call, you should not set the bit for the fd 1 in the fdWriteSet UNLESS you are sure that you have some data to write.  Otherwise, the program will do a busy loop and you could as well (or better, as bad) do without the select.
2.  The 'flag' variable will be 0 the first time through the loop (both write and read fd selected and handled), so you program will stop .
So, what do you need to fix  (sorry, the list is somewhat long):
1. A buffer must be allocated (use a static array or malloc some memory).
2. You should really try to select fd 1 only if the buffer is not empty.  If you have nothing to write, do not try to select the fd 1, because it will succeed most of the time, causing your program to  do a busy loop.
3. The 'flag' to control your outer while loop must be true (non-zero) until you have reached EOF on the read AND all of the buffer is written to the output.
4.  The readn and writen functions cannot do multiple reads and writes.  Although the basic idea is good, the loop to do multiple reads or writes must be integrated in the main loop, making sure that you pass through select() before every IO on a given fd.


Could you give me an idea of your level of experience in C ?  What I would like to know is that a phrase like 'malloc the memory' is clear enough to you.  This could limit the typing I'm doing and not bothering you with simple details.

Some ideas (I am not going to write the code, it's your exercise).

while(flag)
{
  bcopy( (char *) &fdReadSet, (char *) &fdRead, sizeof(fdReadSet) );
  bcopy( (char *) &fdWriteSet, (char *) &fdWrite, sizeof(fdWriteSet));
  nRdy = select(nHighBits, &fdRead, &fdWrite, (fd_set *)0, NULL);
  if (nRdy == 0)
    exit(errno);

  if(FD_ISSET(1, &fdWrite) != 0)
  {
    // try to write all of the buffered data to fd 1
    // if the write was partial, remember where you will have to start your next write (by using pointers or by shifthing your buffer data so you can always start at teh beginning...
    // reset the bit in fdWriteSet for fd 1 if  the buffer is empty
    // if the buffer is empty and the eof_flag is true, set the 'flag' variable to false to exit the program.
    nRdy--;
  }

  if(FD_ISSET(0, &fdRead) != 0)
  {
    // try to read a number of bytes (1024 is ok)
    // add the data at the end of the current buffered data
    // check for eof and set a global eof_flag
    // set the fd 1 in fdWriteSet if the data buffer is not empty
    // reset the fd 0 in fdReadSet if eof_flag is true
    nRdy--;
  }
  // assert(nRdy == 0), just to make sure
  //
  flag = nRdy;
}


Try to understand the use of setting/resetting the bits in relation to the buffer contents.  When you did get the program to work, add additional code to avoid overflowing your buffer when reading in data.  You will have to add logic to set/reset the bit for fd 0 in fdReadSet for this.  In a first implementation, make your buffer large enough (e.g. 32KBytes) an read only a maximum  of 1024 bytes in a read.  This will give you enough room to avoid the buffer filling up before it gets flushed to the output.

Bert
0
 

Author Comment

by:sevrin
ID: 1293305
Bert

Thanks again; this is useful. To answer your query: I've worked through most of the exercises in Kernighan and Ritchie's book on C, so I suppose I'm past the absolute novice stage in C (I do tend to forget things, though). So I *do* know about malloc, and the need to allocate memory wherever pointers are involved. I was under the impression, though, that when network I/O is going on (as opposed to "ordinary" programs) memory is automatically allocated by the system itself. Certainly none of Stevens' networking programs malloc anything. Perhaps you could clarify exactly where the dividing line is.

With regard to your point 2), I see what you mean. I think maybe my interpretation of what the program was supposed to do was different from yours. I thought it was supposed to do something like: check for input; do the input; check the output; do the output; then stop. Part of the problem for me is that the goal of the program is to do something that the system does anyway. A hardware analogy would seem to be: "design a circuit to conduct electricity".

Anyway, if you could answer the above, I'll then get on with the program. (You'll notice that I've upped the points again, and I will do so once more when I hear from you.)

David

P.S. I've seen references to a function called fdopen. I have no documentation on it, though: do you know anything about it, and if so, could it be useful in the case of this example?
0
 
LVL 2

Expert Comment

by:bertvermeerbergen
ID: 1293306
fdopen:
-----------
I do not think fdopen can serve any purpose here.  In fact, I never had a use for this function yet (but I am sure it can serve useful purposes in certain cases).  Anyway, stdin and stdout are streams by themselves, no need to make other streams for fd 0 and/or 1.

Here is some info on fdopen from the help file of MS-Visual C.

FILE *_fdopen( int handle, const char *mode );

The _fdopen function associates an input/output stream with the file identified by handle, thus allowing a file opened for low-level I/O to be buffered and formatted. The mode character string specifies the type of access requested for the file, as shown below. The following list gives the mode string used in the fopen and _fdopen functions and the corresponding oflag arguments used in the _open and _sopen functions. A complete description of the mode string argument is given in the fopen function.

Memory allocation
-----------------------------
The guideline is very simple:
 >>  Allocate memory for every byte your program will use <<
You should not depend on memory being magically allocated by some other library or system program to store anything yourself.
When you say 'memory is allocated by the system', there certainly is, both for networked or local file I/O.  However, the system follows the same guideline.  This memory is for its own use.  When you do a read or a write, you are transferring data between your buffer and the buffer allocated by the system.  The  'real' I/O will be done by the system into or from its own buffers later on (there is a thing called unbuffered IO, but explaining all this gets complicated very quickly).
So, while I never actually have seen Stevens' programs, I am sure that they use memory declared by the program itself.  Remark that allocation of memory can be done statically (by declaring variables or arrays) or dynamically (by using malloc).
For example:

  char *pStatic, *pDynamic;
  char chStatic[1024];

  // point to statically allocated memory
  pStatic = chStatic;  // same as &chStatic[0]

  // point to dynamically allocated memory
  pDynamic = (char*)malloc(1024);

Once the pointers ar set to some memory, the fact that it was malloc'd or directly declared is of no importance.  The memory belongs to the program.  I am convinced that one of both forms is used in Stevens' examples.


Reinventing the wheel
------------------------------------
As far as doing something the system already does for you, remember that this is an exercise.  Most of the things you do when learning something has already been done.  Otherwise, without a reference, how would you know that you solved the problem correclty ?  Just getting this 'simple' application of the select call working is not a trivial task, so you should be happy that the underlying problem is not more complicated.  Just imaging having to design your 'circuit' without knowing what electricity is supposed to do.

Bert
0
 
LVL 2

Expert Comment

by:bertvermeerbergen
ID: 1293307
The last line in my example program for the select call that says:
  flag = nRdy;
should not be there.  I just noticed this now.  Sorry, it was the copy&paste devil that tricked me into this :-)
0
 

Author Comment

by:sevrin
ID: 1293308
Thanks again for your comments. My new version of the program is as follows. I think I've addressed all your criticisms, but at the moment the program doesn't do much - just hangs. Obviously there's some minor (I hope it's minor) logical flaw that I can't spot because I'm too close to the program at the moment. If you can tell me what it is I'll be very grateful!

David

#include <stdio.h>
#include <sys/types.h>
#include <sys/time.h>
#include <errno.h>

#define nHighBits      2
#define number            1024

int readn(int fd, char *ptr, int nbytes);
int writen(int fd, char *ptr, int nbytes);


main(int argc, char *argv[])
{

int nRdy, n, p, q;
int readflag = 0;
int writeflag = 0;
int flag = 1;
char *buff = (char *) malloc(32000*sizeof(char));

fd_set fdRead, fdWrite, fdReadSet, fdWriteSet;
FD_ZERO(&fdReadSet);
FD_ZERO(&fdWriteSet);
FD_SET(0, &fdReadSet);
FD_SET(1, &fdWriteSet);

while(flag)
      {
      bcopy( (char *) &fdReadSet, (char *) &fdRead, sizeof(fdReadSet) );
      bcopy( (char *) &fdWriteSet, (char *) &fdWrite, sizeof(fdWriteSet));
      nRdy = select(nHighBits, &fdRead, &fdWrite, (fd_set *)0, NULL);
            if (nRdy == 0)
            exit(errno);
            while ((FD_ISSET(1, &fdWrite) != 0) && readflag != 0 )
                  {
                  p = write(1, buff, number);
                        if ( *(buff + p) == EOF)
                        {
                        writeflag = 1;
                        FD_SET(1, &fdWrite);
                        }
                        else buff = buff + p;
                  }
            while (FD_ISSET(0, &fdRead) != 0)
                  {
                  q = read(0, buff, number);
                        if (q != number)
                              {
                              if (*(buff + q) == EOF)
                                    {
                                    readflag = 1;
                                    FD_SET(0, &fdRead);
                                    break;
                                    }
                              else
                                    {
                                    buff = buff + q;
                                    continue;
                                    }
                              }
                        else
                        {
                              if (*(buff + q) == EOF)
                                    {
                                    readflag = 1;
                                    FD_SET(0, &fdRead);
                                    break;
                                    }
                              else      
                                    {
                                    buff = buff + q;
                                    continue;
                                    }
                        }
                  }
      if ( (readflag == 1) && (writeflag == 1) )
      flag = 0;

       }
}


0
 
LVL 2

Expert Comment

by:bertvermeerbergen
ID: 1293309
This is a test.  My modem is hanging up every time I try to get through.
0
 

Author Comment

by:sevrin
ID: 1293310
Reading you loud and clear....
0
 
LVL 2

Expert Comment

by:bertvermeerbergen
ID: 1293311
Lets try some more (sorry for these problems)

I do not know why your program is not doing anything, but it couldn't work correctly if it did something.  It has three major flaws:
1. It does not go through a select before every read and/or write operation.  Although it could execute correctly, this is not the way to create a program with select().
2. It doesn't handle the buffering correctly when point 1 would be  fixed.  You suppose you will first read everything and only than write it out.
3. It does a busy loop (fd 1 almost always ready for output) when no data is available to be written.

Since you have come a long way from your first version, I think the best thing to give you is an implementation of the things we have talked about.  This will make the loose ends clear, I hope.  So, here is my modified version of your program.  I hope the comments are clear, the program has the structure I used in the previous response (with buffer overflow checking included) and I kept your code as much as possible.

The major modifications are:
1. Replace inner while loop with if
   You can do one and only one IO on a selected fd.  After that, you must go through the select() again before doing the next IO
2. The setting/resetting of fd's during the execution of the program must be done on the fdReadSet and fdWriteSet.  The fdRead and fdWrite are copies that will be modified by select().
3. Start with fd 1 NOT selected.  Only set the bit when there is data in the buffer, otherwise your program will do a busy loop falling thru the select() call even if there is no data to be read or written
4. Implement buffer shifting.  The idea is that a read adds data to the end of the buffer (after data that has not been written out).  A succesfull write will cause the contents of the buffer to shift left, removing the part that was written out and leaving everyting else for the next write.  Example:
    start of program:  nBuffLen == 0, no data in buffer
     read 1024 bytes :  nBuffLen == 1024, buff[0]..buff[1023] contain data
     write 512 bytes :  buff[0]..buff[511] written out (partial write)
                        buff[512]..buff[1023] left to write, copy this data
                        to buff[0]..buff[511] (shift to start of buffer)
                        nBuffLen == 512
     read 1024 bytes :  nBuffLen == 12+1024,buff[0] .. buff[511+1024] is data
     etc...
5. Reset fd 0 for read when the buffer is full, and reselect it after a write has made room for more data (unless eof has been encountered).

0
 
LVL 2

Expert Comment

by:bertvermeerbergen
ID: 1293312
Ok, it could be a copy&paste problem.  I will try to get the other part of the response thru.
0
 
LVL 2

Expert Comment

by:bertvermeerbergen
ID: 1293313
#include <stdio.h>
#include <sys/types.h>
#include <sys/time.h>
#include <errno.h>

#define nHighBits 2
#define number 1024

/* macro returns minimum of two arguments */
#define Min(a,b)  ((a) < (b) ? (a) : (b))
/* use defined constant for size, needed later in program */
#define BUFF_SIZE 32000

main(int argc, char *argv[])
{
    int nRdy, n, p, q;
    int readflag = 0;
    int writeflag = 0;
    int flag = 1;
    /* use defined constant for size */
    char *buff = (char *) malloc(BUFF_SIZE*sizeof(char));
    int nBuffLen = 0;  /* number of chars currently in the buffer */

    fd_set fdRead, fdWrite, fdReadSet, fdWriteSet;
    FD_ZERO(&fdReadSet);
    FD_ZERO(&fdWriteSet);
    FD_SET(0, &fdReadSet);
    FD_CLR(1, &fdWriteSet);  /* not yet any data in buffer */
    while(flag)
    {
        bcopy( (char *) &fdReadSet, (char *) &fdRead, sizeof(fdReadSet) );
        bcopy( (char *) &fdWriteSet, (char *) &fdWrite, sizeof(fdWriteSet));
        nRdy = select(nHighBits, &fdRead, &fdWrite, (fd_set *)0, NULL);
        if (nRdy == 0)
            exit(errno);

        /* do not use while loop here, you are allowed to do one and only one write on a selected fd */
        /* no need to test if data is available, the fd is only selected while there is something left to write in the buffer */
        if (FD_ISSET(1, &fdWrite))
        {
            /* try to write all data currently in the buffer */
            p = write(1, buff, nBuffLen);
            if (p < 0)
            {
                /* some error reporting needed (your exercise) */
                exit(errno);
            }
            if (p < nBuffLen)
            {
                /* partial write: range buff[0]..buff[p-1] has been written
 remove this data by shifting buff[p]..buff[nBuffLen-1] */
                bcopy((char*)&buff[p], (char*)buff, (nBuffLen - p));
            }
            /* part or all of the buffer is flushed, adjust length */
            nBuffLen -= p;
            if ((nBuffLen < BUFF_SIZE) && (readflag == 0))
            {
                /* select fd 0 for read, because there is room in buffer */
                FD_SET(0, &fdReadSet);
            }
            if (nBuffLen == 0)
            {
                /* no more data left, unselect fd 1 until data available */
                FD_CLR(1, &fdWriteSet);
            }
        }

        /* do not use while loop here, you are allowed to do one and only one read on a selected fd */
        if (FD_ISSET(0, &fdRead))             /* test as a boolean */
        {
            /* read 'number' bytes or less if not enough room in buffer */
            q = read(0, &buff[nBuffLen], Min(number, BUFF_SIZE), BuffLen)));
            if (q < 0)                        /* read error */
            {
                    /* some error reporting needed (your exercise) */
                    exit(errno);
            }
            if (q == 0)                       /* 0 means EOF on read */
            {
                readflag = 1;                 /* set eof indicator */
                FD_CLR(0, &fdReadSet);        /* UNselect fd 0 for read */
            }
            /* additional data read in the buffer, adjust available length */
            nBuffLen += q;
            /* select fd 1 for write, because there is data in the buffer */
            FD_SET(1, &fdWriteSet);
            /* UNselect read until the buffer has room for more data */
            if (nBuffLen == BUFF_SIZE)
            {
                FD_CLR(0, &fdReadSet);
            }
        }
        /* continue loop until eof on input and buffer flushed to output */
        if ( (readflag == 1) && (nBuffLen == 0) )
            flag = 0;
    }
}

This version compiles and runs ok under Linux (for the few tests I did, in any case).
So, I think this is the response to your original question.  I would suggest to examine the code, and if something is not clear, feel free to ask.  If you agree, my next posting will be an answer.  Although a lot more can be said about select() and possible pitfalls (I'll give you some hints to change the program for the better in my answer), this program implements the basic idea. If you have other questions, consider making a new posting and get other people with fresh ideas into the discussion.
0
 

Author Comment

by:sevrin
ID: 1293314
This program makes clear much that previously was ambiguous. I also didn't know about the FD_CLR macro. So, yes, let your reply to this posting be your answer, and I'll give you your 100 points.

I do have a couple of questions; they pertain to the situation in your code where you deal with a partial write.
1) Why did you adopt the array form rather than the pointer arithmetic form that I used?
2) I don't quite see what is happening with the bcopy here. You say that elements 0 to p-1 of buff have been filled; therefore these must be shifted. Fair enough - but you seem to shift them to the *end* of buff. Why is this? I'd have thought that these elements would stay where they are, and that the next lot would be tacked on *after* them. How exactly do buffers fill up anyway?

Thanks for your time and trouble - you can be sure I'll be giving you an "excellent" grade!

David
0
 
LVL 2

Accepted Solution

by:
bertvermeerbergen earned 100 total points
ID: 1293315
Not explicitly mentioning the FD_CLR was my mistake, I was using the term reset instead of clear.  If it is any comfort to you, when I changed and compiled the code, I got a syntax error saying that FD_RESET did not exist, and I had to look up the FD_CLR (after which a second syntax error occurred, because I used FD_CLEAR, this doesn't exist either :-)

1) array indexing versus pointers.

Well, it really makes no difference in C.  The two expressions
  &buff[nBufLen]
  buff + nBufLen
mean exactly the same, and if you would look at the generated code, this would be identical in both cases.  In C, the name of an array is replaced by the value of the starting address of the array by the compiler.  However, do not think that the array name is a 'pointer' to the starting address.  It is a symbol that gets replaced by a constant value (the address of the array) at compile time.  So, the following code is also equivalent:
  &buff[0]
  buff
and you pick the one that you like most in any given situation.

Sometimes, it is more readable to use indexing syntax, as in
  buff[iOffset] = 'x';
instead of
  *(buff + iOffset) = 'x';

In other cases, pointers would be better, as in
  pch = buff + iOffset;
instead of
  pch = &buff[iOffset];

So I suppose you could say I should have used the pointer syntax, but the program will work exactly the same way.  


2.  bcopy((char*)&buff[p], (char*)buff, (nBuffLen - p))
The first argument to bcopy is the source, the second is the target address.
a) (nBuffLen - p) is the length of the data in the buffer that has not been written out yet
b) &buff[p] is the address in the buffer of the first character that has not been written out, all preceding characters have been written to the output.
c) buff is the startting address of the array

So, the bcopy will shift the data in the range buff[p .. nBuffLen] to the start of the buffer (as in buff[0 .. nBuffLen]), thereby overwriting the data that was written to the output before.  The next write (after another select) just goes ahead writing from the start of the buffer for the length indicated by nBuffLen.

A read will append to the end of the data that is still left to write in the buffer, because &buff[nBuffLen] is the first address in the buffer that does not contain relevant data.  The read will try to add a maximum number of characters to the buff, making sure that no data is added after the end of the buffer.  The latter would cause other allocated memory to be overwritten, probabely causing an hard to find error later in the program.  C will not protect you against this kind of error.  There is no 'array bounds' checking.  In fact, there is no such thing as an array in a compiled C program.  When you define an array as
  char buff[32000];
you just reserve 32000 bytes at address 'buff' and the compiler will put the next variable you define at address buff+32000.
Afterwards, you can use anything like
  buff(-50), or buff[65000], or *(buff - 50), or *(buff + 65000)
and as long as the resulting address is valid (in your process allocated virtual memory space), this will work.  However, you will use or overwrite other data, causing problems at runtime.  If you are lucky, the program will die when you go out of bounds because the virtual address is not allocated, causing the famous segmentation fault.  Let me say that most of the time, you will not be so lucky, and strange behaviour of the program will occur long after you went out of bounds.

It was no trouble helping you out, I am always pleased to see that there are still people that do not mind to tackle and understand more than the simple or even trivial programs.  Keep up this attitude, I.T. needs it more than ever.

Bert

0

Featured Post

Maximize Your Threat Intelligence Reporting

Reporting is one of the most important and least talked about aspects of a world-class threat intelligence program. Here’s how to do it right.

Join & Write a Comment

This is to be the first in a series of articles demonstrating the development of a complete windows based application using the MFC classes.  I’ll try to keep each article focused on one (or a couple) of the tasks that one may meet.   Introductio…
Exception Handling is in the core of any application that is able to dignify its name. In this article, I'll guide you through the process of writing a DRY (Don't Repeat Yourself) Exception Handling mechanism, using Aspect Oriented Programming.
This video will show you how to get GIT to work in Eclipse.   It will walk you through how to install the EGit plugin in eclipse and how to checkout an existing repository.
When you create an app prototype with Adobe XD, you can insert system screens -- sharing or Control Center, for example -- with just a few clicks. This video shows you how. You can take the full course on Experts Exchange at http://bit.ly/XDcourse.

705 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

22 Experts available now in Live!

Get 1:1 Help Now