tiger0516

asked on

Read files

Hi there,

Could you please tell me explicitly how to use open/read (with a file descriptor) AND fopen/fread (with a file pointer) to read n bytes at a time from a binary file until EOF? I want to know

1) the decimal value of the n bytes (0,1,3,5,99,100 etc), and chars represented by the n bytes
2) for each byte in the n bytes, the individual decimal value of the byte, and the char represented by this byte

It is a follow-up of https://www.experts-exchange.com/questions/21854909/Open-Read-Write-files.html. I simply want a quick answer ;-)

Thanks!
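
For reference, here is a minimal sketch of both approaches, assuming a POSIX system (for open/read) and a chunk size n of at most 256. The function names dump_bytes, dump_with_read and dump_with_fread are made up for illustration. Each byte is printed as its decimal value plus its character, with a dot for non-printable bytes.

```c
#include <ctype.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Print each byte's decimal value and its character ('.' if not printable). */
static void dump_bytes(const unsigned char *buf, size_t len, FILE *out)
{
    for (size_t i = 0; i < len; i++)
        fprintf(out, "%3u  %c\n", (unsigned)buf[i],
                isprint(buf[i]) ? buf[i] : '.');
}

/* Variant 1: open/read with a file descriptor.
   Returns the total number of bytes read, or -1 on error. */
long dump_with_read(const char *path, size_t n, FILE *out)
{
    unsigned char buf[256];
    if (n == 0 || n > sizeof buf) return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    long total = 0;
    ssize_t got;
    while ((got = read(fd, buf, n)) > 0) {   /* 0 = EOF, -1 = error */
        dump_bytes(buf, (size_t)got, out);   /* last chunk may be short */
        total += got;
    }
    close(fd);
    return got < 0 ? -1 : total;
}

/* Variant 2: fopen/fread with a FILE pointer.
   Returns the total number of bytes read, or -1 on error. */
long dump_with_fread(const char *path, size_t n, FILE *out)
{
    unsigned char buf[256];
    if (n == 0 || n > sizeof buf) return -1;

    FILE *fp = fopen(path, "rb");            /* "b": binary mode */
    if (!fp) return -1;

    long total = 0;
    size_t got;
    while ((got = fread(buf, 1, n, fp)) > 0) {
        dump_bytes(buf, got, out);
        total += (long)got;
    }
    int failed = ferror(fp);
    fclose(fp);
    return failed ? -1 : total;
}
```

Calling dump_with_fread("file.dat", 4, stdout) would print the file four bytes at a time. Using unsigned char for the buffer keeps every byte value in the range 0 to 255.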
SOLUTION
brettmjohnson

tiger0516

ASKER

Oho, maybe some background of the question will be helpful:

I was given some binary files and my task is to read the data in them. Each file is either big endian, little endian, or neither.

The endianness is determined by the first 4 bytes of the file: if all 0, little endian; if all 1, big endian; otherwise, error.
[quote]
After the first 4 bytes, the remainder of the file consists of a sequence of bytes that are logically grouped into pairs. A pair consists of a 4 byte value, in the endian format of the file, that identifies the type of the 2nd element of the pair, which is also stored in the endian format of the file. Depending upon the type, the 2nd element of the pair will occupy a certain number of bytes and then the next pair will start. The allowed types for the tuples are a string, a short integer, or a long integer.

The type codes are:

2 for a short
4 for a long
a negative number for a string. The absolute value of the negative numbers is the number of bytes in the string
Any other value is an error.
[/quote]

My understanding of the above is

1) Read the 1st 4 bytes
2) If the first 4 bytes tell me the file is big endian or little endian, then I will stick to that byte order when reading the rest, regardless of the endianness of my computer
3) Read the next 4 bytes (bytes #5-#8 of the file) in the endian format of the file. The value of these 4 bytes will be either
a) 2 b) 4 c) a negative number d) none of a, b, c.
4) If the result of 3) is case a, b or c, then I read the next 2/4/abs(the negative number) bytes (let's say n) respectively. The value represented by those n bytes will be a short int, a long int, or a string, respectively.
5) If EOF, exit.
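
The steps above can be sketched roughly as follows. This is only a sketch under some assumptions not stated in the quoted spec: "all 1" is taken to mean all bits set (each byte 0xFF), a short is 2 bytes, a long is 4 bytes, and both are printed as unsigned values. The names decode32 and parse_pairs are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Assemble a 32-bit value from 4 bytes in the given byte order.
   Works the same on any host, so no swapping is needed. */
static uint32_t decode32(const unsigned char *b, int big)
{
    if (big)
        return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
               ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
    return ((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) |
           ((uint32_t)b[1] << 8)  |  (uint32_t)b[0];
}

/* Returns the number of pairs parsed, or -1 on a format error. */
int parse_pairs(FILE *fp, FILE *out)
{
    unsigned char b[4];
    int big, count = 0;

    /* Step 1-2: header. Both patterns read the same in either order. */
    if (fread(b, 1, 4, fp) != 4) return -1;
    uint32_t header = decode32(b, 1);
    if (header == 0) big = 0;                      /* little endian */
    else if (header == 0xFFFFFFFFu) big = 1;       /* assumption: all bits set */
    else return -1;

    for (;;) {
        /* Step 3: type code in the file's byte order. */
        size_t got = fread(b, 1, 4, fp);
        if (got == 0 && feof(fp)) break;           /* step 5: clean EOF */
        if (got != 4) return -1;                   /* truncated type code */

        uint32_t raw = decode32(b, big);
        /* Interpret the 32 bits as a two's-complement signed number. */
        int64_t type = (raw & 0x80000000u) ? (int64_t)raw - 4294967296LL
                                           : (int64_t)raw;

        /* Step 4: read the value that belongs to this type code. */
        if (type == 2) {
            if (fread(b, 1, 2, fp) != 2) return -1;
            unsigned v = big ? (unsigned)(b[0] << 8 | b[1])
                             : (unsigned)(b[1] << 8 | b[0]);
            fprintf(out, "short %u\n", v);
        } else if (type == 4) {
            if (fread(b, 1, 4, fp) != 4) return -1;
            fprintf(out, "long %lu\n", (unsigned long)decode32(b, big));
        } else if (type < 0) {
            fprintf(out, "string ");
            for (int64_t i = 0; i < -type; i++) {  /* -type bytes of text */
                int c = getc(fp);
                if (c == EOF) return -1;
                putc(c, out);
            }
            putc('\n', out);
        } else {
            return -1;                             /* any other type code */
        }
        count++;
    }
    return count;
}
```

Because the values are built from individual bytes with shifts, the same code gives the same result on a big endian and a little endian machine.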

The endian format of the file may differ from that of my computer. If they match, I just read. Otherwise, I need a swap function to swap the bytes (or read them in reverse). Say my PC is little endian and the file says it is big endian. For example, take 000299: in big endian format, the value of the first 4 bytes (read left to right) is 2, so it is followed by a 2-byte short int, and that's 99. However, in little endian format it would be read right to left as 2000, which is not a valid type code, so it is an error and triggers the program to exit.

Is my understanding right?
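
One detail of the example can be checked with a small sketch. Reading bytes in the other order reverses whole bytes, not decimal digits, so the four bytes 00 00 00 02 read as little endian give 0x02000000 (33554432 in decimal) rather than 2000. It is still not a valid type code, so the conclusion holds. The helper names decode_be and decode_le below are made up for illustration and handle up to 8 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Combine n bytes (n <= 8) into one value, first byte most significant. */
uint64_t decode_be(const unsigned char *b, size_t n)
{
    uint64_t v = 0;
    for (size_t i = 0; i < n; i++)
        v = (v << 8) | b[i];
    return v;
}

/* Combine n bytes (n <= 8) into one value, first byte least significant. */
uint64_t decode_le(const unsigned char *b, size_t n)
{
    uint64_t v = 0;
    for (size_t i = n; i > 0; i--)
        v = (v << 8) | b[i - 1];
    return v;
}
```

With the bytes {0x00, 0x00, 0x00, 0x02}, decode_be returns 2 and decode_le returns 33554432. The same helpers give the single combined value of any n-byte group.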

I am thinking to complete this task by

1) use open/read OR fopen/fread to open & read file. sunnycoder suggests fopen/fread over open/read in the other thread. I agree with him since I prefer high level operation.
2) examine returned value (not the direct return value of read or fread) , put it in the context aforementioned

I have some questions. i.e:

1) If I have fread(buffer,1,4,fp) somewhere in the code already and I write fread(buffer,1,4,fp) again, will it start from the very beginning instead of the next position indicated by the first call? It seems that it starts over again
2) I say

while (fread(buffer,1,4,fp)>0)
{
    for (i=0; i<4; i++)
    {
        // To-Do
    }
}

For example, I want to see if the first 4 bytes are all 1 or all 0. My intuitive idea is to check whether buffer[0]~buffer[3] all equal 0 or 1. But that sounds stupid; can I take them as a whole and compare?

3) I tried my code and it sometimes says the first four bytes of some file are -1 -1 -1 -1 (or ffffffff, depending on the format specifier in printf("%x ",buffer[i]) ). Recall the requirement says a value may be negative; does that mean it is in two's complement (if viewed in binary format)? I do not think it should be so complicated.

4) If the endian formats of my PC and the file do not match, I need to read in reverse. My intuitive idea is to write two swap functions, one for PC=big/File=little and one for PC=little/File=big. But I do not think that's good. How should I write a smart swap?
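
On questions 1 to 3, a small sketch may help. It assumes "all 1" means every byte is 0xFF, and the names classify_header and read_header_and_next are hypothetical. A second fread on the same FILE pointer continues where the first one stopped; it only starts over after rewind, fseek, or reopening the file. The four bytes can be compared as a whole with memcmp. Declaring the buffer as unsigned char makes a 0xFF byte print as 255, whereas in a signed char the same bit pattern is -1 in two's complement, which is where -1 -1 -1 -1 comes from.

```c
#include <stdio.h>
#include <string.h>

/* Classify a 4-byte header as a whole:
   0 = all bytes 0x00, 1 = all bytes 0xFF, -1 = anything else. */
int classify_header(const unsigned char hdr[4])
{
    static const unsigned char zeros[4] = {0x00, 0x00, 0x00, 0x00};
    static const unsigned char ones[4]  = {0xFF, 0xFF, 0xFF, 0xFF};

    if (memcmp(hdr, zeros, 4) == 0) return 0;
    if (memcmp(hdr, ones, 4) == 0)  return 1;
    return -1;
}

/* Read the header and then the following 4 bytes from the same stream.
   Returns the header class, or -1 if the file is too short. */
int read_header_and_next(FILE *fp, unsigned char hdr[4], unsigned char next[4])
{
    if (fread(hdr, 1, 4, fp) != 4)  return -1;  /* file bytes 0..3 */
    if (fread(next, 1, 4, fp) != 4) return -1;  /* file bytes 4..7, not 0..3 */
    return classify_header(hdr);
}
```

Checking that fread returned exactly 4, rather than just more than 0, also catches a file that ends in the middle of a 4-byte group.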

So many questions. Thanks.
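
On question 4, one swap function should be enough, because reversing the bytes is the same operation in both directions. A minimal sketch, with hypothetical names swap32, host_is_little_endian and file_to_host32, assuming the caller passes 1 for a little endian file and 0 for a big endian one:

```c
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 32-bit value.
   The same function converts big->little and little->big. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* 1 if this machine stores the least significant byte first, else 0. */
int host_is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);      /* look at the first byte in memory */
    return first == 1;
}

/* Turn 4 raw file bytes into a host value, swapping only on a mismatch.
   file_is_little must be 0 or 1. */
uint32_t file_to_host32(const unsigned char raw[4], int file_is_little)
{
    uint32_t v;
    memcpy(&v, raw, 4);             /* bytes exactly as stored in the file */
    if (file_is_little != host_is_little_endian())
        v = swap32(v);
    return v;
}
```

For the raw bytes 00 00 00 02, file_to_host32 gives 2 when the file is big endian and 0x02000000 when it is little endian, on either kind of machine. A 2-byte short would need an analogous 16-bit swap.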
#include <stdio.h>
#include <stdlib.h>

int main (int argc, char **argv)
                                  ^^^^^
Should it be char *argv[]? BTW, argc and argv are not used in the code. Doesn't getchar take no argument (it reads from stdin instead of a file; is getc the one that reads from a file)?

{
  int ch;
  while((ch = getchar()) != EOF) {
    // print decimal and character value of ch
    printf("%d\t%c\n", ch, ch);
  }
  return 0;
}
I input code you gave:
#include <stdio.h>
#include <stdlib.h>

int main (int argc, char **argv)
{
  int bytes_read, i;
  char buffer[1024];

  while((bytes_read = fread(buffer, sizeof(char), sizeof(buffer), stdin)) > 0) {
    for (i = 0; i < bytes_read; i++) {
      // print decimal and character value of ch
      printf("%d\t%c\n", buffer[i], buffer[i]);
    }
  }
  return 0;
}

I compiled it with gcc -o test test.c and ran ./test file.dat, but there is no response; I have to use ctrl-c to quit.
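
A likely explanation: the program reads from stdin and never looks at argv, so ./test file.dat sits waiting for keyboard input. Running it as ./test < file.dat redirects the file to stdin. Alternatively, the program can open the file named on the command line, as in this sketch (open_input and dump_stream are hypothetical names). The buffer is unsigned char here so bytes above 127 print as 128 to 255 instead of negative numbers.

```c
#include <stdio.h>

/* Pick the input stream: the file named by argv[1] if given, else stdin.
   Returns NULL if the named file cannot be opened. */
FILE *open_input(int argc, char **argv)
{
    if (argc > 1)
        return fopen(argv[1], "rb");
    return stdin;
}

/* Print the decimal and character value of every byte until EOF.
   Returns the number of bytes processed. */
long dump_stream(FILE *in, FILE *out)
{
    unsigned char buffer[1024];
    size_t bytes_read;
    long total = 0;

    while ((bytes_read = fread(buffer, 1, sizeof buffer, in)) > 0) {
        for (size_t i = 0; i < bytes_read; i++)
            fprintf(out, "%d\t%c\n", buffer[i], buffer[i]);
        total += (long)bytes_read;
    }
    return total;
}
```

In main this would be used as FILE *in = open_input(argc, argv); followed by a NULL check and dump_stream(in, stdout). Note that char **argv and char *argv[] mean the same thing in a parameter list, and getchar() is equivalent to getc(stdin).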
ASKER CERTIFIED SOLUTION
sunnycoder
SOLUTION