Read files

Posted on 2006-05-18
Last Modified: 2010-05-18
Hi there,

Could you please tell me explicitly how to use open/read (with a file descriptor) AND fopen/fread (with a file pointer) to read n bytes from a binary file until EOF? I want to know:

1) the decimal value of the n bytes (0,1,3,5,99,100 etc), and chars represented by the n bytes
2) for each byte in the n bytes, the individual decimal value of the byte, and the char represented by this byte

This is a follow-up to http://www.experts-exchange.com/Programming/Programming_Languages/C/Q_21854909.html. I simply want a quick answer ;-)

Thanks!
Question by:tiger0516

Assisted Solution

by:brettmjohnson
brettmjohnson earned 800 total points
ID: 16710535
The simplest way to process individual characters from files is to use the character-oriented buffered file I/O standard library calls, getchar(), getc(), and fgetc().   The following filter takes the data on its standard input and prints out the decimal value and character value of each byte it reads.

#include <stdio.h>
#include <stdlib.h>

int main (int argc, char **argv)
{
  int ch;
  while((ch = getchar()) != EOF) {
    // print decimal and character value of ch
    printf("%d\t%c\n", ch, ch);
  }
  return 0;
}

If you really want to use read() or fread(), you could use them to read 1 byte at a time [inefficient], or read blocks of data into a buffer, then iterate over the buffer.  [Effectively moving the buffering from the standard library to your program.]  Here is the same program, using read() rather than getchar():

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int main (int argc, char **argv)
{
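  ssize_t bytes_read, i;       // read() returns ssize_t (-1 on error)
  unsigned char buffer[1024];  // unsigned, so bytes >= 128 print as 128..255 rather than negative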

  while((bytes_read = read(STDIN_FILENO, buffer, sizeof(buffer))) > 0) {
    for (i = 0; i < bytes_read; i++) {
      // print decimal and character value of ch
      printf("%d\t%c\n", buffer[i], buffer[i]);
    }
  }
  return 0;
}
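
One caveat when reading a fixed number of bytes with read(): a single call may return fewer bytes than requested even before EOF (pipes and terminals do this routinely). The following is a minimal sketch of a helper that keeps reading until n bytes have arrived; the name read_full is hypothetical and not part of any library.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): keep calling read() until
   n bytes have arrived, EOF is reached, or a real error occurs.
   Returns the number of bytes stored in buf, or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t n)
{
  unsigned char *p = buf;
  size_t got = 0;

  while (got < n) {
    ssize_t r = read(fd, p + got, n - got);
    if (r == 0)
      break;                /* EOF: return the partial count */
    if (r < 0) {
      if (errno == EINTR)
        continue;           /* interrupted by a signal, retry */
      return -1;
    }
    got += (size_t)r;
  }
  return (ssize_t)got;
}
```

A return value smaller than n then means EOF was hit part-way through, which is useful for spotting a truncated record.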


If you want to use fread() rather than read(), then:

#include <stdio.h>
#include <stdlib.h>

int main (int argc, char **argv)
{
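  size_t bytes_read, i;        // fread() returns size_t (0 on EOF or error)
  unsigned char buffer[1024];  // unsigned, so bytes >= 128 print as 128..255 rather than negative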

  while((bytes_read = fread(buffer, sizeof(char), sizeof(buffer), stdin)) > 0) {
    for (i = 0; i < bytes_read; i++) {
      // print decimal and character value of ch
      printf("%d\t%c\n", buffer[i], buffer[i]);
    }
  }
  return 0;
}
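
All three filters read standard input, so a file has to be redirected in, for example ./test < file.dat; started as ./test file.dat they simply wait for keyboard input. As a sketch of the alternative, the loop can be moved into a function that takes any FILE pointer. The name dump_stream and the choice to print '.' for non-printable bytes are assumptions made for this example.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: print "decimal<TAB>char" for every byte of `in`.
   Non-printable bytes are shown as '.' (a choice made for this sketch).
   Returns the number of bytes processed, or -1 on a read error. */
long dump_stream(FILE *in, FILE *out)
{
  unsigned char buffer[1024];
  size_t n, i;
  long total = 0;

  while ((n = fread(buffer, 1, sizeof buffer, in)) > 0) {
    for (i = 0; i < n; i++) {
      int ch = buffer[i];
      fprintf(out, "%d\t%c\n", ch, isprint(ch) ? ch : '.');
    }
    total += (long)n;
  }
  return ferror(in) ? -1 : total;
}
```

To read a named file, main would open it with fopen(argv[1], "rb"), check the result for NULL, and call dump_stream(fp, stdout).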


Author Comment

by:tiger0516
ID: 16710566
Oho, maybe some background of the question will be helpful:

I was given some binary files and my task is to read the data in them. Each file is either big endian, little endian, or neither.

The endianness is decided by the first 4 bytes of the file: if all 0, little endian; if all 1, big endian; otherwise, error.
[quote]
After the first 4 bytes, the remainder of the file consists of a sequence of bytes that are logically grouped into pairs. A pair consists of a 4 byte value, in the endian format of the file, that identifies the type of the 2nd element of the pair, which is also stored in the endian format of the file. Depending upon the type, the 2nd element of the pair will occupy a certain number of bytes and then the next pair will start. The allowed types for the tuples are a string, a short integer, or a long integer.

The type codes are:

2 for a short
4 for a long
a negative number for a string. The absolute value of the negative numbers is the number of bytes in the string
Any other value is an error.
[/quote]

My understanding of the above is:

1) Read the first 4 bytes.
2) If the first 4 bytes tell me the file is big endian or little endian, I stick to that byte order when reading the rest, regardless of the endianness of my computer.
3) Read the next 4 bytes (bytes #5-#8 of the file) in the endian format of the file. The value of these 4 bytes will be either
a) 2 b) 4 c) a negative number d) none of a, b, c.
4) If the result of 3) is case a, b or c, then I read the next 2/4/abs(the negative number) bytes (let's say n) respectively. The value represented by those n bytes will be a short int, a long int, or a string, respectively.
5) If EOF, exit.
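
The steps above can be sketched as follows. This is only one reading of the quoted format, not a reference implementation: the names parse_pairs and get_u32 are hypothetical, "all 1" in the header is assumed to mean that every byte is 0xFF, and the output format is invented for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Assemble a 32-bit value from 4 file bytes in the file's byte order. */
static uint32_t get_u32(const unsigned char *b, int big)
{
  return big ? (((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) | ((uint32_t)b[2] << 8) | b[3])
             : (((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) | ((uint32_t)b[1] << 8) | b[0]);
}

/* Hypothetical parser for one reading of the format.
   Assumption: "all 1" in the header means every byte is 0xFF.
   Prints one line per pair to `out`; returns 0 at clean EOF, -1 on error. */
int parse_pairs(FILE *in, FILE *out)
{
  unsigned char b[4];
  int big;

  if (fread(b, 1, 4, in) != 4) return -1;
  if (b[0] == 0 && b[1] == 0 && b[2] == 0 && b[3] == 0) big = 0;
  else if (b[0] == 0xFF && b[1] == 0xFF && b[2] == 0xFF && b[3] == 0xFF) big = 1;
  else return -1;

  for (;;) {
    size_t got = fread(b, 1, 4, in);
    if (got == 0) return ferror(in) ? -1 : 0;   /* clean EOF between pairs */
    if (got != 4) return -1;                    /* truncated type code */
    int32_t type = (int32_t)get_u32(b, big);    /* two's complement value */

    if (type == 2) {                            /* 2-byte short follows */
      if (fread(b, 1, 2, in) != 2) return -1;
      uint16_t u = big ? (uint16_t)((b[0] << 8) | b[1]) : (uint16_t)((b[1] << 8) | b[0]);
      fprintf(out, "short %d\n", (int)(int16_t)u);
    } else if (type == 4) {                     /* 4-byte long follows */
      if (fread(b, 1, 4, in) != 4) return -1;
      fprintf(out, "long %ld\n", (long)(int32_t)get_u32(b, big));
    } else if (type < 0) {                      /* string of -type bytes follows */
      fputs("string ", out);
      for (int64_t left = -(int64_t)type; left > 0; left--) {
        int ch = fgetc(in);
        if (ch == EOF) return -1;
        fputc(ch, out);
      }
      fputc('\n', out);
    } else {
      return -1;                                /* any other type code is an error */
    }
  }
}
```

Because every multi-byte value is assembled with shifts, the sketch does not need to know the byte order of the machine it runs on.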

The endian format of the file may differ from that of my computer. If they match, I just read. Otherwise, I need a swap function to swap bytes (or read them in reverse order). Say my PC is little endian and the file says it is big endian. For example, take 000299: in big endian format, the first 4 bytes read left to right give the value 2, so a 2-byte short int follows, and that is 99. However, in little endian the same 4 bytes are read right to left as 2000, which is not a valid type code, so it is an error and the program exits.

Is my understanding right?

I am thinking to complete this task by

1) use open/read OR fopen/fread to open & read file. sunnycoder suggests fopen/fread over open/read in the other thread. I agree with him since I prefer high level operation.
2) examine returned value (not the direct return value of read or fread) , put it in the context aforementioned

I have some questions:

1) If I have fread(buffer,1,4,fp) somewhere in the code already and I write fread(buffer,1,4,fp) again, will it start from the very beginning instead of from the next position indicated by the first call? It seems that it starts over again.
2) I say

while (fread(buffer,1,4,fp)>0)
{
for (i=0; i<4; i++)
            {
                  // To-Do
            }
}

For example, I want to see if the 1st 4 bytes are all 1 or 0. My intuitive idea is to compare if  buffer[0]~buffer[3] all equal to  0 or 1. But that sounds stupid, can I take them for a whole and compare?

3) I tried my code and it sometimes says the first four bytes of some file are -1 -1 -1 -1 (or ffffffff, depending on the format specifier in printf("%x ",buffer[i]) ). Recall that the requirement says a negative value is possible; does that mean it is in two's complement (if viewed in binary format)? I do not think it should be so complicated.

4) If the endian formats of my PC and the file do not match, I need to read in reverse order. My intuitive idea is to write two swap functions, one for PC=big/File=little and one for PC=little/File=big. But I do not think that's good. What should I do to write a smart swap?

So many questions. Thanks.

Author Comment

by:tiger0516
ID: 16710693
#include <stdio.h>
#include <stdlib.h>

int main (int argc, char **argv)
                                  ^^^^^
Should it be char *argv[]? BTW, argc and argv are not used in the code. Doesn't getchar take no arguments (it reads from stdin instead of a file; getc reads from a file)?

{
  int ch;
  while((ch = getchar()) != EOF) {
    // print decimal and character value of ch
    printf("%d\t%c\n", ch, ch);
  }
  return 0;
}

Author Comment

by:tiger0516
ID: 16710733
I entered the code you gave:
#include <stdio.h>
#include <stdlib.h>

int main (int argc, char **argv)
{
  int bytes_read, i;
  char buffer[1024];

  while((bytes_read = fread(buffer, sizeof(char), sizeof(buffer), stdin)) > 0) {
    for (i = 0; i < bytes_read; i++) {
      // print decimal and character value of ch
      printf("%d\t%c\n", buffer[i], buffer[i]);
    }
  }
  return 0;
}

I compile with gcc -o test test.c and run ./test file.dat, but there is no response; I have to use Ctrl-C to quit.

Accepted Solution

by:
sunnycoder earned 1200 total points
ID: 16710819
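#include <stdio.h>

int main (void)
{
      unsigned char buffer[4];   /* unsigned, so 0xff prints as ff rather than ffffffff */
      size_t ret;                /* fread returns size_t, never a negative value */
      size_t i;

      /* "rb": binary mode, so no newline translation is applied */
      FILE * fptr = fopen ("test.txt","rb");

      if (!fptr)
      {
            printf("could not open file .. exiting\n");
            return 1;
      }

      /* fread returns the number of bytes read; 0 means EOF or error */
      while ((ret=fread(buffer,1,4,fptr))>0)
      {
            for (i=0; i<ret; i++)
            {
                  printf("%x ",buffer[i]);
            }
      }

      if (ferror(fptr))
      {
            printf("\nRead error ... exiting\n");
            fclose(fptr);
            return 1;
      }

      printf("\nReached end of file ... exiting\n");
      fclose(fptr);
      return 0;

}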

Assisted Solution

by:sunnycoder
sunnycoder earned 1200 total points
ID: 16711086
>1) If I have fread(buffer,1,4,fp) somewhere in the code already and I write fread(buffer,1,4,fp) again, will it start from the
>very beginning instead of from the next position indicated by the first call? It seems that it starts over again.
Unless you closed and reopened the file, it will read the next 4 bytes and NOT from the start
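
A small sketch of this behaviour, assuming the stream starts at offset 0; the function name read_two_chunks is made up for the example.

```c
#include <stdio.h>

/* Hypothetical demo: read two consecutive 4-byte chunks from fp.
   Returns the stream position afterwards (8 when fp started at offset 0),
   or -1L if either read came up short. */
long read_two_chunks(FILE *fp, unsigned char first[4], unsigned char second[4])
{
  if (fread(first, 1, 4, fp) != 4)
    return -1L;
  /* The stream position has advanced by 4, so this call continues
     from byte 4 instead of starting over at byte 0. */
  if (fread(second, 1, 4, fp) != 4)
    return -1L;
  return ftell(fp);
}
```

To deliberately start over, rewind(fp) or fseek(fp, 0, SEEK_SET) resets the position without reopening the file.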

>But that sounds stupid, can I take them for a whole and compare?
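The problem with taking 4 bytes at once is endianness, as you indicated in the problem ...
the bytes 00 00 00 01
are 0x00000001 read as big endian and 0x01000000 read as little endian.
(Endianness reorders whole bytes, not the bits inside a byte, so a pattern in which all four bytes are equal reads the same either way.)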

You can compare the values in a loop.
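
A minimal sketch of such a loop for the 4 header bytes. The enum and function names are hypothetical, and "all 1" is assumed to mean that every byte is 0xFF (all bits set), which would match the -1 -1 -1 -1 output mentioned in the question.

```c
#include <stddef.h>

enum file_endian { ENDIAN_ERROR = -1, ENDIAN_LITTLE = 0, ENDIAN_BIG = 1 };

/* Hypothetical helper: classify the 4 header bytes by comparing them in a loop.
   Assumption: "all 1" means every byte is 0xFF (all bits set). */
enum file_endian detect_endian(const unsigned char hdr[4])
{
  int all_zero = 1, all_ones = 1;
  size_t i;

  for (i = 0; i < 4; i++) {
    if (hdr[i] != 0x00) all_zero = 0;
    if (hdr[i] != 0xFF) all_ones = 0;
  }
  if (all_zero) return ENDIAN_LITTLE;   /* header rule: all 0 means little endian */
  if (all_ones) return ENDIAN_BIG;      /* header rule: all 1 means big endian */
  return ENDIAN_ERROR;
}
```

Since both header patterns are identical in either byte order, comparing the 4 bytes against a constant array with memcmp would work as well.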

>Recall the requirement says it is possible to be a negative value, does that mean it is in Two's complement (if viewed in
>binary format) ? I do not think it should be so complicated.
Yes .. negative values are stored in 2's complement form ... what is the complication you find? Also, why do you need to compare in hex .. you can just as well compare in decimal ... value == -1 or value == 1 is a lot more intuitive than hex notation
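
The ffffffff output is most likely sign extension rather than anything in the file itself: where plain char is signed, a byte 0xFF is the value -1, and it is promoted to a full int before printf sees it. A small sketch, assuming a 32-bit int and a two's complement machine; format_byte is a hypothetical name.

```c
#include <stdio.h>

/* Hypothetical demo: format one byte with %x twice, once after it has passed
   through a signed char and once through an unsigned char.
   Assumes a 32-bit int, which is typical but not guaranteed by the C standard. */
void format_byte(unsigned char raw, char *as_signed, char *as_unsigned, size_t n)
{
  signed char s = (signed char)raw;   /* 0xFF becomes -1 in two's complement */

  /* s is promoted to int (-1), whose bit pattern is 0xffffffff */
  snprintf(as_signed, n, "%x", (unsigned int)(int)s);
  /* raw is promoted to int 255, so only the low byte is non-zero */
  snprintf(as_unsigned, n, "%x", (unsigned int)raw);
}
```

Declaring the buffer as unsigned char avoids the surprise when the goal is to inspect raw bytes.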

>But I do not think that's good. What shall I do to write a smart swap?
Swap simply swaps the values at two locations ... How would the two swap functions be different ? You can call them from different locations to accomplish the conversion but swap would essentially remain the same!!
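
One way to sidestep the swap question, offered here as an alternative sketch rather than as the approach described above, is to build each value from its bytes with shifts. The helper names decode_i32 and decode_i16 are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helpers: build a value from bytes in the FILE's byte order.
   The shifts operate on values, not memory, so the same code gives the same
   result on a big endian or a little endian host; no swap function is needed. */
int32_t decode_i32(const unsigned char b[4], int file_is_big_endian)
{
  uint32_t u = 0;
  int i;

  for (i = 0; i < 4; i++) {
    /* big endian: first byte is most significant; little endian: last byte is */
    unsigned char byte = file_is_big_endian ? b[i] : b[3 - i];
    u = (u << 8) | byte;
  }
  return (int32_t)u;   /* two's complement reinterpretation, e.g. 0xFFFFFFFB -> -5 */
}

int16_t decode_i16(const unsigned char b[2], int file_is_big_endian)
{
  uint16_t u = file_is_big_endian ? (uint16_t)((b[0] << 8) | b[1])
                                  : (uint16_t)((b[1] << 8) | b[0]);
  return (int16_t)u;
}
```

With this approach only the byte order of the file matters, so there is a single code path instead of one swap per PC/file combination.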

Cheers!
sunnycoder