Solved

Reading binary info from file

Posted on 2001-08-14
Last Modified: 2013-11-15
I want to read this "Box.exe" file but can't figure out how. I have something like this:


 FILE *stream;
   int i;
   int mask = 0x00FF;

   if( (stream = fopen( "Box.exe", "rb" )) == NULL )
      printf( "Couldn't open file\n" );
   else
   {
      /* Read a word from the stream: */
      i = _getw( stream );

      /* If there is an error... */
      if( ferror( stream ) )
      {
         printf( "_getw failed\n" );
         clearerr( stream );
      }
      else
         printf( "First data word in file: 0x%x\n", i);
      fclose( stream );
   }

This gives the value in hex, but it is reversed. It gives 0x905A4D; shouldn't it be 0x4D5A90?
Question by:scooter1
 
LVL 4

Accepted Solution

by:
sdussinger earned 100 total points
ID: 6386899
This sounds like an endian issue. By reading an integer, you're telling the OS to apply constraints about how integers are stored. On some machines integers are stored with the high byte first; on others they are stored with the low byte first. When a machine reads an integer from storage, it interprets the bytes as an integer based on whether it's a big-endian or little-endian machine.

Try this:

FILE *stream;
int i;
int mask = 0x00FF;

if( (stream = fopen( "Box.exe", "rb" )) == NULL )
   printf( "Couldn't open file\n" );
else
{
  // Read the four bytes and place them into the int.
  // By doing it this way, the OS can't know that we're
  // actually reading an int, so no endian modifications
  // will be made.
  size_t bytes = fread (&i, 1, sizeof (int), stream);

  if (ferror (stream) || bytes != sizeof (int))
  {
    printf ("fread failed\n");
    clearerr (stream);
  }
  else
    printf ("First data word in file: 0x%x\n", i);

  fclose( stream );
}

This allows us to read the first four bytes of the file without the OS applying its endian constraints.

HTH.

--Steve
 

Author Comment

by:scooter1
ID: 6386959
that yields same results
 
LVL 4

Expert Comment

by:sdussinger
ID: 6386969
How are you reading the first four bytes to know what the order should be?

--Steve
 

Author Comment

by:scooter1
ID: 6386975
hex editor
LVL 4

Expert Comment

by:sdussinger
ID: 6387012
It's been a while since I messed with this stuff...

The bytes are being read in correctly. I modified the code I posted earlier to display the bytes individually, and they come out in the order one would expect.

I created a dummy file with the letters ABCD in it, and read it using the code from earlier. When printing out the bytes individually I get 0x41 0x42 0x43 0x44. This is what I expected to see. When I print out the int using the %x I get 0x44434241. Backwards.

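The reversal must then happen when the 4 bytes are interpreted as an int for %x. On a little-endian machine the first byte in the file becomes the least significant byte of the int, so the value prints with the bytes swapped.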
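
The ABCD experiment described above can be sketched roughly as follows. This is only an illustration: the helper names `bytes_as_host_int` and `host_is_little_endian` are made up here, and `uint32_t` stands in for a 4-byte `int`. The point is that copying the bytes into an integer never reorders them in memory; only the interpretation of those bytes as a number depends on the host.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy four bytes into a 32-bit integer exactly as
   fread(&i, 1, 4, stream) would. No bytes are reordered in memory. */
uint32_t bytes_as_host_int(const unsigned char *buf)
{
    uint32_t value;
    memcpy(&value, buf, sizeof value);
    return value;
}

/* Hypothetical helper: returns 1 if the least significant byte of an
   integer is stored first (little-endian host), 0 otherwise. */
int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

On a little-endian host, passing the bytes 0x41 0x42 0x43 0x44 to `bytes_as_host_int` gives 0x44434241, which matches the "backwards" output seen with %x.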

Unfortunately, using a hex editor to look at the file doesn't always get you what you want. Because of the endian stuff, the order in which the bytes are stored on disk may be different from the way they are interpreted as ints, etc.

The real question is: what do you need to do with the information you read from the file? In order to compare the info read in from the file, you'll need to compare the bytes one by one or use some sort of endian-mapping macros to figure out the order of bytes. There is a whole set of macros in the GCC compiler to handle endianness; maybe they could help you.
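
As a rough sketch of the byte-by-byte comparison mentioned above, the check below compares the start of a buffer against an expected byte sequence without ever forming an int. The function name `starts_with_bytes` is hypothetical, and the expected bytes 4D 5A 90 are simply the ones quoted in the question.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf begins with the sig_len bytes
   in sig, compared in storage order, and 0 otherwise. */
int starts_with_bytes(const unsigned char *buf, size_t buf_len,
                      const unsigned char *sig, size_t sig_len)
{
    if (buf_len < sig_len)
        return 0;               /* not enough data to compare */
    return memcmp(buf, sig, sig_len) == 0;
}
```

Because the comparison works on raw bytes, the result is the same on big-endian and little-endian machines.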

Unfortunately, I'm a little rustier than I thought with this stuff. :-(

--Steve
 

Author Comment

by:scooter1
ID: 6387052
how did you print out the bytes individually?
 
LVL 5

Expert Comment

by:BlackDiamond
ID: 6387067
scooter,
In order to avoid the endian problems (as sdussinger pointed out, that is what this is), you need to avoid the built-in data types. When bytes are loaded into one of these types, they are interpreted in the machine's native endian order. The way to avoid this is to allocate your own memory and handle the information byte by byte. This way you are guaranteed to get the information in the order it is stored in the file, no matter which platform you are on. Here is a modified example of your code:


#include <stdio.h>
#include <stdlib.h>

/* Print each byte in hex, in the order it is stored */
static void printhex(const unsigned char *bytes, size_t length)
{
   size_t count;

   for (count = 0; count < length; count++)
   {
      printf("0x%02x ", bytes[count]);
   }
}

int main(void)
{
   FILE *stream;
   unsigned char *bytes;
   size_t nread;

   if( (stream = fopen( "Box.exe", "rb" )) == NULL )
   {
      printf( "Couldn't open file\n" );
      return 1;
   }

   /* Create my own "integer" instead of calling _getw() */
   bytes = malloc(sizeof(int));
   if( bytes == NULL )
   {
      printf( "Out of memory\n" );
      fclose( stream );
      return 1;
   }

   /* Read sizeof(int) single bytes so the return value counts bytes */
   nread = fread(bytes, 1, sizeof(int), stream);

   /* If there is an error... */
   if( ferror( stream ) )
   {
      printf( "fread failed\n" );
      clearerr( stream );
   }
   else
   {
      printf( "First data word in file: ");
      /* Print the info byte by byte */
      printhex(bytes, nread);
      printf("\n");
   }
   fclose( stream );
   free(bytes);
   return 0;
}
 
LVL 5

Expert Comment

by:BlackDiamond
ID: 6387078
Just as a side note, you will now get the data in exactly the order it is stored, but if you need to do any data manipulation or calculations, you will need to be very careful of your byte ordering.
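
One common way to stay careful about byte ordering is to build integers from the individual bytes with shifts, so the order is stated in the code rather than inherited from the machine. A minimal sketch, with the names `load_be32` and `load_le32` chosen here purely for illustration:

```c
#include <stdint.h>

/* Hypothetical helper: treat p[0] as the most significant byte. */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: treat p[0] as the least significant byte. */
uint32_t load_le32(const unsigned char *p)
{
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}
```

For the bytes 4D 5A 90 00, `load_le32` gives 0x00905A4D (the value the original code printed) and `load_be32` gives 0x4D5A9000, whatever the host's byte order.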

Good Luck,
BD
 
LVL 86

Expert Comment

by:jkr
ID: 6387838
You have to re-arrange the byte order using the appropriate function:


#include <winsock.h>

//...

int bytes = fread ((void *) &i, 1, sizeof (int), stream);

//...

/*
The Windows Sockets ntohl function converts a u_long from TCP/IP network order (which is big-endian) to host byte order.
*/

i = ntohl ( i);
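
For comparison, here is a minimal sketch of what a 32-bit byte swap does, written without winsock. The name `swap32` is made up for illustration. On a little-endian host, ntohl performs this same reversal on all four bytes, so for a file starting with 4D 5A 90 00 the swapped value would be 0x4D5A9000 rather than 0x4D5A90.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the order of the four bytes in value. */
uint32_t swap32(uint32_t value)
{
    return ((value & 0x000000FFu) << 24) |
           ((value & 0x0000FF00u) << 8)  |
           ((value & 0x00FF0000u) >> 8)  |
           ((value & 0xFF000000u) >> 24);
}
```

Unlike ntohl, this sketch always swaps, so it should only be applied when the host order is known to differ from the order in the file.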
 

Author Comment

by:scooter1
ID: 6391122
So what I ended up doing is:

   /* line, list, numread, numbytes and stream2 are declared elsewhere (not shown) */
   FILE *stream;

   if( (stream = fopen( line, "rb" )) != NULL )
   {
      /* read in 1 character */
      numread = fread( list, sizeof(unsigned char), 1, stream );

      for(long x = 0; x < numbytes; x++)
      {
         fprintf(stream2, "%.2x%c", list[0], ' ');

         /* read in 1 character */
         numread += fread( list, sizeof(unsigned char), 1, stream );
      }
      fclose( stream );
   }
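
The same byte-at-a-time hex dump can also be sketched as a small formatting routine that works on a buffer already read from the file. This is an illustrative variant, not the code used in the thread; `format_hex` and its parameters are hypothetical names.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write each byte of buf into out as two lowercase
   hex digits followed by a space, in storage order. Returns the number
   of bytes formatted; stops early if out is too small. */
size_t format_hex(const unsigned char *buf, size_t len,
                  char *out, size_t out_size)
{
    size_t i;

    if (out_size == 0)
        return 0;
    out[0] = '\0';

    for (i = 0; i < len; i++)
    {
        /* Each byte needs 3 characters, plus room for the final NUL. */
        if ((i + 1) * 3 + 1 > out_size)
            break;
        snprintf(out + i * 3, 4, "%02x ", (unsigned)buf[i]);
    }
    return i;
}
```

Since every byte is formatted on its own, the output follows the order of the bytes in the file, which is what a hex editor shows.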