Solved

Searching byte sequence in file. Problem with certain byte values

Posted on 2007-04-01
Last Modified: 2010-05-18
Hello,

I'm coding a little utility that searches for a given byte sequence in a file. The source code is as follows:
#include "stdafx.h"
#include <fstream>
#include <iostream>
using namespace std;


int _tmain(int argc, _TCHAR* argv[])
{
      char readChar;
      unsigned char searchPattern[] = { 0xEB, 0x34, 0x54, 0x44 };
      fstream fileReader;
      fileReader.open("d:\\Development\\TestData\\test.txt", ios::in | ios::binary);
      
      cout << "Searching for pattern: ";
      for(int i = 0; i < sizeof(searchPattern); i++){
            printf("0x%x ", searchPattern[i]);
            
      }
      cout << endl;

      if(fileReader.is_open()){
            cout << "File opened" << endl;
            while(!fileReader.eof()){
                  fileReader.read(&readChar, sizeof(readChar));
                  printf("Current byte: 0x%x\n", readChar);

                  /*First byte of search pattern is found. Seek pointer in file is
                   *adjusted by -1. sizeof(searchPattern) bytes are read from that
                   *position and then compared to the searchPattern. If there is a
                   *match, the file-offset is saved. If no match, seek pointer is
                   *adjusted again to continue search
                  */
                  if(readChar == searchPattern[0]){
                        
                        char compareBuffer[sizeof(searchPattern)];
                        fileReader.seekg((int)fileReader.tellg()-1);                        
                        fileReader.read(compareBuffer, sizeof(compareBuffer));
                        bool match = true;
                        for(int i = 0; i < sizeof(compareBuffer); i++){
                              match = compareBuffer[i] == searchPattern[i];
                              if(!match){
                                    break;
                              }
                              printf("Byte match found: 0x%x for searchPattern byte %d\n", compareBuffer[i], i+1);
                        }
                        if(!match){
                              //No match, adjust file pointer
                              fileReader.seekg((int)fileReader.tellg() - sizeof(searchPattern) + 1);
                              cout << "No match, offset adjusted to: " << (int)fileReader.tellg()<< endl;

                        } else {
                              //Save file offset of match
                              cout << "Offset of first byte for match: " << (int)fileReader.tellg() - sizeof(searchPattern) << endl;

                        }
                  }
            }
      } else {
            cout << "Error opening file";
      }
      fileReader.close();
      return 0;
}

It works fine when searching for numbers and letters. But as soon as I put something like 0xEB into the search pattern, the pattern is no longer found, even though it is in my test file. Here is the output I get when running the tool on my test file, which includes the 0xEB byte:

Searching for pattern: 0xeb 0x34 0x54 0x4
File opened
Current byte: 0x67
Current byte: 0xffffffeb
Current byte: 0x38
Current byte: 0x37
Current byte: 0x32
Current byte: 0x38
Current byte: 0x37
Current byte: 0x36
Current byte: 0x38
Current byte: 0x36
Current byte: 0x38
Current byte: 0x37
Current byte: 0x67
Current byte: 0x66
Current byte: 0x68
Current byte: 0x67
Current byte: 0x66
Current byte: 0x68
Current byte: 0x67
Current byte: 0x6b
Current byte: 0x67
Current byte: 0x67
Current byte: 0x38
Current byte: 0xffffffeb
Current byte: 0xffffffeb

The problem seems to be that FFFFFF is prepended to the 0xEB bytes. I don't know why this happens. Can someone help me with this?
Question by:b3n_
 
LVL 11

Accepted Solution

by:
DeepuAbrahamK earned 30 total points
ID: 18834631
0xffffffff shows up because the full 4 bytes are displayed.
(Hex) 0xffffffff = 11111111111111111111111111111111 (in binary)

Since you are using int, it will show the full 4 bytes; change it to 'short i' or downcast it to (short).

Best Regards,
DeepuAbrahamK
 
LVL 53

Assisted Solution

by:Infinity08
Infinity08 earned 20 total points
ID: 18834684
Try using unsigned char instead of char.
 
LVL 53

Expert Comment

by:Infinity08
ID: 18834713
Did DeepuAbrahamK's post help you ?

Changing i from int to short will not change anything for your problem as far as I can see ...

And short is usually 2 bytes, not 1 byte.
 

Author Comment

by:b3n_
ID: 18834723
It helped because he pointed out the general problem. I gave you points too because your solution helped me fix the program. Maybe I should have given you some more points, sorry for that.
 
LVL 53

Expert Comment

by:Infinity08
ID: 18834787
No, that's not the reason I asked - I just wanted to make sure that you knew that using "short" will not fix the problem.

I wasn't critiquing the way you allocated points, because that's your decision - you alone know which posts helped you :)
 
LVL 11

Expert Comment

by:DeepuAbrahamK
ID: 18834932
b3n, I overlooked that. I think you understood my intention correctly. Thanks.