Solved

Tell if a file is an EXE in disguise

Posted on 2004-04-15
Last Modified: 2008-02-26
Let's say a naughty user was keeping an unlicensed copy of WinZip on his machine, and whenever he isn't using it he renames it from winzip.exe to db1.mdb, then renames it back to winzip.exe whenever he wants to use it.

Is there any (fast) way of opening db1.mdb with CreateFile and looking at some of the bytes inside to tell whether the file is really an EXE?

Bear in mind that I would also like to inspect EXEs that don't carry versioninfo structures, if possible.

thanks
Question by:plq
14 Comments
 
LVL 17

Expert Comment

by:rstaveley
ID: 10834055
Look at the first two bytes and see if they are MZ. That's a good indicator.
 
LVL 14

Assisted Solution

by:wayside
wayside earned 200 total points
ID: 10834069
The first two bytes of the file contain a "magic number" that tells you what kind of file it is.

For a Windows exe, the first two bytes are generally the ASCII characters 'M' and 'Z'. (I think programs compiled with Borland compilers use 'M' and something else; I will see if I can dig this up.)

For unix there will be something similar.

So you can just read the first two bytes to figure it out.
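
A minimal sketch of that two-byte check, written in portable C with stdio so it can be tried anywhere. The function names has_mz_signature and file_has_mz_signature are made up for illustration. Keep in mind that "MZ" also appears at the start of DLLs, drivers and old DOS programs, so a match means "some kind of executable image", not specifically a runnable .exe.

```c
#include <stdio.h>
#include <stddef.h>

/* Returns 1 if the buffer starts with the DOS "MZ" magic (0x4D 0x5A), else 0. */
int has_mz_signature(const unsigned char *buf, size_t len)
{
    return len >= 2 && buf[0] == 'M' && buf[1] == 'Z';
}

/* Reads only the first two bytes of the file at `path`.
   Returns 1 for "MZ", 0 for anything else, -1 if the file cannot be opened. */
int file_has_mz_signature(const char *path)
{
    unsigned char magic[2];
    size_t got;
    FILE *fp = fopen(path, "rb");   /* binary mode matters on Windows */

    if (fp == NULL)
        return -1;
    got = fread(magic, 1, sizeof magic, fp);
    fclose(fp);
    return has_mz_signature(magic, got);
}
```

On Windows the same read could be done with CreateFile and ReadFile instead of fopen and fread; the comparison itself stays the same.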

If you want to probe deeper, you can extract out the PE structures from the first few hundred bytes and look for some other things like the 'P' and 'E' and two nulls in a row, which mark the start of the 32 bit portion of the exe.

If you want to go this far, let me know, I can help you with that.
 
LVL 13

Expert Comment

by:SteH
ID: 10834105
The first two bytes are 4D 5A hex, and starting at around offset 0x4E (the exact position depends on the linker) there is a stub message that is displayed if the file is run in DOS mode, such as:
This program cannot be run in DOS mode
This program must be run under Win32

Take a hex editor and open some files of your interest and have a look at them
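
If no hex editor is at hand, a few lines of C can show the same thing. This is only a rough sketch; dump_file_head is a hypothetical helper name, and the output layout (offset, hex bytes, printable ASCII) is just one common convention.

```c
#include <stdio.h>
#include <ctype.h>

/* Prints the first `count` bytes of `path` to `out`, 16 bytes per row,
   as offset, hex values and printable ASCII.
   Returns the number of bytes dumped, or -1 if the file cannot be opened. */
long dump_file_head(const char *path, long count, FILE *out)
{
    unsigned char row[16];
    long offset = 0;
    size_t got, i;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;

    while (offset < count && (got = fread(row, 1, sizeof row, fp)) > 0) {
        /* Do not print more than the caller asked for. */
        if ((long)got > count - offset)
            got = (size_t)(count - offset);

        fprintf(out, "%08lX ", (unsigned long)offset);
        for (i = 0; i < sizeof row; i++) {
            if (i < got)
                fprintf(out, " %02X", row[i]);
            else
                fputs("   ", out);          /* pad a short last row */
        }
        fputs("  ", out);
        for (i = 0; i < got; i++)
            fputc(isprint(row[i]) ? row[i] : '.', out);
        fputc('\n', out);

        offset += (long)got;
    }
    fclose(fp);
    return offset;
}
```

Calling dump_file_head("db1.mdb", 128, stdout) on a renamed executable should show the MZ bytes in the first row and, a few rows later, the DOS stub text.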
 
LVL 86

Expert Comment

by:jkr
ID: 10834314
If you have VC++, 'dumpbin <filename>' will give you the appropriate information. See also PEViewer (http://www.magma.ca/~wjr).
 
LVL 14

Expert Comment

by:wayside
ID: 10834413
A little bit more than just checking the first two bytes might be:

1) Read the first two bytes and check for 'M' and 'Z'; if they are not there you are done, it's not an exe.
2) Read the next 58 bytes and throw them away.
3) Read the next 4 bytes as a long (32-bit little-endian integer). This is the file offset of the PE header; in practice it is less than 1024 (I've never seen a bigger one, anyway).
4) Read the next (fileoffset - 64 + 4) bytes. If the last 4 bytes read are 'P', 'E', '\0', '\0', then it is an executable.

So the most you will read is a few hundred bytes.
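
The steps above can be sketched roughly as follows. This is an illustration in portable C, not tested against a large set of real files; file_is_pe is a made-up name, and it seeks to the PE offset instead of reading and discarding the bytes in between, which amounts to the same check.

```c
#include <stdio.h>
#include <limits.h>

/* Returns 1 if the file has an "MZ" header that points at a "PE\0\0"
   signature, 0 if it does not, -1 if the file cannot be opened. */
int file_is_pe(const char *path)
{
    unsigned char hdr[64];      /* the DOS header is 64 bytes long */
    unsigned char sig[4];
    unsigned long pe_offset;
    int result = 0;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;

    if (fread(hdr, 1, sizeof hdr, fp) == sizeof hdr
        && hdr[0] == 'M' && hdr[1] == 'Z')
    {
        /* Bytes 60..63 hold the little-endian file offset of the PE header. */
        pe_offset = (unsigned long)hdr[60]
                  | ((unsigned long)hdr[61] << 8)
                  | ((unsigned long)hdr[62] << 16)
                  | ((unsigned long)hdr[63] << 24);

        if (pe_offset <= (unsigned long)LONG_MAX
            && fseek(fp, (long)pe_offset, SEEK_SET) == 0
            && fread(sig, 1, sizeof sig, fp) == sizeof sig
            && sig[0] == 'P' && sig[1] == 'E' && sig[2] == 0 && sig[3] == 0)
        {
            result = 1;
        }
    }
    fclose(fp);
    return result;
}
```

A file that passes the MZ test but fails the PE test may still be an old DOS or 16-bit Windows program, so whether to flag those is a separate decision.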

 
LVL 86

Expert Comment

by:jkr
ID: 10834483
>>A little bit more than just checking the first two bytes might be

That's pretty much what PEViewer or dumpbin does: analyzing the PE header.

BTW, there is even an API:

SHFILEINFO sfi;
if ( !SHGetFileInfo ( "c:\\path\\file.ext", 0, &sfi, sizeof ( sfi), SHGFI_EXETYPE) {

    // no executable file

} else {

    // executable file
}
 
LVL 8

Author Comment

by:plq
ID: 10834493
wayside, thanks very much for this.

I'll keep the question open for a couple of days while I play around with these ideas

thanks also to everyone else who answered
 
LVL 86

Expert Comment

by:jkr
ID: 10834508
Ooops, that should of course be

SHFILEINFO sfi;
if ( !SHGetFileInfo ( "c:\\path\\file.ext", 0, &sfi, sizeof ( sfi), SHGFI_EXETYPE)) {

   // no executable file

} else {

   // executable file
}
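
As far as I read the SHGetFileInfo documentation, with SHGFI_EXETYPE the return value also says what kind of executable was found: the low word carries the signature (MZ, NE or PE) and the high word a Windows version, or 0 for DOS and console programs. The sketch below only decodes such a value in plain C so it can be compiled anywhere; the enum and the name classify_exetype are invented for illustration, and the numeric signatures are my assumption that they match IMAGE_DOS_SIGNATURE, IMAGE_OS2_SIGNATURE and IMAGE_NT_SIGNATURE.

```c
#include <stdint.h>

enum exe_kind { EXE_NONE, EXE_DOS, EXE_CONSOLE, EXE_WINDOWS, EXE_UNKNOWN };

/* Decodes a value shaped like the SHGFI_EXETYPE return value.
   Assumed signatures: 0x5A4D = "MZ", 0x454E = "NE", 0x4550 = "PE". */
enum exe_kind classify_exetype(uint32_t ret)
{
    uint16_t sig = (uint16_t)(ret & 0xFFFFu);   /* low word: signature        */
    uint16_t ver = (uint16_t)(ret >> 16);       /* high word: Windows version */

    if (ret == 0)
        return EXE_NONE;            /* not executable, or an error */
    if (sig == 0x5A4D && ver == 0)
        return EXE_DOS;             /* MS-DOS .exe or .com */
    if (sig == 0x4550 && ver == 0)
        return EXE_CONSOLE;         /* console application or .bat */
    if (sig == 0x4550 || sig == 0x454E)
        return EXE_WINDOWS;         /* Windows application */
    return EXE_UNKNOWN;
}
```

In a real Win32 build the argument would be the value returned by SHGetFileInfo, cast down to 32 bits.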
 
LVL 8

Author Comment

by:plq
ID: 10834549
thanks jkr, that might actually be a better option, although wayside's option might be faster if it's reliable/consistent.

I'll test both.
 
LVL 86

Accepted Solution

by:
jkr earned 300 total points
ID: 10834618
>>if its reliable/consistent

See http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndebug/html/msdn_peeringpe.asp ("Peering Inside the PE: A Tour of the Win32 Portable Executable File Format") - it comes with sample code:

//--------------------
// PROGRAM: PEDUMP
// FILE:    PEDUMP.C
// AUTHOR:  Matt Pietrek - 1993
//--------------------
#include <windows.h>
#include <stdio.h>
#include <string.h>     // for strupr()
#include "objdump.h"
#include "exedump.h"
#include "extrnvar.h"

// Global variables set here, and used in EXEDUMP.C and OBJDUMP.C
BOOL fShowRelocations = FALSE;
BOOL fShowRawSectionData = FALSE;
BOOL fShowSymbolTable = FALSE;
BOOL fShowLineNumbers = FALSE;

char HelpText[] =
"PEDUMP - Win32/COFF .EXE/.OBJ file dumper - 1993 Matt Pietrek\n\n"
"Syntax: PEDUMP [switches] filename\n\n"
"  /A    include everything in dump\n"
"  /H    include hex dump of sections\n"
"  /L    include line number information\n"
"  /R    show base relocations\n"
"  /S    show symbol table\n";

// Open up a file, memory map it, and call the appropriate dumping routine
void DumpFile(LPSTR filename)
{
    HANDLE hFile;
    HANDLE hFileMapping;
    LPVOID lpFileBase;
    PIMAGE_DOS_HEADER dosHeader;
   
    hFile = CreateFile(filename, GENERIC_READ, FILE_SHARE_READ, NULL,
                        OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
                   
    if ( hFile == INVALID_HANDLE_VALUE )
    {   printf("Couldn't open file with CreateFile()\n");
        return; }
   
    hFileMapping = CreateFileMapping(hFile, NULL, PAGE_READONLY, 0, 0, NULL);
    if ( hFileMapping == 0 )
    {   CloseHandle(hFile);
        printf("Couldn't open file mapping with CreateFileMapping()\n");
        return; }
   
    lpFileBase = MapViewOfFile(hFileMapping, FILE_MAP_READ, 0, 0, 0);
    if ( lpFileBase == 0 )
    {
        CloseHandle(hFileMapping);
        CloseHandle(hFile);
        printf("Couldn't map view of file with MapViewOfFile()\n");
        return;
    }

    printf("Dump of file %s\n\n", filename);
   
    dosHeader = (PIMAGE_DOS_HEADER)lpFileBase;
    if ( dosHeader->e_magic == IMAGE_DOS_SIGNATURE )
       { DumpExeFile( dosHeader ); }
    else if ( (dosHeader->e_magic == 0x014C)    // Does it look like a i386
              && (dosHeader->e_sp == 0) )        // COFF OBJ file???
    {
        // The two tests above aren't what they look like.  They're
        // really checking for IMAGE_FILE_HEADER.Machine == i386 (0x14C)
        // and IMAGE_FILE_HEADER.SizeOfOptionalHeader == 0;
        DumpObjFile( (PIMAGE_FILE_HEADER)lpFileBase );
    }
    else
        printf("unrecognized file format\n");
    UnmapViewOfFile(lpFileBase);
    CloseHandle(hFileMapping);
    CloseHandle(hFile);
}

// process all the command line arguments and return a pointer to
// the filename argument.
PSTR ProcessCommandLine(int argc, char *argv[])
{
    int i;
   
    for ( i=1; i < argc; i++ )
    {
        strupr(argv[i]);
       
        // Is it a switch character?
        if ( (argv[i][0] == '-') || (argv[i][0] == '/') )
        {
            if ( argv[i][1] == 'A' )
            {   fShowRelocations = TRUE;
                fShowRawSectionData = TRUE;
                fShowSymbolTable = TRUE;
                fShowLineNumbers = TRUE; }
            else if ( argv[i][1] == 'H' )
                fShowRawSectionData = TRUE;
            else if ( argv[i][1] == 'L' )
                fShowLineNumbers = TRUE;
            else if ( argv[i][1] == 'R' )
                fShowRelocations = TRUE;
            else if ( argv[i][1] == 'S' )
                fShowSymbolTable = TRUE;
        }
        else    // Not a switch character.  Must be the filename
        {   return argv[i]; }
    }
    return NULL;    // No filename argument was found
}

int main(int argc, char *argv[])
{
    PSTR filename;
   
    if ( argc == 1 )
    {   printf( "%s", HelpText );
        return 1; }
   
    filename = ProcessCommandLine(argc, argv);
    if ( filename )
        DumpFile( filename );
    return 0;
}
 
LVL 8

Author Comment

by:plq
ID: 10837302
wayside - let me know on the borland if you can. I'm gonna split points 50 50.
 
LVL 8

Author Comment

by:plq
ID: 10837363
I'd also like to know whether other Windows compilers, like GNU or whatever, use different codes from MZ.

jkr - what's your gut feeling on whether the above code would be slower than a simple read? Given that we're going to be scanning the whole hard disk on a low-priority thread, performance will be critical.
 
LVL 14

Expert Comment

by:wayside
ID: 10837506
Looks like Borland usually starts with 'MZP', so you are covered by just checking for 'MZ'.


 
LVL 8

Author Comment

by:plq
ID: 10867737
Ahhh EE C++ is just the best TA in the world

thanks to all