Solved

How to Recursively Search Drive for Specific File Types

Posted on 2004-04-05
Last Modified: 2013-12-03
Hi,
I'm writing a utility program that will allow LAN Managers to remotely back up files on remote PCs as a preparatory step to the user either moving or receiving a new PC. Actually, the backup is not really a backup, but a copy to a folder of their choice. The main functionality behind this piece is based on some code I found, I think, on CodeGuru or CodeProject. It worked fine when I passed the method the folder to start searching in, along with the file type to search for. However, a common complaint was that the LAN Managers had to run it multiple times to capture the different files. For instance, they may have .doc files in ..\My Documents and .xls files in \CurrentProjects. They would have to check the "doc" extension check box, populate the start-search text box, and run the program, and then check the "xls" check box, populate its start-search text box, and run it again. One approach would have been to place a start-search text box next to each check box. Instead, I was trying to search from the root of the drive and compare each file's extension to the one being sought; if there was a match, a copy would follow. Needless to say, I have had some major issues trying to get it to work.

I'm posting the most current version of what DOESN'T work. If anyone has ideas on how I can recursively find file types starting from the root, I would appreciate it.

Thanks,
Jeff


BOOL CFileBackupDlg::CopyExtension(CString csRoot, CString csDest, CString csCriteria)
{
      BOOL                        bRet = FALSE;
      BOOL                        bRes = FALSE;
      CString                        csPathMask;
      CString                        csFullPath;
      CString                        csNewFullPath;
      CString                        csNewPath;
      CString                        csMsg;
      CString                        strFirst;
      CString                        strSecond;
      CString                        csBuf;
      CString                        csSub;
      CString                        csToMatch;
      DWORD                        dwRet = 0;
      WIN32_FIND_DATA            fd;
      HANDLE                        hFind;
      int                              nchar = 0;

      ////////////////////////////////////////////////////////////////
      csRoot += _T("\\");
      csDest += _T("\\");

      csNewPath = csDest + csCriteria.Mid(2);
      CreateDirectory (csNewPath, NULL);
      csPathMask = csRoot + _T("*.*");

      hFind = FindFirstFile (csPathMask, &fd);
      if (hFind == INVALID_HANDLE_VALUE)
      {
            MessageBox ("The file criteria did not return any matches. Ensure the folder selection is correct and try again.",
                              "File Find Error", MB_OK);
            return FALSE;
      }
      else
      {
            // strip the extension from the incoming criteria
            nchar = csCriteria.Find (_T("."));
            if (nchar != -1)
                  csToMatch = csCriteria.Mid (nchar + 1);
            // see if the file found is one we're looking for
            csBuf = fd.cFileName;
            //strip the extension from the file we found
            nchar = csBuf.Find (_T("."));
            if (nchar != -1)
            {
                  csSub = csBuf.Mid (nchar + 1);
                  if (csSub.CompareNoCase (csToMatch) == 0)
                  {
                        csFullPath = csRoot + fd.cFileName;
                        csNewFullPath = csDest + fd.cFileName;
                        CopyFile (csFullPath, csNewFullPath, FALSE);
                        strFirst = GetCRC (csFullPath, dwRet);
                        strSecond = GetCRC (csNewFullPath, dwRet);
                        if (strFirst != strSecond)
                        MessageBox ("File copy integrity check failed! The file may be corrupt or missing!",
                                          csFullPath, MB_OK);

                        csMsg = "Copying: " + csFullPath;
                        SetDlgItemText (IDC_STATUS, csMsg);

                  }
            }
            while (hFind && FindNextFile (hFind, &fd))
            {
                  //need to add checking to check for matching extensions

                  nchar = csCriteria.Find (_T("."));
                  if (nchar != -1)
                        csToMatch = csCriteria.Mid (nchar + 1);
                  // see if the file found is one we're looking for
                  csBuf = fd.cFileName;
                  //strip the extension from the file we found
                  nchar = csBuf.Find (_T("."));
                  if (nchar != -1)
                  {
                        csSub = csBuf.Mid (nchar + 1);
                        if (csSub.CompareNoCase (csToMatch) == 0)
                        {
                              csFullPath = csRoot + fd.cFileName;
                              csNewFullPath = csDest + fd.cFileName;
                              // see if it's a file or a folder
                              if (!(fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
                              {
                                    bRes = CopyFile (csFullPath, csNewFullPath, FALSE);
                                    strFirst = GetCRC (csFullPath, dwRet);
                                    strSecond = GetCRC (csNewFullPath, dwRet);
                                    if (strFirst != strSecond)
                                          MessageBox ("File copy integrity check failed! The file may be corrupt or missing!",
                                                csFullPath, MB_OK);
                                    if (!bRes)
                                    {
                                          bRet = FALSE;
                                    }
                                    csMsg = "Copying: " + csFullPath;
                                    SetDlgItemText (IDC_STATUS, csMsg);
                              }
                              else // probably a directory
                              {
                                    if ((_tcscmp (fd.cFileName, _T(".")) != 0) && 
                                          (_tcscmp (fd.cFileName, _T("..")) != 0))
                                    {
                        
                                          if (!CopyFolder (csFullPath, csNewFullPath, csCriteria))
                                          {
                                                bRet = FALSE;
                                          }
                                    }
                              }
                        }

                  }
                  else // probably a directory
                  {
                        if ((_tcscmp (fd.cFileName, _T(".")) != 0) && 
                                    (_tcscmp (fd.cFileName, _T("..")) != 0))
                        {
                              if (!CopyFolder (csFullPath, csNewFullPath, csCriteria))
                              {
                                    bRet = FALSE;
                              }
                        }
                  }

            }
      }
      SetDlgItemText (IDC_STATUS, "");
      FindClose (hFind);
      return bRet;
                        
}

Question by:jpetter
12 Comments
 
LVL 86

Expert Comment

by:jkr
ID: 10759782
Could you be a bit more specific about what exactly is not working?
0
 
LVL 44

Expert Comment

by:Karl Heinz Kremer
ID: 10759814
I would create a string that contains all the extensions you are interested in:
std::string extensions = ".doc.xls.pdf.txt";

Create this when the user clicks the "OK" button, after all the extensions that need to be backed up are selected. Then, in your loop that searches for the files, after you extract the extension of the current file (stored in the std::string variable currentExtension, including the leading dot), do something like this:

if (extensions.find(currentExtension) != std::string::npos)
{
    // file needs to be backed up
}
else
{
    // file does not need to be backed up - you may not need this else branch
}
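
One caveat with a plain substring search is that a partial extension such as ".do" would also be found inside ".doc.xls.pdf.txt". A minimal sketch of an exact, case-insensitive match is shown here. It assumes a modern C++ compiler rather than VC++ 6, and the names ExtensionOf and ShouldBackUp are made up for illustration. It also takes the extension from the last dot in the name, so a file called "report.final.doc" is treated as a .doc file.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>

// Returns the lower-cased extension of a file name, including the dot,
// or an empty string when the name has no extension.
// Assumption: a name that starts with its only dot (".profile") is
// treated as having no extension.
std::string ExtensionOf(const std::string& fileName)
{
    // Use the LAST dot so "report.final.doc" yields ".doc".
    const std::string::size_type dot = fileName.find_last_of('.');
    if (dot == std::string::npos || dot == 0)
        return std::string();

    std::string ext = fileName.substr(dot);
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext;
}

// True when the file's extension is one of the wanted ones.
// 'wanted' is expected to hold lower-case entries such as ".doc".
bool ShouldBackUp(const std::string& fileName, const std::set<std::string>& wanted)
{
    const std::string ext = ExtensionOf(fileName);
    return !ext.empty() && wanted.count(ext) != 0;
}
```

The set would be filled once from the checked boxes, for example with ".doc" and ".xls", and ShouldBackUp called for every file name the search returns.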
0
 

Author Comment

by:jpetter
ID: 10759894
Sure. It does iterate through the files in the root, and through the debugger (BTW, MS VC++ v6) I can see it comparing the extensions to the search criteria. When the current file is a folder and I try to "step into" the recursive call, I can't for some reason. However, it bombs out at the beginning of the method, where I check whether FindFirstFile returned a valid file handle. I wish I were more competent with the debugger, because I would like to see what is being passed for the values. Normally I have no problem, but when the method is called recursively I can't see them.

Thanks,
Jeff
0

 
LVL 44

Expert Comment

by:Karl Heinz Kremer
ID: 10759941
Looks like I misunderstood your question :-)

Can you please provide the CopyFolder() code as well.
0
 
LVL 44

Assisted Solution

by:Karl Heinz Kremer
Karl Heinz Kremer earned 135 total points
ID: 10759949
BTW: Usually when you put your mouse over a variable name in the debugger, you will get a tooltip with the current value.
0
 
LVL 86

Assisted Solution

by:jkr
jkr earned 135 total points
ID: 10759954
>> When the current file is a folder, and I try to "step into" the recursive call, I can't for some reason.

The reason seems to be that in

                        else // probably a directory
                        {
                             if ((_tcscmp (fd.cFileName, _T(".")) != 0) && 
                                   (_tcscmp (fd.cFileName, _T("..")) != 0))
                             {
                   
                                  if (!CopyFolder (csFullPath, csNewFullPath, csCriteria))
                                  {
                                       bRet = FALSE;
                                  }
                             }
                        }

you are

a) calling a different function ('CopyFolder()' instead of 'CopyExtension()')
b) not appending fd.cFileName to csFullPath before making the call
0
 

Author Comment

by:jpetter
ID: 10760038
jkr &  khkremer,

You guys have given me some great help. Let me play around with this a second.

One problem, as both of you caught. I modified my CopyFolder function to end up with CopyExtension, but never changed the call when I go to call it recursively. And that is probably why I don't append the csFullPath, because the variable names are different.

If this turns out to be the solution, what a total tool I'll feel like. How careless.

Thanks, and I'll get right back.
0
 

Author Comment

by:jpetter
ID: 10760612
Well, those eye openers that caught my oversights did allow me to get further. Now I just have to rework the function a little so that it doesn't die when I hit an empty folder.

Thanks,
Jeff
0
 
LVL 44

Expert Comment

by:Karl Heinz Kremer
ID: 10761989
Where does it die in an empty folder?
0
 
LVL 12

Assisted Solution

by:Salte
Salte earned 55 total points
ID: 10763599
I think you might have problems due to the find file handle?

Note that each search creates a find handle and there is a point in minimizing those.

It might therefore NOT be a good idea to simply scan through each file and if it is a directory you recurse into it immediately and if it is a file you copy it.

If it is a file you can copy it alright but if it is a directory I suggest you just put it in a queue or stack and process that dir after you are done with current dir. If you use a queue the copy will be breadth first and if you use a stack it will be depth first.

So, I suggest you make a simple stack or queue, the queue can be as simple as a queue of strings holding the directory names (full path) of the directory to scan.

step 1 - get a stack class. STL has a vector that can be used as stack or you can make your own or - since I see you use MFC a lot - you can use MFC's array class as a stack.

step 2. Initialize the stack with the root directory you want to search.

step 3. Loop while the stack is non-empty: pop the top element from the stack and scan that directory. For each element in the scan, if it is a file, copy it, creating directories as necessary to do the copy. If it is a directory, push it onto the stack.

When the loop is done your copying is done.

Also, since you use an explicit stack you don't even have to recurse in your program, a simple loop is all:


stack.push(RootDir);
while (! stack.is_empty()) {
    CString dirname = stack.pop();
    // open a find handle and scan the dir here.
    // for each file entry check if it should be copied, scanned or ignored.
   while (FindNextFile(....)) {
        if (current file is dir) {
             stack.push(full_file_name); // May have to construct that name first.
       } else if (current file should be copied)  {
             copy_file(....);
       } // else it should be ignored.
   }
    close_search_handle();
}

Try this and see if it works better. The clue is to not have too many open search handles at the same time. This code uses only ONE search handle for all the directories since it closes the handle before it searches next directory.

Hope this is of help.

Alf
0
 
LVL 3

Accepted Solution

by:
akalmani earned 175 total points
ID: 10764013
//Win32 way no MFC usage....
//szDir must end with a backslash, e.g. _T("C:\\")
void DoIt(LPCTSTR szDir)
{
   WIN32_FIND_DATA FileData;
   HANDLE hSearch = INVALID_HANDLE_VALUE;
   _TCHAR szPath[MAX_PATH] = _T("");     //MAX_PATH is 260 in the Windows headers

   //Build the search mask; give up if it would not fit in the buffer
   if(_tcslen(szDir) + 2 > MAX_PATH)
      return;
   _tcscpy(szPath, szDir);
   _tcscat(szPath, _T("*"));

   hSearch = FindFirstFile(szPath, &FileData);
   if(INVALID_HANDLE_VALUE == hSearch)
      return;   //Nothing to enumerate, and no handle to close

   do   //do/while so the entry returned by FindFirstFile is processed too
   {
      //Skip the . and .. special entries
      if((_tcsicmp(FileData.cFileName, _T(".")) == 0) ||
         (_tcsicmp(FileData.cFileName, _T("..")) == 0))
      {
         continue;
      }

      //Check if it is a directory (dwFileAttributes is a bit mask, so test the bit)
      if(FileData.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
      {
         if(_tcslen(szDir) + _tcslen(FileData.cFileName) + 2 > MAX_PATH)
            continue;   //Path too long for the buffer
         _tcscpy(szPath, szDir);
         _tcscat(szPath, FileData.cFileName);
         _tcscat(szPath, _T("\\"));

         DoIt(szPath);//Recursive call
      }
      else
      {
         //szDir contains the folder and FileData.cFileName the file name. Do your file copy after checking the extension here
      }
   } while(FindNextFile(hSearch, &FileData));

   //Close the search handle.
   FindClose(hSearch);
}


//Give credit to original author Nonubik. This uses MFC
void DoIt(LPCTSTR szDir)
{
  CFileFind   Finder;
  CString     strPath(szDir);
  strPath += _T("\\*");
  BOOL bFind = Finder.FindFile(strPath);
  while(bFind)
  {
    bFind = Finder.FindNextFile();
   
    // skip . and .. files; otherwise, we'd
    // recur infinitely!
    if(Finder.IsDots())
         continue;

    // if it's a directory, recursively search it
    if (Finder.IsDirectory())
    {
      DoIt(Finder.GetFilePath());
    }
    else
    {
      //Use Finder.GetFilePath() to get the path. Copy file after extension check here.
    }
  }
}

//Refer to recurse a directory
http://www.experts-exchange.com/Programming/Programming_Languages/Cplusplus/Q_20935224.html
0
 

Author Comment

by:jpetter
ID: 10768073
I can't thank all of you enough for all of your help. The support I've received on this has been terrific!

I'll split up the points, and wish I had more to go around.

Thanks again,
Jeff
0
