How to Recursively Search Drive for Specific File Types

Posted on 2004-04-05
Last Modified: 2013-12-03
Hi,
I'm writing a utility program that will allow LAN Managers to remotely back up files on remote PCs as a preparatory step to the user either moving or receiving a new PC. Actually, the backup is not really a backup, but a copy to a folder of their choice. The main functionality behind this piece is based on some code I found, I think, on CodeGuru or CodeProject. Anyway, it worked fine when I passed the method the folder to start searching in, along with the file type to search for. However, a common complaint was that the LAN Managers would need to run this multiple times to capture the different files. For instance, they may have .doc files in ..\My Documents, and .xls files in \CurrentProjects. They would then have to check the "doc" extension check box, populate the start search text box, and run the program, and then check the "xls" check box and populate its start search text box. Well, one approach would have been to place a text box to start searching from next to each check box. However, I was trying to search from the root of the drive and compare each file's extension to the one being sought; if there was a match, a copy would follow. Needless to say, I have had some major issues trying to get it to work.

I'm posting the most current version of what DOESN'T work. If anyone has some ideas regarding how I can recursively find file types starting from the root, I would appreciate it.

Thanks,
Jeff


BOOL CFileBackupDlg::CopyExtension(CString csRoot, CString csDest, CString csCriteria)
{
      BOOL                        bRet = FALSE;
      BOOL                        bRes = FALSE;
      CString                        csPathMask;
      CString                        csFullPath;
      CString                        csNewFullPath;
      CString                        csNewPath;
      CString                        csMsg;
      CString                        strFirst;
      CString                        strSecond;
      CString                        csBuf;
      CString                        csSub;
      CString                        csToMatch;
      DWORD                        dwRet = 0;
      WIN32_FIND_DATA            fd;
      HANDLE                        hFind;
      int                              nchar = 0;

      ////////////////////////////////////////////////////////////////
      csRoot += _T("\\");
      csDest += _T("\\");

      csNewPath = csDest + csCriteria.Mid(2);
      CreateDirectory (csNewPath, NULL);
      csPathMask = csRoot + _T("*.*");

      hFind = FindFirstFile (csPathMask, &fd);
      if (hFind == INVALID_HANDLE_VALUE)
      {
            MessageBox ("The file criteria did not return any matches. Ensure the folder selection is correct and try again.",
                              "File Find Error", MB_OK);
            return FALSE;
      }
      else
      {
            // strip the extension from the incoming criteria
            nchar = csCriteria.Find (_T("."));
            if (nchar != -1)
                  csToMatch = csCriteria.Mid (nchar + 1);
            // see if the file found is one we're looking for
            csBuf = fd.cFileName;
            //strip the extension from the file we found
            nchar = csBuf.Find (_T("."));
            if (nchar != -1)
            {
                  csSub = csBuf.Mid (nchar + 1);
                  if (csSub.CompareNoCase (csToMatch) == 0)
                  {
                        csFullPath = csRoot + fd.cFileName;
                        csNewFullPath = csDest + fd.cFileName;
                        CopyFile (csFullPath, csNewFullPath, FALSE);
                        strFirst = GetCRC (csFullPath, dwRet);
                        strSecond = GetCRC (csNewFullPath, dwRet);
                        if (strFirst != strSecond)
                        MessageBox ("File copy integrity check failed! The file may be corrupt or missing!",
                                          csFullPath, MB_OK);

                        csMsg = "Copying: " + csFullPath;
                        SetDlgItemText (IDC_STATUS, csMsg);

                  }
            }
            while (hFind && FindNextFile (hFind, &fd))
            {
                  //need to add checking to check for matching extensions

                  nchar = csCriteria.Find (_T("."));
                  if (nchar != -1)
                        csToMatch = csCriteria.Mid (nchar + 1);
                  // see if the file found is one we're looking for
                  csBuf = fd.cFileName;
                  //strip the extension from the file we found
                  nchar = csBuf.Find (_T("."));
                  if (nchar != -1)
                  {
                        csSub = csBuf.Mid (nchar + 1);
                        if (csSub.CompareNoCase (csToMatch) == 0)
                        {
                              csFullPath = csRoot + fd.cFileName;
                              csNewFullPath = csDest + fd.cFileName;
                              // see if it's a file or a folder
                              if (!(fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
                              {
                                    bRes = CopyFile (csFullPath, csNewFullPath, FALSE);
                                    strFirst = GetCRC (csFullPath, dwRet);
                                    strSecond = GetCRC (csNewFullPath, dwRet);
                                    if (strFirst != strSecond)
                                          MessageBox ("File copy integrity check failed! The file may be corrupt or missing!",
                                                csFullPath, MB_OK);
                                    if (!bRes)
                                    {
                                          bRet = FALSE;
                                    }
                                    csMsg = "Copying: " + csFullPath;
                                    SetDlgItemText (IDC_STATUS, csMsg);
                              }
                              else // probably a directory
                              {
                                    if ((_tcscmp (fd.cFileName, _T(".")) != 0) && 
                                          (_tcscmp (fd.cFileName, _T("..")) != 0))
                                    {
                        
                                          if (!CopyFolder (csFullPath, csNewFullPath, csCriteria))
                                          {
                                                bRet = FALSE;
                                          }
                                    }
                              }
                        }

                  }
                  else // probably a directory
                  {
                        if ((_tcscmp (fd.cFileName, _T(".")) != 0) && 
                                    (_tcscmp (fd.cFileName, _T("..")) != 0))
                        {
                              if (!CopyFolder (csFullPath, csNewFullPath, csCriteria))
                              {
                                    bRet = FALSE;
                              }
                        }
                  }

            }
      }
      SetDlgItemText (IDC_STATUS, "");
      FindClose (hFind);
      return bRet;
                        
}

Question by:jpetter
12 Comments
 
LVL 86

Expert Comment

by:jkr
ID: 10759782
Could you be a bit more specific about what exactly is not working?
 
LVL 44

Expert Comment

by:Karl Heinz Kremer
ID: 10759814
I would create a string that contains all extensions you are interested in:
std::string extensions = ".doc.xls.pdf.txt";

Create this when the user clicks the "OK" button, after all the extensions that need to be backed up are selected. Then, in your loop that searches for the files, after you extract the extension of the current file (stored in the variable currentExtension), do something like this:

if (extensions.find(currentExtension) != std::string::npos)
{
    // file needs to be backed up
}
else
{
    // file does not need to be backed up - you may not need this else branch
}
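
A substring search like the one above also matches partial extensions (".xl" is found inside ".xls"), and looking for the first dot in a file name breaks on names such as Budget.2004.xls. As a rough sketch of a stricter check, assuming standard C++ rather than CString, the helper names extensionOf and shouldBackUp below are made up for illustration and are not part of the posted code:

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>

// Returns the lower-cased text after the LAST dot in the file name,
// or an empty string when there is no dot or nothing follows it.
std::string extensionOf(const std::string& fileName) {
    const std::string::size_type dot = fileName.rfind('.');
    if (dot == std::string::npos || dot + 1 == fileName.size()) {
        return "";
    }
    std::string ext = fileName.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext;
}

// True when the file's extension is one of the selected ones.
// `wanted` holds lower-case extensions without the dot, e.g. "doc".
bool shouldBackUp(const std::string& fileName, const std::set<std::string>& wanted) {
    const std::string ext = extensionOf(fileName);
    return !ext.empty() && wanted.count(ext) > 0;
}
```

The set would be filled once from the checked boxes, for example with "doc" and "xls", and shouldBackUp(fd.cFileName, wanted) would then be called for each file found (with a conversion first if the project is built as Unicode).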
 

Author Comment

by:jpetter
ID: 10759894
Sure. It does iterate through the files in the root, and through the debugger (BTW, MS VC++ v6) I can see it comparing the extensions to the search criteria. When the current file is a folder and I try to "step into" the recursive call, I can't for some reason. However, it bombs out at the beginning of the method, where I check whether I have a valid file handle returned from FindFirstFile. I wish I was more competent with the debugger, because I would like to see what is being passed for the values. Normally I have no problem, but when it's called recursively I can't see it.

Thanks,
Jeff
 
LVL 44

Expert Comment

by:Karl Heinz Kremer
ID: 10759941
Looks like I misunderstood your question :-)

Can you please provide the CopyFolder() code as well.
 
LVL 44

Assisted Solution

by:Karl Heinz Kremer
Karl Heinz Kremer earned 540 total points
ID: 10759949
BTW: Usually when you put your mouse over a variable name in the debugger, you will get a tooltip with the current value.
 
LVL 86

Assisted Solution

by:jkr
jkr earned 540 total points
ID: 10759954
>> When the current file is a folder, and I try to "step into" the recursive call, I can't for some reason.

The reason seems to be that in

                        else // probably a directory
                        {
                             if ((_tcscmp (fd.cFileName, _T(".")) != 0) && 
                                   (_tcscmp (fd.cFileName, _T("..")) != 0))
                             {
                   
                                  if (!CopyFolder (csFullPath, csNewFullPath, csCriteria))
                                  {
                                       bRet = FALSE;
                                  }
                             }
                        }

you are

a) calling a different function ('CopyFolder()' instead of 'CopyExtension()')
b) not appending fd.cFileName to csFullPath before making the call
 

Author Comment

by:jpetter
ID: 10760038
jkr &  khkremer,

You guys have given me some great help. Let me play around with this a second.

One problem, as both of you caught: I modified my CopyFolder function to end up with CopyExtension, but never changed the call where I invoke it recursively. And that is probably why I don't append to csFullPath, because the variable names are different.

If this turns out to be the solution, what a total tool I'll feel like. How careless.

Thanks, and I'll get right back.
 

Author Comment

by:jpetter
ID: 10760612
Well, those eye openers that caught my oversights did allow me to get further. Now I just have to rework the function a little so that it doesn't die when I hit an empty folder.

Thanks,
Jeff
 
LVL 44

Expert Comment

by:Karl Heinz Kremer
ID: 10761989
Where does it die in an empty folder?
 
LVL 12

Assisted Solution

by:Salte
Salte earned 220 total points
ID: 10763599
I think you might have problems due to the find file handle?

Note that each search creates a find handle and there is a point in minimizing those.

It might therefore NOT be a good idea to simply scan through each file and if it is a directory you recurse into it immediately and if it is a file you copy it.

If it is a file you can copy it alright but if it is a directory I suggest you just put it in a queue or stack and process that dir after you are done with current dir. If you use a queue the copy will be breadth first and if you use a stack it will be depth first.

So, I suggest you make a simple stack or queue, the queue can be as simple as a queue of strings holding the directory names (full path) of the directory to scan.

Step 1. Get a stack class. STL has a vector that can be used as a stack, or you can make your own, or (since I see you use MFC a lot) you can use MFC's array class as a stack.

Step 2. Initialize the stack with the root directory you want to search.

Step 3. Loop while the stack is non-empty: pop the top element from the stack and scan that directory. For each element in the scan, if it is a file, copy it, creating directories as necessary to do the copy. If it is a directory, push it onto the stack.

When the loop is done your copying is done.

Also, since you use an explicit stack you don't even have to recurse in your program, a simple loop is all:


stack.push(RootDir);
while (! stack.is_empty()) {
    CString dirname = stack.pop();
    // open a find handle and scan the dir here.
    // for each file entry check if it should be copied, scanned or ignored.
   while (FindNextFile(....)) {
        if (current file is dir) {
             stack.push(full_file_name); // May have to construct that name first.
       } else if (current file should be copied)  {
             copy_file(....);
       } // else it should be ignored.
   }
    close_search_handle();
}

Try this and see if it works better. The clue is to not have too many open search handles at the same time. This code uses only ONE search handle for all the directories since it closes the handle before it searches next directory.

Hope this is of help.

Alf
 
LVL 3

Accepted Solution

by:akalmani
akalmani earned 700 total points
ID: 10764013
//Win32 way no MFC usage....
//szDir must end with a backslash, e.g. _T("C:\\")
void DoIt(LPCTSTR szDir)
{
   WIN32_FIND_DATA FileData;
   HANDLE hSearch = INVALID_HANDLE_VALUE;
   _TCHAR szPath[MAX_PATH] = _T("");     //MAX_PATH is 260 in the Windows headers

  _tcscpy(szPath, szDir);
  _tcscat(szPath, _T("*"));

   hSearch = FindFirstFile(szPath, &FileData);
   if(INVALID_HANDLE_VALUE == hSearch)
   {
        //Nothing to scan (or the directory could not be read); no handle to close
        return;
   }

   //do/while so that the entry returned by FindFirstFile is processed as well
   do
   {
        //Skip the . and .. special entries
        if((_tcsicmp(FileData.cFileName, _T(".")) == 0) ||
           (_tcsicmp(FileData.cFileName, _T("..")) == 0))
        {
           continue;
        }

        //Check if it is a directory (dwFileAttributes is a bit mask, so test the bit)
        if(FileData.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
        {
           _tcscpy(szPath, szDir);
           _tcscat(szPath, FileData.cFileName);
           _tcscat(szPath, _T("\\"));

           DoIt(szPath);//Recursive call
        }
        else
        {
           //szDir will contain the path. Do your file copy after checking the extension here
        }
   } while(FindNextFile(hSearch, &FileData));

   //Close the search handle.
   FindClose(hSearch);
}


//Give credit to original author Nonubik. This uses MFC
void DoIt(LPCTSTR szDir)
{
  CFileFind   Finder;
  CString     strPath(szDir);
  strPath += _T("\\*");
  BOOL bFind = Finder.FindFile(strPath);
  while(bFind)
  {
    bFind = Finder.FindNextFile();
   
    // skip . and .. files; otherwise, we'd
    // recur infinitely!
    if(Finder.IsDots())
         continue;

    // if it's a directory, recursively search it
    if (Finder.IsDirectory())
    {
      DoIt(Finder.GetFilePath());
    }
    else
    {
      //Use Finder.GetFilePath() to get the path. Copy file after extension check here.
    }
  }
}

//Refer to recurse a directory
http://www.experts-exchange.com/Programming/Programming_Languages/Cplusplus/Q_20935224.html
 

Author Comment

by:jpetter
ID: 10768073
I can't thank all of you enough for all of your help. The support I've received on this has been terrific!

I'll split up the points, and wish I had more to go around.

Thanks again,
Jeff