Recursive file and folder audit... T-SQL

Starting with the following files in drive C:

C:
  Folder_1
               FolderA
                      a.xls
                      b.xls
             c.xls  <-- this file is in Folder_1
 Folder_2
            d.xls
            e.xls

Note: There could be many levels of subfolders. I am trying to read file names and folder names to enter them in a table like:

tblFIleAndFolders
=============================
ID_PK           Name               ID_FK            FileYN
1                C:                      null
     2          Folder_1                  1    <-- meaning it is in C:
     3          c.xls                         2                    Yes
     4          FolderA                   2
     5          a.xls                        4                    Yes
     6          b.xls                        4                    Yes
     7         Folder_2                  1
     8          d.xls                        7                    Yes
     9          e.xls                        7                    Yes

Question: Could this be done recursively to include all .xls files, folders, and subfolders (regardless it has .xls or not)?

Order of entries is not important. I will, at some point, expand this to include .doc, .png, etc.

There is an undocumented PROC which does something similar. I wonder if it could be modified to list select files and folders, .xls for example which we can pass as a parameter, in this table as shown above. This proc is discussed at: http://www.kodyaz.com/articles/file-folder-list-xp_dirtree-sql-server-stored-procedure-recursive-cte.aspx
LVL 34
Mike EghtebasDatabase and Application DeveloperAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Phillip BurtonDirector, Practice Manager and Computing ConsultantCommented:
Yes, you can.

You can use a recursive CTE to implement this.

However, I'm not quite clear - are you saying that the second example is what you have, and the first is what you want?

Now I'm looking at your link, the last example does use a recursive CTE to solve this.

The only thing which is missing is the spacing, and REPLICATE('. ', depth-1) would achieve that.
0
arnoldCommented:
Must it b e tsql, running from sql, you run into the possibility that the credentials with which mssql runs does not have rights on the path. using vbscript, powershell, c# might be simpler to navigate through the filesystem while building/updating the data in the sql.
0
Mike EghtebasDatabase and Application DeveloperAuthor Commented:
Must it b e tsql
No.

run into ... credentials
Personal computer, windows authenticated.

using vbscript, powershell
do not know much about either.

c# might be simpler
c# solution will be much better to get started.
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

arnoldCommented:
MSDN has the tutorial on enumerating drives in a system, directories within a drive, files within a directory.

http://msdn.microsoft.com/en-us/library/6yk7a1b0.aspx
Includes references to how to read input data from a file.

C# interaction with a ms sql database is also covered within MSDN
USing the System.Data.SqlClient;
Double check which .Net framework is installed on your (you could test for the installed, and adjust accordingly)
http://msdn.microsoft.com/en-us/library/System.Data.SqlClient%28v=vs.100%29.aspx
the locating of drives in a system, traversing and locating directories, files is part of the attributes. When you find a file, you can then check its extension ....
You could use recurisve function that will initially receive the Drive letter, then will enumerate the contents, when a directory is located, it will call itself with the new path

Process_recursice (path, ID)
i.e. if you have three partition C (1), D (2), E (3), and you start (0)
Fk suggests that you have multiple tables, i.e. drive table, directory table, and file table?
directory table will have a FK from the drive table and a one to many relationship to the file table.
You would want to break them up such that you can query the data using a join versus running a sequential query on the same table. once a file is located, to locate the directories above it, and then the drive
file a is in directory A which is in directory 1, which is within directory,
The directory table will be queried multiple times to collect the chain.

c: (1)
   folder 1
              folder I
                          folder i
                                     file1

You should consider what it is you are using it for in an attempt to optimize the retrieval of data.
0
DcpKingCommented:
Just a thought: you can use the command line dir command to generate a file recursively through the file system, and parse that...
0
Mike EghtebasDatabase and Application DeveloperAuthor Commented:
DcpKing,

Do you have a sample?
0
Scott PletcherSenior DBACommented:
You can use MS's undocumented stored proc:
xp_dirtree

It has three parameters:
1.  directory name
       -- such as 'C:\'
2.  depth
        --how many subfolder levels below the starting directory to display.  0 = list all subfolders
3.  flag to list only folders or (files and folders)
        --0=list only folders;1=list files and folders.

So, for your request:

IF OBJECT_ID('tempdb.dbo.#xp_dirtree') IS NOT NULL
    DROP TABLE dbo.#xp_dirtree
CREATE TABLE #xp_dirtree (
    ident int identity(1, 1) not null,
    subdirectory nvarchar(255) null,
    depth smallint null,
    is_file bit null
    )

INSERT INTO #xp_dirtree ( subdirectory, depth, is_file )
EXEC master.sys.xp_dirtree 'C:', 0, 1

SELECT *
FROM #xp_dirtree
--WHERE is_file = 1 --list only files, not directories
--ORDER BY ...
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
DcpKingCommented:
As an example of using the dir command, open a command prompt and you'll be at something like c:\users\eghtebas.
Enter "CD .." (without the quotes!) and press enter. Do this again and you should be at the root, C:\.
On my system I then entered
dir /?
to get some help. Once I had determined what I wanted I entered the following command (/S means "recurse through all subdirectories"; /B means "use bare format", so it skips all introduction and summary lines)

C:\>dir /S /B *.xls > results.txt

Open in new window


Then, when it was finished, I opened the new file called results.txt and found this:
C:\SSISMaster\Class Instructor Demo Schedule.xlsx
C:\SSISMaster\Data Files\Excel Example.xlsx
C:\SSISMaster\Data Files\USCustomers.xls
C:\Users\mai\Desktop\SSISMaster\Class Instructor Demo Schedule.xlsx
C:\Users\mai\Desktop\SSISMaster\Data Files\Excel Example.xlsx
C:\Users\mai\Desktop\SSISMaster\Data Files\USCustomers.xls
C:\Users\mai\Documents\Visual Studio 2010\Projects\2007Fint.xls
C:\Users\mai\Dropbox\Tola\patterns\Scarves and Shawls\PIShawl.xls

Open in new window


I can read these in to a SQL Server staging table very easily and then create whatever I want for data from them.
-------------------------------------------------
So, how to get this done.

You're using SQL Server 2005 or greater, so create a batch file to do your work. The file might read:
 
[embed=file 885772]del results.txt
dir /S /B c:\*.xls > results.txt

Open in new window


Then go to SSIS (it comes with your installation) and create a new project. Use an ExecuteProcess Task to run the batch file and then a flat file import to get the contents of the resulting file into the records of a table (see attached screengrab).

hth

Mike
ExecuteProcessTask.png
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.