Solved

Recursive file and folder audit... T-SQL

Posted on 2014-11-29
8
678 Views
Last Modified: 2014-12-09
Starting with the following files in drive C:

C:
  Folder_1
               FolderA
                      a.xls
                      b.xls
             c.xls  <-- this file is in Folder_1
 Folder_2
            d.xls
            e.xls

Note: There could be many levels of subfolders. I am trying to read file names and folder names to enter them in a table like:

tblFIleAndFolders
=============================
ID_PK           Name               ID_FK            FileYN
1                C:                      null
     2          Folder_1                  1    <-- meaning it is in C:
     3          c.xls                         2                    Yes
     4          FolderA                   2
     5          a.xls                        4                    Yes
     6          b.xls                        4                    Yes
     7         Folder_2                  1
     8          d.xls                        7                    Yes
     9          e.xls                        7                    Yes

Question: Could this be done recursively to include all .xls files, folders, and subfolders (regardless it has .xls or not)?

Order of entries is not important. I will, at some point, expand this to include .doc, .png, etc.

There is an undocumented PROC which does something similar. I wonder if it could be modified to list select files and folders, .xls for example which we can pass as a parameter, in this table as shown above. This proc is discussed at: http://www.kodyaz.com/articles/file-folder-list-xp_dirtree-sql-server-stored-procedure-recursive-cte.aspx
0
Comment
Question by:Mike Eghtebas
  • 2
  • 2
  • 2
  • +2
8 Comments
 
LVL 24

Expert Comment

by:Phillip Burton
ID: 40472433
Yes, you can.

You can use a recursive CTE to implement this.

However, I'm not quite clear - are you saying that the second example is what you have, and the first is what you want?

Now I'm looking at your link, the last example does use a recursive CTE to solve this.

The only thing which is missing is the spacing, and REPLICATE('. ', depth-1) would achieve that.
0
 
LVL 76

Expert Comment

by:arnold
ID: 40472515
Must it b e tsql, running from sql, you run into the possibility that the credentials with which mssql runs does not have rights on the path. using vbscript, powershell, c# might be simpler to navigate through the filesystem while building/updating the data in the sql.
0
 
LVL 33

Author Comment

by:Mike Eghtebas
ID: 40472533
Must it b e tsql
No.

run into ... credentials
Personal computer, windows authenticated.

using vbscript, powershell
do not know much about either.

c# might be simpler
c# solution will be much better to get started.
0
 
LVL 76

Assisted Solution

by:arnold
arnold earned 150 total points
ID: 40472704
MSDN has the tutorial on enumerating drives in a system, directories within a drive, files within a directory.

http://msdn.microsoft.com/en-us/library/6yk7a1b0.aspx
Includes references to how to read input data from a file.

C# interaction with a ms sql database is also covered within MSDN
USing the System.Data.SqlClient;
Double check which .Net framework is installed on your (you could test for the installed, and adjust accordingly)
http://msdn.microsoft.com/en-us/library/System.Data.SqlClient%28v=vs.100%29.aspx
the locating of drives in a system, traversing and locating directories, files is part of the attributes. When you find a file, you can then check its extension ....
You could use recurisve function that will initially receive the Drive letter, then will enumerate the contents, when a directory is located, it will call itself with the new path

Process_recursice (path, ID)
i.e. if you have three partition C (1), D (2), E (3), and you start (0)
Fk suggests that you have multiple tables, i.e. drive table, directory table, and file table?
directory table will have a FK from the drive table and a one to many relationship to the file table.
You would want to break them up such that you can query the data using a join versus running a sequential query on the same table. once a file is located, to locate the directories above it, and then the drive
file a is in directory A which is in directory 1, which is within directory,
The directory table will be queried multiple times to collect the chain.

c: (1)
   folder 1
              folder I
                          folder i
                                     file1

You should consider what it is you are using it for in an attempt to optimize the retrieval of data.
0
PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

 
LVL 16

Expert Comment

by:DcpKing
ID: 40473036
Just a thought: you can use the command line dir command to generate a file recursively through the file system, and parse that...
0
 
LVL 33

Author Comment

by:Mike Eghtebas
ID: 40473041
DcpKing,

Do you have a sample?
0
 
LVL 69

Accepted Solution

by:
ScottPletcher earned 250 total points
ID: 40474100
You can use MS's undocumented stored proc:
xp_dirtree

It has three parameters:
1.  directory name
       -- such as 'C:\'
2.  depth
        --how many subfolder levels below the starting directory to display.  0 = list all subfolders
3.  flag to list only folders or (files and folders)
        --0=list only folders;1=list files and folders.

So, for your request:

IF OBJECT_ID('tempdb.dbo.#xp_dirtree') IS NOT NULL
    DROP TABLE dbo.#xp_dirtree
CREATE TABLE #xp_dirtree (
    ident int identity(1, 1) not null,
    subdirectory nvarchar(255) null,
    depth smallint null,
    is_file bit null
    )

INSERT INTO #xp_dirtree ( subdirectory, depth, is_file )
EXEC master.sys.xp_dirtree 'C:', 0, 1

SELECT *
FROM #xp_dirtree
--WHERE is_file = 1 --list only files, not directories
--ORDER BY ...
0
 
LVL 16

Assisted Solution

by:DcpKing
DcpKing earned 100 total points
ID: 40475111
As an example of using the dir command, open a command prompt and you'll be at something like c:\users\eghtebas.
Enter "CD .." (without the quotes!) and press enter. Do this again and you should be at the root, C:\.
On my system I then entered
dir /?
to get some help. Once I had determined what I wanted I entered the following command (/S means "recurse through all subdirectories"; /B means "use bare format", so it skips all introduction and summary lines)

C:\>dir /S /B *.xls > results.txt

Open in new window


Then, when it was finished, I opened the new file called results.txt and found this:
C:\SSISMaster\Class Instructor Demo Schedule.xlsx
C:\SSISMaster\Data Files\Excel Example.xlsx
C:\SSISMaster\Data Files\USCustomers.xls
C:\Users\mai\Desktop\SSISMaster\Class Instructor Demo Schedule.xlsx
C:\Users\mai\Desktop\SSISMaster\Data Files\Excel Example.xlsx
C:\Users\mai\Desktop\SSISMaster\Data Files\USCustomers.xls
C:\Users\mai\Documents\Visual Studio 2010\Projects\2007Fint.xls
C:\Users\mai\Dropbox\Tola\patterns\Scarves and Shawls\PIShawl.xls

Open in new window


I can read these in to a SQL Server staging table very easily and then create whatever I want for data from them.
-------------------------------------------------
So, how to get this done.

You're using SQL Server 2005 or greater, so create a batch file to do your work. The file might read:
 
[embed=file 885772]del results.txt
dir /S /B c:\*.xls > results.txt

Open in new window


Then go to SSIS (it comes with your installation) and create a new project. Use an ExecuteProcess Task to run the batch file and then a flat file import to get the contents of the resulting file into the records of a table (see attached screengrab).

hth

Mike
ExecuteProcessTask.png
0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Join & Write a Comment

How to leverage one TLS certificate to encrypt Microsoft SQL traffic and Remote Desktop Services, versus creating multiple tickets for the same server.
In this article we will get to know that how can we recover deleted data if it happens accidently. We really can recover deleted rows if we know the time when data is deleted by using the transaction log.
This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function

744 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now