Get List of Filenames from Main and Sub directories using SSIS

KuldeepReddy
KuldeepReddy used Ask the Experts™
on
Hi I have Multiple  Excel Files(2007) That are Stored in Different Directories in a main Directory Like and some times I will have Multiple sheets in each workbook
C:\Main\Sub1\name.xls
C:\Main\Sub1\name1.xls
C:\Main\Sub2\name2.xls
C:\Main\Sub3\name3.xls
C:\Main\Sub4\name4.xls
C:\Main\Sub4\name12.xls
And I have a Table (FileList) on My Database with Structure
ID, MainDir, SubDir, FileName,Sheetname, FileDate
What I have to do is, I need to Populate this table with all that information from the Files.
I need to do it in SSIS 2005
Any Help would be Appreciated.. Thanks in Advance..
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Reza RadConsultant, Trainer

Commented:
use a Foreach loop and set enumerator as file enumerator
add a dataflow task inside foreach loop
you need a variable to pass currect excel file to dataflow (in dataflow you must have excel source)
....
If you need more detail let me know...
the sample which PedroCGD suggested to you do it well.

Author

Commented:
Hi pedro, Thanks for the Solution. But in my case I have both .xls and .xlsx files. I can create a Excel connection manager for .xls file(Excel 97-2003) But for .xlsx(excel 2007) files I ma using OLEDB Connection for Access12. More over Each work sheet in Each Different file will go into a different table on Database. I have created a Package to Load data from these Files into different tables using individual Connections for each Worksheet. The Problem now is to read the name of each excel(.xls and .xlsx) files and insert them into a table. And My Excel files are saved like I have a Root Directory and It has Multiple Sub directories and Each Subdirectory will have one or more excel (.xls or .xlsx) files. I am Little confused in Looping over The Root Directory and into all the Sub folders and all the Files in the Subfolders to get the Filename and the Work Sheet name. My Destination table is Like
 (IDCol
 RootDir
SubDir
Filename
SheetName
DateCreated)
And My Root and Subdirectory Structures are like
C:\Main\Sub1\name.xlsx
C:\Main\Sub1\name1.xlsx
C:\Main\Sub2\name2.xlsx
C:\Main\Sub3\name3.xlsx
C:\Main\Sub4\name4.xlsx
C:\Main\Sub4\name12.xls
I am sorry In my Question I gave .xls for all the files.

Should you be charging more for IT Services?

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden using our free interactive tool and use it to determine the right price for your IT services. Start calculating Now!

Commented:
so for example what you need is to detect if each file is a XLS or XLSX, confirm!?
If true i can create a script task for you to do it.
regards,
Pedro

Author

Commented:
Hi Pedro, What I want to do is just Loop over a Root/Main Directory and Traverse through all the sub folders it has and get all the Excel File names and the Sheet names into a table on my database. I just want to get only the File name and the SheetName along with the Directory and Sub directory information. i am just pasting some sample data that I want to go into my destination table.
MyDestinationTable.JPG

Author

Commented:
Sorry, that image is not visible. I just Want to get this Information from all the files and load into the destination table.
MyDestTable.JPG

Commented:
You know the folders and subfolders structure before running the package, correct?
What you think of store these structure in a textfile or other source and then use a ForEac Loop to get each file in each path!?

It's acceptable for you? If true I can follow this approach and create an example for you!
Regards,
Pedro

Author

Commented:
Hi Pedro, Yes for sure I know the Structure of all the Folders and Subfolders.

Commented:
And you dont need the data inside of each file? Only worksheet names and date created!?
regards,
Pedro

Author

Commented:
Yes, I just need the Names of Files and the Worksheets
Commented:
Dear friend,
Here you have your package. I hope it resolve your issue.

Now... Add the attached package to a SSIS Solution (rename to DTSX)
Execute the create table and create procedure to create a test sample
update the Paths.txt where is stored the structure
Update in SSIS package the connection to SQL Server

Helped?!
regards,
pedro
SSIS-Interface.JPG
Finalresult.JPG
Package-EE130-dtsx.txt
Paths.txt
Create-Table-sql.txt
Create-Procedure-sql.txt

Commented:
improvements!?!

Author

Commented:
Hi Pedro, In the Above package you have created an Excel Connection manager and thats good is there a way that I can create an Excel Connection for a Excel-2007 File(Excel 2007 file will be saved in .xlsx format and Excel connection manager is not recognising this format).

Commented:
If the file is an Excel 2007 you can rename the extension fom xls to xlsx that works also good.
Do you tested?!

Author

Commented:
Hi Pedro, I have tested and tried to give .xlsx file in the Excel Connection manager and that is not accepting the .xlsx format. I am Here with Attaching a Zip file which has all my exact Folder and file structures as well as a notepad that has all my locations specified.
LEADS.zip

Commented:
What I wanted to wrote as:

"If the file is an Excel 2003 you can rename the extension fom xls to xlsx that works also good."

If you use 2003 and 2007 you are using diferent providers...

Provider=Microsoft.Jet.OLEDB.4.0;Data Source=F:\OUTROS\BLOG\EXPERTS EXCHANGE\SSIS_EE130\LEADS\LEADS\Davis_Content\Davis - Content.xls;Extended Properties="EXCEL 8.0;HDR=YES";

Author

Commented:
All My Files are of 2007 Format that is in .xlsx format

Author

Commented:
And Moreover If I give Connection string for One File/ Folder What about the other folders AS you might have seen from my Attachment  that there is one Root Folder (Leads) and in that Folder I have 10 sub folders and in each sub folder I have one file. If I give my connection string to only one file will that help. Sorry If I am confusing you.. I am not that experienced with SSIS.

Commented:
in teh for each loop container you have the option "Transpose subfolder"  (I dont remember if it's this exact name... check it...
or
And you can add in the text file the entire paths
C:\MAIN
C:\MAIN\Folder1
C:\MAIN\Folder2
...
Helped?

Author

Commented:
Hi Pedro, I have tried retesting the Package and the Package is Failing at the Script TASK and it is throwing an error msg like

Error: The Script returned a failure result.
Task SCR Check File Attributes failed

Commented:
ok... attach the package and I'll take a look tomorrow... you are a difficult customer..:-)
Regards,
pedro

Author

Commented:
Hi Pedro, I am very much sorry for the Trouble I am giving you. I am Attaching a Zip File Here In that I have All the Folder and Sub Folder Structures.

Leads_GetFileNames.txt --> (Rename it to .dtsx) this is the Actual Package
Paths --> Text file to Store the Paths
Create-Table-sql --> Table Structure to send the filenames (I have added an extra column SUBFOLDER)
Create-Procedure-sql --> Stored Procedure to Insert Data.
LEADS.zip

Commented:
ok... I'll take a look today or tomorrow morning depending on my free time.
regards,
Pedro

Author

Commented:
Thank you !

Commented:
Can I do it using SSIS 2008?

Author

Commented:
In my company they dont use SSIS 2008, They are still using 2005.

Author

Commented:
What will be the Difference between 2005 & 2008 in this case.

Commented:
because my laptop has SSIS 2008 with Excel 2007 and my office SSIS 2005 with Excel 2003... and you are using SSIS 2005 with Excel 2007...:-) but I'll look for a virtual machine...

Author

Commented:
Hi Pedro I am very much Thankful to you for the valuable time you are spending on this problem.

Author

Commented:
Hi Pedro, I have just modified My Files and File Structures a bit and there is much improvement. But the Only Problem now is with reading the Sheet Names in the "DFT Get WorkSheet names" Where the Script Component is supposed to read the SheetNames in Each Workbook But Insted It is Reading the Same WorkSheet Names for all the different Files.I am Attaching the Package and the File Structures Here, Advise me Please..


Leads_GetFileNames.txt --> (Rename it to .dtsx) this is the Actual Package
Paths --> Text file to Store the Paths
Create-Table-sql --> Table Structure to send the filenames and Worksheet Names
Create-Procedure-sql --> Stored Procedure to Insert Data.
Leads.zip

Author

Commented:
Thanks a lot Pedro This Solution Worked fine.

Commented:
Sorry for the delay... I was preparing a virtual machine for testing the solution for you. But If you already found the solution I'm very happy for you!
Regards,
Pedro

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial