Link to home
Start Free TrialLog in
Avatar of KuldeepReddy
KuldeepReddy

asked on

Get List of Filenames from Main and Sub directories using SSIS

Hi I have Multiple  Excel Files(2007) That are Stored in Different Directories in a main Directory Like and some times I will have Multiple sheets in each workbook
C:\Main\Sub1\name.xls
C:\Main\Sub1\name1.xls
C:\Main\Sub2\name2.xls
C:\Main\Sub3\name3.xls
C:\Main\Sub4\name4.xls
C:\Main\Sub4\name12.xls
And I have a Table (FileList) on My Database with Structure
ID, MainDir, SubDir, FileName,Sheetname, FileDate
What I have to do is, I need to Populate this table with all that information from the Files.
I need to do it in SSIS 2005
Any Help would be Appreciated.. Thanks in Advance..
Avatar of PedroCGD
PedroCGD
Flag of Portugal image

Avatar of Reza Rad
use a Foreach loop and set enumerator as file enumerator
add a dataflow task inside foreach loop
you need a variable to pass currect excel file to dataflow (in dataflow you must have excel source)
....
If you need more detail let me know...
the sample which PedroCGD suggested to you do it well.
Avatar of KuldeepReddy
KuldeepReddy

ASKER

Hi pedro, Thanks for the Solution. But in my case I have both .xls and .xlsx files. I can create a Excel connection manager for .xls file(Excel 97-2003) But for .xlsx(excel 2007) files I ma using OLEDB Connection for Access12. More over Each work sheet in Each Different file will go into a different table on Database. I have created a Package to Load data from these Files into different tables using individual Connections for each Worksheet. The Problem now is to read the name of each excel(.xls and .xlsx) files and insert them into a table. And My Excel files are saved like I have a Root Directory and It has Multiple Sub directories and Each Subdirectory will have one or more excel (.xls or .xlsx) files. I am Little confused in Looping over The Root Directory and into all the Sub folders and all the Files in the Subfolders to get the Filename and the Work Sheet name. My Destination table is Like
 (IDCol
 RootDir
SubDir
Filename
SheetName
DateCreated)
And My Root and Subdirectory Structures are like
C:\Main\Sub1\name.xlsx
C:\Main\Sub1\name1.xlsx
C:\Main\Sub2\name2.xlsx
C:\Main\Sub3\name3.xlsx
C:\Main\Sub4\name4.xlsx
C:\Main\Sub4\name12.xls
I am sorry In my Question I gave .xls for all the files.

so for example what you need is to detect if each file is a XLS or XLSX, confirm!?
If true i can create a script task for you to do it.
regards,
Pedro
Hi Pedro, What I want to do is just Loop over a Root/Main Directory and Traverse through all the sub folders it has and get all the Excel File names and the Sheet names into a table on my database. I just want to get only the File name and the SheetName along with the Directory and Sub directory information. i am just pasting some sample data that I want to go into my destination table.
MyDestinationTable.JPG
Sorry, that image is not visible. I just Want to get this Information from all the files and load into the destination table.
MyDestTable.JPG
You know the folders and subfolders structure before running the package, correct?
What you think of store these structure in a textfile or other source and then use a ForEac Loop to get each file in each path!?

It's acceptable for you? If true I can follow this approach and create an example for you!
Regards,
Pedro
Hi Pedro, Yes for sure I know the Structure of all the Folders and Subfolders.
And you dont need the data inside of each file? Only worksheet names and date created!?
regards,
Pedro
Yes, I just need the Names of Files and the Worksheets
ASKER CERTIFIED SOLUTION
Avatar of PedroCGD
PedroCGD
Flag of Portugal image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
improvements!?!
Hi Pedro, In the Above package you have created an Excel Connection manager and thats good is there a way that I can create an Excel Connection for a Excel-2007 File(Excel 2007 file will be saved in .xlsx format and Excel connection manager is not recognising this format).
If the file is an Excel 2007 you can rename the extension fom xls to xlsx that works also good.
Do you tested?!
Hi Pedro, I have tested and tried to give .xlsx file in the Excel Connection manager and that is not accepting the .xlsx format. I am Here with Attaching a Zip file which has all my exact Folder and file structures as well as a notepad that has all my locations specified.
LEADS.zip
What I wanted to wrote as:

"If the file is an Excel 2003 you can rename the extension fom xls to xlsx that works also good."

If you use 2003 and 2007 you are using diferent providers...

Provider=Microsoft.Jet.OLEDB.4.0;Data Source=F:\OUTROS\BLOG\EXPERTS EXCHANGE\SSIS_EE130\LEADS\LEADS\Davis_Content\Davis - Content.xls;Extended Properties="EXCEL 8.0;HDR=YES";
All My Files are of 2007 Format that is in .xlsx format
And Moreover If I give Connection string for One File/ Folder What about the other folders AS you might have seen from my Attachment  that there is one Root Folder (Leads) and in that Folder I have 10 sub folders and in each sub folder I have one file. If I give my connection string to only one file will that help. Sorry If I am confusing you.. I am not that experienced with SSIS.
in teh for each loop container you have the option "Transpose subfolder"  (I dont remember if it's this exact name... check it...
or
And you can add in the text file the entire paths
C:\MAIN
C:\MAIN\Folder1
C:\MAIN\Folder2
...
Helped?
Hi Pedro, I have tried retesting the Package and the Package is Failing at the Script TASK and it is throwing an error msg like

Error: The Script returned a failure result.
Task SCR Check File Attributes failed
ok... attach the package and I'll take a look tomorrow... you are a difficult customer..:-)
Regards,
pedro
Hi Pedro, I am very much sorry for the Trouble I am giving you. I am Attaching a Zip File Here In that I have All the Folder and Sub Folder Structures.

Leads_GetFileNames.txt --> (Rename it to .dtsx) this is the Actual Package
Paths --> Text file to Store the Paths
Create-Table-sql --> Table Structure to send the filenames (I have added an extra column SUBFOLDER)
Create-Procedure-sql --> Stored Procedure to Insert Data.
LEADS.zip
ok... I'll take a look today or tomorrow morning depending on my free time.
regards,
Pedro
Thank you !
Can I do it using SSIS 2008?
In my company they dont use SSIS 2008, They are still using 2005.
What will be the Difference between 2005 & 2008 in this case.
because my laptop has SSIS 2008 with Excel 2007 and my office SSIS 2005 with Excel 2003... and you are using SSIS 2005 with Excel 2007...:-) but I'll look for a virtual machine...
Hi Pedro I am very much Thankful to you for the valuable time you are spending on this problem.
Hi Pedro, I have just modified My Files and File Structures a bit and there is much improvement. But the Only Problem now is with reading the Sheet Names in the "DFT Get WorkSheet names" Where the Script Component is supposed to read the SheetNames in Each Workbook But Insted It is Reading the Same WorkSheet Names for all the different Files.I am Attaching the Package and the File Structures Here, Advise me Please..


Leads_GetFileNames.txt --> (Rename it to .dtsx) this is the Actual Package
Paths --> Text file to Store the Paths
Create-Table-sql --> Table Structure to send the filenames and Worksheet Names
Create-Procedure-sql --> Stored Procedure to Insert Data.
Leads.zip
Thanks a lot Pedro This Solution Worked fine.
Sorry for the delay... I was preparing a virtual machine for testing the solution for you. But If you already found the solution I'm very happy for you!
Regards,
Pedro