A couple of weeks ago, my client requested me to implement a SSIS package that allows them to download their files from a FTP server and archives them. Microsoft SSIS is the powerful tool which allows us to proceed multiple files at same time even we can use SSIS to download files from FTP server.
Load multiple files in SSIS
Scenario
A couple of weeks ago, my client requested me to implement a SSIS package that allows them to download their files from a FTP server and archives them. Agency data, Invoice data and Sales data are uploaded to FTP server everyday. They need to download them and import into their warehouse database.
Approach
To implement this requirement, I used two main SSSI tasks which are FTP Task and Foreach Loop Container
FTP Task: is used to download files from FTP server
Foreach Loop Container: is used to load each downloaded file
Development environment
SQL Server 2014 Developer
SQL Server Data Tools for Visual Studio 2013
Data files
I created 3 files for Agency, Invoice and Sales data. Each file has 3 columns (Col1, Col2 and Col3)
Agency
Col1|Col3|Col3
A|1|3
Invoice
Col1|Col3|Col3
A|1|3
Sales
Col1|Col3|Col3
A|1|3
FTP Server
For demonstration purpose, I used DriveHQ service that supports FTP service. I created AAA_ArchivedData folder and then uploaded Agency data, Invoice data and Sales data.
I suggest you should create a free member account because you use them to configure FTP Task for next step
Create staging tables
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[Agency]') AND type in (N'U'))BEGINCREATE TABLE [dbo].[Agency]( [Col1] [varchar](50) NULL, [Col2] [varchar](50) NULL, [Col3] [varchar](50) NULL) ON [PRIMARY]
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[Invoice]') AND type in (N'U'))BEGINCREATE TABLE [dbo].[Invoice]( [Col1] [varchar](50) NULL, [Col2] [varchar](50) NULL, [Col3] [varchar](50) NULL) ON [PRIMARY]
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[Sales]') AND type in (N'U'))BEGINCREATE TABLE [dbo].[Sales]( [Col1] [varchar](50) NULL, [Col2] [varchar](50) NULL, [Col3] [varchar](50) NULL) ON [PRIMARY]
Create a new package named Load_Multiple_CSV_Files.dtsx
Create variables
ArchivedFolder: F:\SSIS-Practice\Load_Multiple_CSV_Files\ArchivedFolder where to store archived files
FileConnection
FileType
OLEDBConnection: connection string to SQL Database Server
Value: Data Source=HSSSC1PCL01198\SQLSERVER2014; Initial Catalog=Demo;Provider=SQLNCLI11.1;Integrated Security=SSPI;
Data Source: database server name Initial Catalog: database name Provider: the driver to connect to SQL Database Server. You need to change the value to meet your environment. I use SQLNCLI11.1 because SQL Server Engine is 2014. If your version is lower than SQL 2012 version, it should be 10.0 / 10.1 Integrated Security: means that I use Window Authentication mode while connection to SQL Database Server.
SourceFolder: F:\SSIS-Practice\Load_Multiple_CSV_Files\DataFile where to store downloaded files from FTP server
Create connections on Connection Manager panel
Create OLEDBConnection and assign OLEDBConnection variable to ConnectionString property
Create 3 connections for CSV files ( Agency, Invoice and Sales) and assign FileConnection variable to ConnectionString property
Create ArchivedFolder connection and assign ArchivedFolder variable to ConnectionString property
Drag and drop FTP Task component
Double click to open FTP Task Editor
Create new FTP connection to open FTP Connection Manager Editor
Server port: 21 by default. The FTP site listens to this port. If you us another port, you need to make sure that it is not blocked by firewall.
Credentials
User name: the account that you registered with DriveHQ
Password: the password that you registered with DriveHQ
Options
Check on Use Passive mode: it means that the client side initiates the data connection. If you un-check, it means you use Active mode. In Active mode, the server side initiates the data connection and it is DriveHQ in this case.
Click on Test Connection to make sure you can connect to DriveHQ successfully
Click OK to come back FTP Task Editor
Go to File Transfer tab
Local Parameters
IsLocalPathVariable: True
LocalVariable: User::SourceFolder. We specify where to download data files from DriveHQ
Operation
Operation: Receive files. There are a lot of actions such as Send files, Receive files, Create local directory,... In this case, we want to download the files so we use Receive files
IsTransferAscii: True accepts Ascii code
Remote Parameters
IsRemotePathVariable: False
RemotePath: /AAA_ArchivedData/*.txt. The option specifies where we download data files on DriveHQ. The AAA_ArchivedData is created above step and I also uploaded the testing files. *.txt specifies that we download all files with txt extension
Click OK
Drag and drop Execute SQL Task component
This task is to truncate the tables that created before
Drag and drop Foreach Loop Container component
Double click to open Foreach Loop Editor
Go to Collection tab
Enumerator: Foreach File Enumerator
Add expression Directory: @[User::SourceFolder]
Enumerator Configuration
Files: *.txt it means to load text files only
Check on Fully qualified
Go to Variable Mappings
User::FileConnection
Index = 0
Click OK
Drag and drop Script Task component into Foreach Loop and name it as Parse File Name
This task is used to detect what kind of file type (Agency, Invoice or Sales)
Double click to open Script Task Editor
ReadOnlyVariables: User::FileConnection
ReadWriteVariables: User::FileType
Click Edit script... and enter the following code
public void Main() { // TODO: Add your code here string filepath = Dts.Variables["FileConnection"].Value.ToString(); string filename = Path.GetFileName(filepath); if (filename.ToUpper().Contains("INVOICE")) { Dts.Variables["FileType"].Value = "Invoice"; } else if ((filename.ToUpper().Contains("AGENCY"))) { Dts.Variables["FileType"].Value = "Agency"; } else if ((filename.ToUpper().Contains("SALES"))) { Dts.Variables["FileType"].Value = "Sales"; } else Dts.Variables["FileType"].Value = "Unknown"; Dts.TaskResult = (int)ScriptResults.Success; }
Click OK
Drag and drop Data Flow Task component and name it as Load Invoice
Make a connection between Parse File Name script task and Load Invoice
Double click on the connection to open Precedence Constraint Editor
Evaluation operation: Expression
Expression: @[User::FileType]=="Invoice"
Click on Logical OR....
Click OK
Double click on Load Invoice data flow
Drag and drop Flat File Source component and name it as Invoice File
Flat file connection manager: Invoice
Columns: all columns
Drag and drop OLEDB Destination and name is as Invoice Destination
OLEDB connection manger: OLEDBConnection
Data access view: Table and view - fast load
Name of the table or the view: Invoice
Mapping with flat file source
Drag and drop Data Flow Task component and name it as Load Agency
Repeat all steps which we implement for Invoice but change Expression of Precedence Constraint Editor to @[User::FileType]=="Agency"
Repeat all steps which we implement for Invoice but change Expression of Precedence Constraint Editor to @[User::FileType]=="Sales"
Drag and drop File System Task component
Double click to open File System Task Editor
IsDestinationPathVariable:False
DestinationConnection: ArchivedFolder. This is the folder connection we created before
OverwriteDestination: False
Operation: Move File. There are a lot of actions but we use Move File to archive data files after proceeding them successfully
IsSourcePathVariable: True
SourceConnection: User::FileConnection
Execute SSIS package
Finally, our package has been done and we need to execute it to see how it works?
Check the folder F:\SSIS-Practice\Load_Multiple_CSV_Files\DataFile to make sure there are no data files
Check the folder F:\SSIS-Practice\Load_Multiple_CSV_Files\ArchivedFolder to make sure there are no data files
Query data from Agency, Invoice and Sales tables to make sure that they are empty
Executing the package
Verifying the result after executing the package
Excellent, data imported and data files were also archived to the folder as our expectation. In my example, the archived folder is local, you can still use FTP Task component to upload the files to anywhere.
Conclusions
In real life, there are a lot of scenarios that will require you to combine SSIS components together. Hopefully, you will have an overview how to use FTP Task and Foreach Loop Container to proceed multiple files at same time through my example.
I have the same scenario, i followed all the steps and it executed excellent. Only one thing i missed is i'm not getting the Archive files with dates
while moving from source to Archive folder. Will you please help me to get the right thing how you got the Filename with date in Archive folder.
Comments (2)
Commented:
I have the same scenario, i followed all the steps and it executed excellent. Only one thing i missed is i'm not getting the Archive files with dates
while moving from source to Archive folder. Will you please help me to get the right thing how you got the Filename with date in Archive folder.
Thanks a lot in advance
Author
Commented:Can you describe more detail about your expectation? Do you mean you want to add DATE to name of files before moving them to Archive folder?
Cheers,