Solved

SSIS - Import Excel file

Posted on 2007-04-11
14
1,514 Views
Last Modified: 2013-11-30
I want to import an Excel file to a table in Sql server with the SSIS
The data in the Excel file is organized as a matrix:
                       Sep          Oct           Nov               Dec
 Target         --------       -------        --------         -------
---------  
Target1        20             50                70                 10
Target2       30             50                20                   30
Taget3        50            70                90                   10

The Table in the Database Table1 has the following fields:
Month,    TargetName,         amount
In the SSIS when I declare the Excel source it automatically arrange the fields not as I want.
How do I control the fields in the Excel Source to transfer it the the OLEDB Sql Server destination?
Thankyou      
0
Comment
Question by:ANAT2403
  • 8
  • 6
14 Comments
 
LVL 13

Expert Comment

by:Megan Brooks
Comment Utility
If the destination is a table, the order of the columns in the destination is determined by the table definition. What kind of destination are you using?

Data flow components tend to list the most recently added columns last, but this shouldn't matter because the column oder in the lists does not have any special meaning. When you are viewing a list of columns, you can usually click on the column heading to sort the names.
0
 

Author Comment

by:ANAT2403
Comment Utility
Hi,
I don't understanf what you are saying and I don't see an answer to my problem.
I repeat the question with some corrections:

I want to import (source) an Excel file to  (destination) a table in Sql server with the SSIS
The data in the Excel file is organized as a matrix:
                       Sep          Oct           Nov               Dec
 Target         --------       -------        --------         -------
---------  
Target1        20             50                70                 10
Target2       30             50                20                   30
Taget3        50            70                90                   10

The Table in the Database Table1 has the following fields:
Month,    TargetName,         amount
In the SSIS when I declare the Excel source it automatically arrange the fields not as I want.
there is a problem because I have a matrix.
I want to get the following values in table1:
Month,    TargetName,   amount
-------     ---------------    -----------
sep          target1              20
oct            target1             50
nov            target1           70
dec            target1           10
sep            target2          30
oct           target2          50
nov         target2         20
dec         target2          30
sep         target3         50
oct           target3       70
nov          target3       90
dec         target3       10

How do I control the fields in the Excel Source to transfer it the the OLEDB Sql Server destination?
0
 
LVL 13

Expert Comment

by:Megan Brooks
Comment Utility
I see now. The Excel connection doesn't provide for doing this kind of transformation, but I think the work could be done in a script task that writes a extra row of data for each incoming row, for each additional column beyond the first.

If, however, you are going to run this package over and over again and the number of columns in the spreadsheet can change with each run, I think you would have to edit the package and update the data source and transformation task each time to reflect the change, unless you wanted to actually build or finalize the package at run time, which would be quite a bit more work.

Is this something that will only be run once, or is the number of input columns constant (for example, twelve columns, January thru December)?
0
 

Author Comment

by:ANAT2403
Comment Utility
Hi,
When you say a script task what do you mean can you give an exam[le or link to explanation about it?
Do I still have to consider the source as an Excel file or to take something else as a source?
Try to give me me links or examples to solve my problems.
I might run this more then once
0
 
LVL 13

Expert Comment

by:Megan Brooks
Comment Utility
I meant to say "script transformation." The data flow script component can be used as a transformation. You could insert one between the Excel source and the SQL destination. That works well as long as the source columns don't change. The script would scan the columns in each input record other than "Target" and write one output record for each "month" input column, and would repeat that until there were no more input records.

If that is going to work for you, I can provide an example of the script code. First, though, I am trying to find out if that is a practical solution in your case. If you are going to run the package repeatedly with worksheets containing different numbers of columns or different column names, you will have to manually edit the package each time there is a change in order to avoid validation failures. If you also need it to run without such manual intervention then we would need to think about solving the problem a different way.
0
 
LVL 13

Expert Comment

by:Megan Brooks
Comment Utility
Oh, script data flow components are described at http://msdn2.microsoft.com/en-us/library/ms137640.aspx.
0
 

Author Comment

by:ANAT2403
Comment Utility
Hi,
My source column will not change but I have to go in the Excel file to specific row and column.
An example will help me a lot.
The question is how can I refer to specific row in the Excel file or how do I know in which row I am?
Thanks a lot
Anat
0
Top 6 Sources for Identifying Threat Actor TTPs

Understanding your enemy is essential. These six sources will help you identify the most popular threat actor tactics, techniques, and procedures (TTPs).

 
LVL 13

Expert Comment

by:Megan Brooks
Comment Utility
A script transformation is called once for each row supplied by its input. The data for each of the columns can be found as properties in an object called "Row". If you sent the data from your example spreadsheet to a script task, the columns would appear as properties named Row.Target, Row.Sep, Row.Oct, Row.Nov and Row.Dec. You could the column data to determine which row you are in, or you could count the rows.
0
 

Author Comment

by:ANAT2403
Comment Utility
Hi,
This sounds very interesting. can you give me an example on how to count rows and  how to determine in which row I am?
0
 
LVL 13

Accepted Solution

by:
Megan Brooks earned 500 total points
Comment Utility
Here is the code from a simple script transformation that maintains a row count (Task_ID) and also inserts a new column (Batch_ID) into the data:
Public Class ScriptMain
    Inherits UserComponent
    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        Static Dim Batch_ID As Guid = New Guid(Variables.BatchID)
        Static Dim Task_ID As Integer = Variables.idStart
        Row.BatchID = Batch_ID
        Task_ID += 1
        Row.TaskIDident = Task_ID
    End Sub
End Class

The starting value for Task_ID comes from a variable, but it could just as easily be initialized to zero, so that it reflects the number of the current row.

I created a simple package that reads your sample data above and writes an Excel file that looks like this:
Month      TargetName      Amount
Sep      Target1      20
Oct      Target1      50
Nov      Target1      70
Dec      Target1      10
Sep      Target2      30
Oct      Target2      50
Nov      Target2      20
Dec      Target2      30
Sep      Taget3      50
Oct      Taget3      70
Nov      Taget3      90
Dec      Taget3      10

The script code looks like this:
Public Class ScriptMain
    Inherits UserComponent
    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        Output1Buffer.AddRow()
        Output1Buffer.TargetName = Row.Target
        Output1Buffer.Month = "Sep"
        Output1Buffer.Amount = Row.Sep
        Output1Buffer.AddRow()
        Output1Buffer.TargetName = Row.Target
        Output1Buffer.Month = "Oct"
        Output1Buffer.Amount = Row.Oct
        Output1Buffer.AddRow()
        Output1Buffer.TargetName = Row.Target
        Output1Buffer.Month = "Nov"
        Output1Buffer.Amount = Row.Nov
        Output1Buffer.AddRow()
        Output1Buffer.TargetName = Row.Target
        Output1Buffer.Month = "Dec"
        Output1Buffer.Amount = Row.Dec
    End Sub
End Class

This script references the columns by name, because that was all I had time for. I created a data flow with Excel source and destination, and I placed a script transformation component between them in the flow. In the script component under Inputs and Outputs I created a second output, Output1, and added the columns Month (unicode string), TargetName (unicode string), and Amount (double) to it. Doing that creates the Output1Buffer object for the script to use.

The script component uses a second output because that's the only way I could think of to write more than one row of output for one row of input. The default output data, which in this case would be identical to the input, is discarded. When you first connect the script component to the destination, the designer prompts for which output to use, and you must select Output1, the added output, not Output0.
0
 

Author Comment

by:ANAT2403
Comment Utility
Hi rscowden,
First I would like to thankyou a lot for the examples you sent me and the time you spent for this.
The example seems to answer my problems.
The second part worked for me excellent and in the first part I had small problems.
The problems were:
I declared the 2 variables in SSIS variables but when I called the Variables.BatchID they were not recognized. Do I have to write this in a different way?
Did you mean that I'll have 2 packages with 2 scripts or that everything will be in the same script.
Thanks alot and ofcourse I'll give you all the points soon.
Anat
0
 
LVL 13

Expert Comment

by:Megan Brooks
Comment Utility
You're welcome!

The Batch_ID column isn't important--I left it in when I copied the example from an actual SSIS package. To be included in the output, both new columns would have to be added to the column collection of the default output, if they didn't exist in the source. The Task_ID is an int (4-byte signed), while the Batch_ID is a unique identifier. If you only need the row count to use within the script and don't need it to be part of the output, you can do that as well.

The important thing is that the variable that maintians the count is declared Static, so that it is only initialized once. It retains its value between rows.

The two examples were to demonstrate two different things--keeping a count of rows, and writing several rows out for each row in. If all you need to do is to reformat the data, the second example is all you should need. If you want to do other things that involve knowing the row number, the first example shows a way to obtain it.
0
 

Author Comment

by:ANAT2403
Comment Utility
Hi rscowden,
Thanks again.
Just one small question I want to understand for the future.
If I declare variables like here,  in the SSIS variables as a package scope and then I want to call
them as you did   'Variables.BatchID'   or 'Variables.idStart'
This command didn't work. Do you know why? How do I call the variables that I declared?
Thankyou
Anat
0
 
LVL 13

Expert Comment

by:Megan Brooks
Comment Utility
Be sure to list the variables as read/write or read-only as appropriate on the Script page of the script task editor. You can also lock variables using Dts.VariableDispenser, but there is usually no reason to do it that way.
0

Featured Post

Comprehensive Backup Solutions for Microsoft

Acronis protects the complete Microsoft technology stack: Windows Server, Windows PC, laptop and Surface data; Microsoft business applications; Microsoft Hyper-V; Azure VMs; Microsoft Windows Server 2016; Microsoft Exchange 2016 and SQL Server 2016.

Join & Write a Comment

Suggested Solutions

Nowadays, some of developer are too much worried about data. Who is using data, who is updating it etc. etc. Because, data is more costlier in term of money and information. So security of data is focusing concern in days. Lets' understand the Au…
Everyone has problem when going to load data into Data warehouse (EDW). They all need to confirm that data quality is good but they don't no how to proceed. Microsoft has provided new task within SSIS 2008 called "Data Profiler Task". It solve th…
Using examples as well as descriptions, and references to Books Online, show the documentation available for datatypes, explain the available data types and show how data can be passed into and out of variables.
Viewers will learn how the fundamental information of how to create a table.

771 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

14 Experts available now in Live!

Get 1:1 Help Now