Solved

SSIS - Import Excel file

Posted on 2007-04-11
14
1,526 Views
Last Modified: 2013-11-30
I want to import an Excel file to a table in Sql server with the SSIS
The data in the Excel file is organized as a matrix:
                       Sep          Oct           Nov               Dec
 Target         --------       -------        --------         -------
---------  
Target1        20             50                70                 10
Target2       30             50                20                   30
Taget3        50            70                90                   10

The Table in the Database Table1 has the following fields:
Month,    TargetName,         amount
In the SSIS when I declare the Excel source it automatically arrange the fields not as I want.
How do I control the fields in the Excel Source to transfer it the the OLEDB Sql Server destination?
Thankyou      
0
Comment
Question by:ANAT2403
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 8
  • 6
14 Comments
 
LVL 15

Expert Comment

by:Megan Brooks
ID: 18890660
If the destination is a table, the order of the columns in the destination is determined by the table definition. What kind of destination are you using?

Data flow components tend to list the most recently added columns last, but this shouldn't matter because the column oder in the lists does not have any special meaning. When you are viewing a list of columns, you can usually click on the column heading to sort the names.
0
 

Author Comment

by:ANAT2403
ID: 18893286
Hi,
I don't understanf what you are saying and I don't see an answer to my problem.
I repeat the question with some corrections:

I want to import (source) an Excel file to  (destination) a table in Sql server with the SSIS
The data in the Excel file is organized as a matrix:
                       Sep          Oct           Nov               Dec
 Target         --------       -------        --------         -------
---------  
Target1        20             50                70                 10
Target2       30             50                20                   30
Taget3        50            70                90                   10

The Table in the Database Table1 has the following fields:
Month,    TargetName,         amount
In the SSIS when I declare the Excel source it automatically arrange the fields not as I want.
there is a problem because I have a matrix.
I want to get the following values in table1:
Month,    TargetName,   amount
-------     ---------------    -----------
sep          target1              20
oct            target1             50
nov            target1           70
dec            target1           10
sep            target2          30
oct           target2          50
nov         target2         20
dec         target2          30
sep         target3         50
oct           target3       70
nov          target3       90
dec         target3       10

How do I control the fields in the Excel Source to transfer it the the OLEDB Sql Server destination?
0
 
LVL 15

Expert Comment

by:Megan Brooks
ID: 18893584
I see now. The Excel connection doesn't provide for doing this kind of transformation, but I think the work could be done in a script task that writes a extra row of data for each incoming row, for each additional column beyond the first.

If, however, you are going to run this package over and over again and the number of columns in the spreadsheet can change with each run, I think you would have to edit the package and update the data source and transformation task each time to reflect the change, unless you wanted to actually build or finalize the package at run time, which would be quite a bit more work.

Is this something that will only be run once, or is the number of input columns constant (for example, twelve columns, January thru December)?
0
What is SQL Server and how does it work?

The purpose of this paper is to provide you background on SQL Server. It’s your self-study guide for learning fundamentals. It includes both the history of SQL and its technical basics. Concepts and definitions will form the solid foundation of your future DBA expertise.

 

Author Comment

by:ANAT2403
ID: 18895989
Hi,
When you say a script task what do you mean can you give an exam[le or link to explanation about it?
Do I still have to consider the source as an Excel file or to take something else as a source?
Try to give me me links or examples to solve my problems.
I might run this more then once
0
 
LVL 15

Expert Comment

by:Megan Brooks
ID: 18896143
I meant to say "script transformation." The data flow script component can be used as a transformation. You could insert one between the Excel source and the SQL destination. That works well as long as the source columns don't change. The script would scan the columns in each input record other than "Target" and write one output record for each "month" input column, and would repeat that until there were no more input records.

If that is going to work for you, I can provide an example of the script code. First, though, I am trying to find out if that is a practical solution in your case. If you are going to run the package repeatedly with worksheets containing different numbers of columns or different column names, you will have to manually edit the package each time there is a change in order to avoid validation failures. If you also need it to run without such manual intervention then we would need to think about solving the problem a different way.
0
 
LVL 15

Expert Comment

by:Megan Brooks
ID: 18896151
Oh, script data flow components are described at http://msdn2.microsoft.com/en-us/library/ms137640.aspx.
0
 

Author Comment

by:ANAT2403
ID: 18896437
Hi,
My source column will not change but I have to go in the Excel file to specific row and column.
An example will help me a lot.
The question is how can I refer to specific row in the Excel file or how do I know in which row I am?
Thanks a lot
Anat
0
 
LVL 15

Expert Comment

by:Megan Brooks
ID: 18899609
A script transformation is called once for each row supplied by its input. The data for each of the columns can be found as properties in an object called "Row". If you sent the data from your example spreadsheet to a script task, the columns would appear as properties named Row.Target, Row.Sep, Row.Oct, Row.Nov and Row.Dec. You could the column data to determine which row you are in, or you could count the rows.
0
 

Author Comment

by:ANAT2403
ID: 18908302
Hi,
This sounds very interesting. can you give me an example on how to count rows and  how to determine in which row I am?
0
 
LVL 15

Accepted Solution

by:
Megan Brooks earned 500 total points
ID: 18909376
Here is the code from a simple script transformation that maintains a row count (Task_ID) and also inserts a new column (Batch_ID) into the data:
Public Class ScriptMain
    Inherits UserComponent
    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        Static Dim Batch_ID As Guid = New Guid(Variables.BatchID)
        Static Dim Task_ID As Integer = Variables.idStart
        Row.BatchID = Batch_ID
        Task_ID += 1
        Row.TaskIDident = Task_ID
    End Sub
End Class

The starting value for Task_ID comes from a variable, but it could just as easily be initialized to zero, so that it reflects the number of the current row.

I created a simple package that reads your sample data above and writes an Excel file that looks like this:
Month      TargetName      Amount
Sep      Target1      20
Oct      Target1      50
Nov      Target1      70
Dec      Target1      10
Sep      Target2      30
Oct      Target2      50
Nov      Target2      20
Dec      Target2      30
Sep      Taget3      50
Oct      Taget3      70
Nov      Taget3      90
Dec      Taget3      10

The script code looks like this:
Public Class ScriptMain
    Inherits UserComponent
    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        Output1Buffer.AddRow()
        Output1Buffer.TargetName = Row.Target
        Output1Buffer.Month = "Sep"
        Output1Buffer.Amount = Row.Sep
        Output1Buffer.AddRow()
        Output1Buffer.TargetName = Row.Target
        Output1Buffer.Month = "Oct"
        Output1Buffer.Amount = Row.Oct
        Output1Buffer.AddRow()
        Output1Buffer.TargetName = Row.Target
        Output1Buffer.Month = "Nov"
        Output1Buffer.Amount = Row.Nov
        Output1Buffer.AddRow()
        Output1Buffer.TargetName = Row.Target
        Output1Buffer.Month = "Dec"
        Output1Buffer.Amount = Row.Dec
    End Sub
End Class

This script references the columns by name, because that was all I had time for. I created a data flow with Excel source and destination, and I placed a script transformation component between them in the flow. In the script component under Inputs and Outputs I created a second output, Output1, and added the columns Month (unicode string), TargetName (unicode string), and Amount (double) to it. Doing that creates the Output1Buffer object for the script to use.

The script component uses a second output because that's the only way I could think of to write more than one row of output for one row of input. The default output data, which in this case would be identical to the input, is discarded. When you first connect the script component to the destination, the designer prompts for which output to use, and you must select Output1, the added output, not Output0.
0
 

Author Comment

by:ANAT2403
ID: 18911954
Hi rscowden,
First I would like to thankyou a lot for the examples you sent me and the time you spent for this.
The example seems to answer my problems.
The second part worked for me excellent and in the first part I had small problems.
The problems were:
I declared the 2 variables in SSIS variables but when I called the Variables.BatchID they were not recognized. Do I have to write this in a different way?
Did you mean that I'll have 2 packages with 2 scripts or that everything will be in the same script.
Thanks alot and ofcourse I'll give you all the points soon.
Anat
0
 
LVL 15

Expert Comment

by:Megan Brooks
ID: 18912177
You're welcome!

The Batch_ID column isn't important--I left it in when I copied the example from an actual SSIS package. To be included in the output, both new columns would have to be added to the column collection of the default output, if they didn't exist in the source. The Task_ID is an int (4-byte signed), while the Batch_ID is a unique identifier. If you only need the row count to use within the script and don't need it to be part of the output, you can do that as well.

The important thing is that the variable that maintians the count is declared Static, so that it is only initialized once. It retains its value between rows.

The two examples were to demonstrate two different things--keeping a count of rows, and writing several rows out for each row in. If all you need to do is to reformat the data, the second example is all you should need. If you want to do other things that involve knowing the row number, the first example shows a way to obtain it.
0
 

Author Comment

by:ANAT2403
ID: 18913091
Hi rscowden,
Thanks again.
Just one small question I want to understand for the future.
If I declare variables like here,  in the SSIS variables as a package scope and then I want to call
them as you did   'Variables.BatchID'   or 'Variables.idStart'
This command didn't work. Do you know why? How do I call the variables that I declared?
Thankyou
Anat
0
 
LVL 15

Expert Comment

by:Megan Brooks
ID: 18914001
Be sure to list the variables as read/write or read-only as appropriate on the Script page of the script task editor. You can also lock variables using Dts.VariableDispenser, but there is usually no reason to do it that way.
0

Featured Post

Windows Server 2016: All you need to know

Learn about Hyper-V features that increase functionality and usability of Microsoft Windows Server 2016. Also, throughout this eBook, you’ll find some basic PowerShell examples that will help you leverage the scripts in your environments!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

International Data Corporation (IDC) prognosticates that before the current the year gets over disbursing on IT framework products to be sent in cloud environs will be $37.1B.
I have a large data set and a SSIS package. How can I load this file in multi threading?
Using examples as well as descriptions, and references to Books Online, show the documentation available for datatypes, explain the available data types and show how data can be passed into and out of variables.
Viewers will learn how to use the SELECT statement in SQL and will be exposed to the many uses the SELECT statement has.

751 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question