ANAT2403
asked on
SSIS - Import Excel file
I want to import an Excel file to a table in Sql server with the SSIS
The data in the Excel file is organized as a matrix:
Sep Oct Nov Dec
Target -------- ------- -------- -------
---------
Target1 20 50 70 10
Target2 30 50 20 30
Taget3 50 70 90 10
The Table in the Database Table1 has the following fields:
Month, TargetName, amount
In the SSIS when I declare the Excel source it automatically arrange the fields not as I want.
How do I control the fields in the Excel Source to transfer it the the OLEDB Sql Server destination?
Thankyou
The data in the Excel file is organized as a matrix:
Sep Oct Nov Dec
Target -------- ------- -------- -------
---------
Target1 20 50 70 10
Target2 30 50 20 30
Taget3 50 70 90 10
The Table in the Database Table1 has the following fields:
Month, TargetName, amount
In the SSIS when I declare the Excel source it automatically arrange the fields not as I want.
How do I control the fields in the Excel Source to transfer it the the OLEDB Sql Server destination?
Thankyou
ASKER
Hi,
I don't understanf what you are saying and I don't see an answer to my problem.
I repeat the question with some corrections:
I want to import (source) an Excel file to (destination) a table in Sql server with the SSIS
The data in the Excel file is organized as a matrix:
Sep Oct Nov Dec
Target -------- ------- -------- -------
---------
Target1 20 50 70 10
Target2 30 50 20 30
Taget3 50 70 90 10
The Table in the Database Table1 has the following fields:
Month, TargetName, amount
In the SSIS when I declare the Excel source it automatically arrange the fields not as I want.
there is a problem because I have a matrix.
I want to get the following values in table1:
Month, TargetName, amount
------- --------------- -----------
sep target1 20
oct target1 50
nov target1 70
dec target1 10
sep target2 30
oct target2 50
nov target2 20
dec target2 30
sep target3 50
oct target3 70
nov target3 90
dec target3 10
How do I control the fields in the Excel Source to transfer it the the OLEDB Sql Server destination?
I don't understanf what you are saying and I don't see an answer to my problem.
I repeat the question with some corrections:
I want to import (source) an Excel file to (destination) a table in Sql server with the SSIS
The data in the Excel file is organized as a matrix:
Sep Oct Nov Dec
Target -------- ------- -------- -------
---------
Target1 20 50 70 10
Target2 30 50 20 30
Taget3 50 70 90 10
The Table in the Database Table1 has the following fields:
Month, TargetName, amount
In the SSIS when I declare the Excel source it automatically arrange the fields not as I want.
there is a problem because I have a matrix.
I want to get the following values in table1:
Month, TargetName, amount
------- --------------- -----------
sep target1 20
oct target1 50
nov target1 70
dec target1 10
sep target2 30
oct target2 50
nov target2 20
dec target2 30
sep target3 50
oct target3 70
nov target3 90
dec target3 10
How do I control the fields in the Excel Source to transfer it the the OLEDB Sql Server destination?
I see now. The Excel connection doesn't provide for doing this kind of transformation, but I think the work could be done in a script task that writes a extra row of data for each incoming row, for each additional column beyond the first.
If, however, you are going to run this package over and over again and the number of columns in the spreadsheet can change with each run, I think you would have to edit the package and update the data source and transformation task each time to reflect the change, unless you wanted to actually build or finalize the package at run time, which would be quite a bit more work.
Is this something that will only be run once, or is the number of input columns constant (for example, twelve columns, January thru December)?
If, however, you are going to run this package over and over again and the number of columns in the spreadsheet can change with each run, I think you would have to edit the package and update the data source and transformation task each time to reflect the change, unless you wanted to actually build or finalize the package at run time, which would be quite a bit more work.
Is this something that will only be run once, or is the number of input columns constant (for example, twelve columns, January thru December)?
ASKER
Hi,
When you say a script task what do you mean can you give an exam[le or link to explanation about it?
Do I still have to consider the source as an Excel file or to take something else as a source?
Try to give me me links or examples to solve my problems.
I might run this more then once
When you say a script task what do you mean can you give an exam[le or link to explanation about it?
Do I still have to consider the source as an Excel file or to take something else as a source?
Try to give me me links or examples to solve my problems.
I might run this more then once
I meant to say "script transformation." The data flow script component can be used as a transformation. You could insert one between the Excel source and the SQL destination. That works well as long as the source columns don't change. The script would scan the columns in each input record other than "Target" and write one output record for each "month" input column, and would repeat that until there were no more input records.
If that is going to work for you, I can provide an example of the script code. First, though, I am trying to find out if that is a practical solution in your case. If you are going to run the package repeatedly with worksheets containing different numbers of columns or different column names, you will have to manually edit the package each time there is a change in order to avoid validation failures. If you also need it to run without such manual intervention then we would need to think about solving the problem a different way.
If that is going to work for you, I can provide an example of the script code. First, though, I am trying to find out if that is a practical solution in your case. If you are going to run the package repeatedly with worksheets containing different numbers of columns or different column names, you will have to manually edit the package each time there is a change in order to avoid validation failures. If you also need it to run without such manual intervention then we would need to think about solving the problem a different way.
Oh, script data flow components are described at http://msdn2.microsoft.com/en-us/library/ms137640.aspx.
ASKER
Hi,
My source column will not change but I have to go in the Excel file to specific row and column.
An example will help me a lot.
The question is how can I refer to specific row in the Excel file or how do I know in which row I am?
Thanks a lot
Anat
My source column will not change but I have to go in the Excel file to specific row and column.
An example will help me a lot.
The question is how can I refer to specific row in the Excel file or how do I know in which row I am?
Thanks a lot
Anat
A script transformation is called once for each row supplied by its input. The data for each of the columns can be found as properties in an object called "Row". If you sent the data from your example spreadsheet to a script task, the columns would appear as properties named Row.Target, Row.Sep, Row.Oct, Row.Nov and Row.Dec. You could the column data to determine which row you are in, or you could count the rows.
ASKER
Hi,
This sounds very interesting. can you give me an example on how to count rows and how to determine in which row I am?
This sounds very interesting. can you give me an example on how to count rows and how to determine in which row I am?
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Hi rscowden,
First I would like to thankyou a lot for the examples you sent me and the time you spent for this.
The example seems to answer my problems.
The second part worked for me excellent and in the first part I had small problems.
The problems were:
I declared the 2 variables in SSIS variables but when I called the Variables.BatchID they were not recognized. Do I have to write this in a different way?
Did you mean that I'll have 2 packages with 2 scripts or that everything will be in the same script.
Thanks alot and ofcourse I'll give you all the points soon.
Anat
First I would like to thankyou a lot for the examples you sent me and the time you spent for this.
The example seems to answer my problems.
The second part worked for me excellent and in the first part I had small problems.
The problems were:
I declared the 2 variables in SSIS variables but when I called the Variables.BatchID they were not recognized. Do I have to write this in a different way?
Did you mean that I'll have 2 packages with 2 scripts or that everything will be in the same script.
Thanks alot and ofcourse I'll give you all the points soon.
Anat
You're welcome!
The Batch_ID column isn't important--I left it in when I copied the example from an actual SSIS package. To be included in the output, both new columns would have to be added to the column collection of the default output, if they didn't exist in the source. The Task_ID is an int (4-byte signed), while the Batch_ID is a unique identifier. If you only need the row count to use within the script and don't need it to be part of the output, you can do that as well.
The important thing is that the variable that maintians the count is declared Static, so that it is only initialized once. It retains its value between rows.
The two examples were to demonstrate two different things--keeping a count of rows, and writing several rows out for each row in. If all you need to do is to reformat the data, the second example is all you should need. If you want to do other things that involve knowing the row number, the first example shows a way to obtain it.
The Batch_ID column isn't important--I left it in when I copied the example from an actual SSIS package. To be included in the output, both new columns would have to be added to the column collection of the default output, if they didn't exist in the source. The Task_ID is an int (4-byte signed), while the Batch_ID is a unique identifier. If you only need the row count to use within the script and don't need it to be part of the output, you can do that as well.
The important thing is that the variable that maintians the count is declared Static, so that it is only initialized once. It retains its value between rows.
The two examples were to demonstrate two different things--keeping a count of rows, and writing several rows out for each row in. If all you need to do is to reformat the data, the second example is all you should need. If you want to do other things that involve knowing the row number, the first example shows a way to obtain it.
ASKER
Hi rscowden,
Thanks again.
Just one small question I want to understand for the future.
If I declare variables like here, in the SSIS variables as a package scope and then I want to call
them as you did 'Variables.BatchID' or 'Variables.idStart'
This command didn't work. Do you know why? How do I call the variables that I declared?
Thankyou
Anat
Thanks again.
Just one small question I want to understand for the future.
If I declare variables like here, in the SSIS variables as a package scope and then I want to call
them as you did 'Variables.BatchID' or 'Variables.idStart'
This command didn't work. Do you know why? How do I call the variables that I declared?
Thankyou
Anat
Be sure to list the variables as read/write or read-only as appropriate on the Script page of the script task editor. You can also lock variables using Dts.VariableDispenser, but there is usually no reason to do it that way.
Data flow components tend to list the most recently added columns last, but this shouldn't matter because the column oder in the lists does not have any special meaning. When you are viewing a list of columns, you can usually click on the column heading to sort the names.