Solved

SSIS (Integration Services) question: how to duplicate rows in a data flow

Posted on 2010-08-18
Last Modified: 2013-11-30
I have a table with columns Last_Name, Address, Zip, etc.

Last_Name contains hyphenated names such as 'Tinker-Sackett' and 'Schiff-Seehafer'.

I would like to split those names into their two parts and return duplicate rows carrying all the other fields. For example, if I had a dataset with one row:
last_name = 'Tinker-Sackett'
zip = '99999'
I would want three records coming out of the data flow:
1 with 'Tinker-Sackett' as last_name and zip = '99999'
1 with 'Tinker' as last_name and zip = '99999'
1 with 'Sackett' as last_name and zip = '99999'

So, as you can see, I need the name split on the '-' and the '-' removed.

Is there a way to do this in a scripting object or even in SQL?

Thanks.
Question by:mdoolittle

Accepted Solution

by:
carsRST earned 250 total points
ID: 33469682
Within SSIS, you might use a Multicast transformation to split your dataset into three identical copies.

On each output from the multicast, use a Derived Column transformation to get just the data you want, with expressions to break up the last name.

See the link below for samples of parsing column data.
http://www.bidn.com/blogs/kylewalker/ssis/705/using-ssis-expressions-to-split-column-data

Each multicast output would then load its data into the same table.

You should end up with three rows for each source record.

See this link for how to use multicast:
http://www.bimonkey.com/2009/07/the-multicast-transformation/




Assisted Solution

by:incidental74
incidental74 earned 250 total points
ID: 33469999

Hi there mdoolittle:

While it is doable in SSIS, another option would be to use SQL. It just comes down to what you are more comfortable with. With this code, you will also get a handy table-valued function to split other delimited lists in the future; I have used it for years.

(big props to a colleague of mine for the function: James S.)


/**********BEGIN FUNCTION******************/
create function [dbo].[udf_ParseToString] (@StringIn varchar(8000), @Delimiter varchar(10))
returns @data table (val varchar(8000))
as
/*-----------------------------------------------------------------------------
Comments:	Returns result with a parsed value per row.
--------------------------------------------------------------------------------*/
begin
  declare @start int, @end int
  if @StringIn is null set @StringIn = ''
  set @start = 1
  while @start <= len(@StringIn)
  begin
    set @end = charindex(@Delimiter, @StringIn, @start)
    if @end = 0
      set @end = len(@StringIn)
    else
      set @end = @end - 1 -- prior to delimiter
    if @end >= @start
      insert @data values (substring(@StringIn, @start, @end - @start + 1))
    set @start = @end + len(@Delimiter) + 1 -- skip past delimiter
  end
  return
end
/**********END FUNCTION******************/



declare @TestyWesty table(
    LastName varchar(255)
    ,ZipCode varchar(255)
)

insert into @TestyWesty values('Tinker-Bell', '99999')



select  LastName, ZipCode
from    @TestyWesty
union
select  ParseIt.val as LastName, ZipCode
from    @TestyWesty
cross apply dbo.udf_ParseToString(LastName, '-') as ParseIt
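
For a large table, one variation worth testing is to swap UNION for UNION ALL and parse only the names that contain a hyphen. UNION has to de-duplicate the combined result, which usually means a sort or hash over every row; UNION ALL skips that step. The sketch below assumes a permanent table named dbo.People with Last_Name and Zip columns (both names are placeholders, not from the original post) and reuses dbo.udf_ParseToString from above. It has not been timed against real data.

```sql
-- Sketch only: dbo.People, Last_Name and Zip are hypothetical placeholder names.

-- Pass 1: every source row, unchanged.
select  p.Last_Name, p.Zip
from    dbo.People as p

union all

-- Pass 2: one extra row per name part, only for names that contain the delimiter.
select  ParseIt.val as Last_Name, p.Zip
from    dbo.People as p
cross apply dbo.udf_ParseToString(p.Last_Name, '-') as ParseIt
where   p.Last_Name like '%-%'
```

Note the trade-off: UNION ALL keeps duplicates that UNION would have removed, such as rows that already repeat in the source table or a name like 'Smith-Smith' that splits into two identical parts.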


Author Comment

by:mdoolittle
ID: 33470045
Nice parsing script. With my table of 30 million rows, I wonder whether the UNION query will slow things down. I will try it out.

I will also try out derived columns in SSIS.

Thanks!