Link to home
Start Free TrialLog in
Avatar of lapucca
lapucca

asked on

How to import a column data that is delimited by double space?

Hi, I need to import/merge a table into multiple tables in the same database.  I don't really know ssis so I thought I'll do it in steps using tsql scripts.  I have a column, that has data like below
"Breast  Sarcoma  Head and Neck  Prostate Malignancies"

Open in new window

I want to import this data into a table that has a personId column and a column for each of the item above that is separated by 2 spaces.  How can I do that in query?  Thank you. the entries in the table inserted to would have something like this for above:
personID          Treatment area
10001               Breast  
10001               Sarcoma  
10001               Head and Neck
10001               Prostate Malignancies

Open in new window

Avatar of radcaesar
radcaesar
Flag of India image

On which strategy you split this string "Breast  Sarcoma  Head and Neck  Prostate Malignancies"

Am sure its not space.
ASKER CERTIFIED SOLUTION
Avatar of Alpesh Patel
Alpesh Patel
Flag of India image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of lapucca
lapucca

ASKER

Hi gdm ,
I ran your code as follow and got this error message though. select personID, x.* from dbo.doctors cross apply dbo.SplitIt(treatment_areas,'  ') x thank you.
Msg 8152, Level 16, State 14, Line 2
String or binary data would be truncated.

Rad, it is double space.  I tested it with Patel's split function.  
Patel, I need to apply this to all rows in this table's column but don't know how to do that with just that function you provided even thought that does a good job of spliting one row.  
thanks everyone.

Avatar of lapucca

ASKER

If I use the split function by Patel then it does a quickly flash display of result then an error,
select d.personID, x.* from dbo.doctors d cross apply dbo.Split(d.treatment_areas,'  ') x
Msg 536, Level 16, State 2, Line 2
Invalid length parameter passed to the RIGHT function.
How to fix this in that Split funciton?  thank you.
>Msg 8152, Level 16, State 14, Line 2
>String or binary data would be truncated.


Try increasing the size of the data column.

instead of

data varchar(100)

try

data varchar (2000)
Avatar of lapucca

ASKER

dgm, I'm getting something like this which doesn't break the column by double space and create a row with same personID for each of the token.  Thank You.
PersonID              Data
947	Preventive Health Care  Cholesterol
2438	Maternal-Fetal Medicine  
292	Fine Needle Aspiration Biopsies  Cytologic and Surgical Pathology Diagnostic Services
2208	Infertility  Fertility Preservation  In Vitro Fertilization  Egg Freezing  Polycystic Ovarian Syndrome
489	Family Medicine  Primary Care  Women's Health  Common Chronic Illness  Chronic Diseases  Preventative Health  
655	Refractive Surgery  Cataract Surgical Techniques and Innovations
770	Gynecologic Malignancies  Complex Pelvic Surgery
505	Clinical Neurophysiology  Epilepsy  
240	General Gastroenterology  Colon Cancer Screening  Irritable Bowel Syndrome  Drug-Induced Liver Disease
727	Musculoskeletal Tumors  Joint Reconstruction  Musculoskeletal Infections
2435	Non-Surgical Treatment of Complex Vascular Disease  Peripheral Arterial Disease  Peripheral Vascular Disease  Carotid Disease  Aortic Aneurisms  Interventional Cardiology  Varicose Veins
1863	Cardiology  Cardiovascular Disease
1179	Neuroimmunology  Multiple Sclerosis  
2265	Genitourinary Pathology  Breast Pathology  Obstetrics and Gynecology Pathology  Dermatopathology  
2196	Medical and Surgical Retinal Diseases  Macular Degeneration  Diabetic Retinopathy  Vascular Diseases of the Eye  Uveitis
1134	Child Abuse

Open in new window

Avatar of lapucca

ASKER

Patel, is there a way to have the split function you provided to take double space as a delimiter?  Thank you.
create table table1 (id int, c varchar(5000))
insert into table1(id,c) select 1,'Breast  Sarcoma  Head and Neck  Prostate Malignancies'

create table table2(id int,c varchar(5000))
delete from table2
declare @id int,@v varchar(5000), @v1 varchar(5000), @nextone varchar(5000), @ci int

-- will insert from table1 to table2, splitting column "c" by double-space

select @id=id,@v=c from table1

set @v = @v + '  '
select @ci = charindex('  ',@v)
while @ci>0 begin
      set @nextone = substring(@v,1,@ci)
      insert into table2(id,c) select @id,@nextone
      select @v = substring(@v,@ci+2,len(@v)-2)
      select @ci = charindex('  ',@v)
end


select * from table2


result:
1      Breast
1      Sarcoma
1      Head and Neck
1      Prostate Malignancies
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of lapucca

ASKER

Hi Vad,
I ran your code as follow for about 20 minutes and then I stopped the query.  It created a table with a repeating id column for several hundred thousand rows.  The table 2 c column is empty for all rows.  The query looks about right to me except where does it fetch the next row?  It seems to be fetching the same 1st row and process it repeatedly.  Thank you.

create table table2(id int,c varchar(5000))
delete from table2
declare @id int,@v varchar(5000), @v1 varchar(5000), @nextone varchar(5000), @ci int

-- will insert from table1 to table2, splitting column "c" by double-space

select @id=personID,@v=treatment_areas from dbo.doctors

set @v = @v + '  '
select @ci = charindex('  ',@v)
while @ci>0 begin
      set @nextone = substring(@v,1,@ci)
      insert into table2(id,c) select @id,@nextone
      select @v = substring(@v,@ci+2,len(@v)-2)
      select @ci = charindex('  ',@v)
end


select * from table2

Open in new window

right, this was just an example, which extracted one line, parsed it, and put in the lines. To process many lines, create a cursor

declare c cursor local fast_forward for
select column1,column2 from dbo.doctors
fetch c into @id,@v
while @@fetch_status=0 begin
  <same parsing code as before>
  fetch c into @id,@v
end
close c
deallocate c

Avatar of lapucca

ASKER

Hi Vad,
Modified your code to this but got the following error:
Msg 156, Level 15, State 1, Line 21
Incorrect syntax near the keyword 'set'.

Error occurs at the following line in SSMS when I double click on the error
              set @v = substring(@v,@ci+2,len(@v)-2)
if exists(select 1 from sys.objects where object_ID=OBJECT_ID(N'table2') and (type=N'U'))
	drop table table2;

create table table2(id int,treatment varchar(5000))
declare @id int,@v varchar(5000), @v1 varchar(5000), @nextone varchar(5000), @ci int

declare c cursor local fast_forward for
select personID,treatment_areas from dbo.doctors
open c
fetch c into @id,@v
select @ci = charindex('  ',@v)
set @v=rtrim(@v);
set @v = @v + '  '
while @@fetch_status=0 begin

	while @ci>0 
	begin
		  set @nextone = substring(@v,1,@ci)
		  insert into table2(@id,@nextone)
		  set @v = substring(@v,@ci+2,len(@v)-2)
		  set @ci = charindex('  ',@v)
	End

	fetch c into @id,@v
	set @v=rtrim(@v);
	set @v = @v + '  '
end
close c
deallocate c

Open in new window

If anyone is interested, John Moden has a great article on a blazing fast SQL Server splitter using Common Table Expressions.

http://www.sqlservercentral.com/articles/Tally+Table/72993/

But not even the fastest TSQL-based parsing can beat the CLR-based parsing solutions.
List of columns in insert into table2 is missing. Anyway, I'm only giving the idea/direction/example, I hope you should be able to write the code for your real scenario, and amake it work.
Avatar of lapucca

ASKER

Vad,
I'm a .net developer.  I know the logic of how to code.  What I need help here is Tsql syntax and functions I don't know.  I've already have to modify your example code with missing code and incorrect logic.  It's just bad practice to post code that gives error.   have a look below and you might see what you left out in your code example.  I need help with the syntax and tsql language and not programming.  
My code below still not working completely because it only returns a very small set of records and not quite correct either  but please don't bother to help with your attitude.  I do not appreciate your continuous insinuation of my inability to program.
if exists(select 1 from sys.objects where object_ID=OBJECT_ID(N'table2') and (type=N'U'))
	drop table table2;

create table table2(id int,treatment varchar(5000))
declare @id int,@v varchar(5000), @v1 varchar(5000), @nextone varchar(5000), @ci int

declare c cursor local fast_forward for
select personID,treatment_areas from dbo.doctors
open c
fetch c into @id,@v
set @v=rtrim(@v);
set @v = @v + '  '
set @ci = charindex('  ',@v)
while @@fetch_status=0 and len(@v)>2 begin

	while @ci>0 
	begin
		  set @nextone = substring(@v,1,@ci)
		  insert into table2(id, treatment) values(@id,@nextone)
		  if(len(@v)-2-len(@nextone) >0)
			begin
			  set @v = substring(@v,@ci+2,len(@v)-2 - len(@nextone))
			  set @ci = charindex('  ',@v)
			end
		  else 
			  set @ci=0
	End

	fetch c into @id,@v
	set @v=rtrim(@v);
	set @v = @v + '  '
	set @ci = charindex('  ',@v)
end
close c
deallocate c

Open in new window

>> dgm, I'm getting something like this which doesn't break the column by double space and create a row with same personID for each of the token.  Thank You.
dqmq has already provided a solution in http:#35452137 Are you still looking for a solution?
Avatar of lapucca

ASKER

Hi modus,
I appreciate what you're trying to say but I do dis-agree with you on most of it.  I was only replying in the  same way that Vad has expresses.  He was implying that I expect me to do the code myself.  Well, I was but the point is that his example has missing logic and wrong syntax on every example he provided.  I've been using your service for the past 5,6 years.  Yes, sometimes the expert provides code that doesn't quite work or if they haven't checked the syntax they would mention so.  
In any case, I thought he was not providing me with answer that he barely bother putting any thought in  and then saying things like "I hope you should be able to write the code for your real scenario,.."  Saying I cannot write my own code.  Well, what's wrong with me saying the he's not putting much thought about what he put out here either.  
You can check my record for the past 5, 6 years that I am always most grateful for the help I get here.  Please do not even start accusing me without getting all the fact.  That is Not appreciated!!!
Avatar of lapucca

ASKER

In SSMS, the delimiter displays as double space but it turns out after trying different ASCII char, it's actually LF, char(10).  That's way all the stuff I tried didn't work.  Thanks everyone.
For the record:

> his example has missing logic and wrong syntax on every example he provided

http:#ID:35502476 - the posted code was direct copy/paste from SSMS. I just copied it once again and ran, and of course got the same result. So much for "wrong syntax". The result seems to be 1:1 to what was expected in the original question, the same 4 lines - so much for "wrong logic".

http:#35513409 was pseudo-code with obvious placeholder <same parsing code as before>.

Finally, "Saying I cannot write my own code" - I guess that's when I said "I hope you should be able to write the code for your real scenario, and make it work". Somewhat unexpected interpretation.