Extracting Data from a String

Karl_mark
Karl_mark used Ask the Experts™
on
Given a string which looks like this:
'Workstation - Base Unit - Hard Drive'

How can I get the data into the format shown in the screenshot? In other words, split the text (which is delimited by ' - ') from the full string so that it is in separate columns?

I'm sure I could figure this out, but I've just spent several hours on another thorny SQL problem and need to speed up a bit!


 Target data
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Expert of the Quarter 2010
Expert of the Year 2010

Commented:
Will it always be 3 columns?
Try this:

SELECT field, LEFT(field,0,inStr(field,"-")) AS Cat1, SUBSTR(field,inStr(field,"-"),inStr(field,"-",2)) AS Cat2, RIGHT(field,inStr(field,"-",2)) As Cat3 FROM table

This probably will give you some syntax problem, as I don't remember now the exact syntax of these functions on SQL Server (maybe I'm giving you Access syntax, wich is similar but not exactly the same). However, I'm pretty sure you can solve your problem with this and ten minutes of googleing for syntax corrections.
Should you be charging more for IT Services?

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden using our free interactive tool and use it to determine the right price for your IT services. Start calculating Now!

Expert of the Quarter 2010
Expert of the Year 2010
Commented:
This will support up to 4 columns.  It won't split after the 4th delimiter, 4th column will contain the remainder.

Usage:

select FULL_NAME,
  ltrim(rtrim(m.col1)) cat1,
  ltrim(rtrim(m.col1)) cat2,
  ltrim(rtrim(m.col1)) cat3 -- ignore col4 if it will never contain anything
from tbl
cross apply dbo.values2columns4(FULL_NAME, '-') m
CREATE function dbo.values2columns4
(
@values varchar(max),
@separator char(1)
) returns @res table (col1 varchar(max), col2 varchar(max), col3 varchar(max), col4 varchar(max))
as
begin
declare @value varchar(50), @col1 varchar(max), @col2 varchar(max), @col3 varchar(max), @col4 varchar(max)
declare @commapos int, @lastpos int
set @commapos = 0
select @lastpos = @commapos, @commapos = charindex(@separator, @values, @lastpos+1)
while @commapos > @lastpos and @col3 is null
begin
	select @value = substring(@values, @lastpos+1, @commapos-@lastpos-1)
	if nullif(@value,'') <> '' begin
		if @col1 is null set @col1 = ltrim(rtrim(@value))
		else if @col2 is null set @col2 = ltrim(rtrim(@value))
		else set @col3 = ltrim(rtrim(@value))
	end
	select @lastpos = @commapos, @commapos = charindex(@separator, @values, @lastpos+1)
end
if @lastpos+1 < len(@values) set @value = substring(@values, @lastpos+1, len(@values))
if nullif(@value,'') <> '' begin
	if @col1 is null set @col1 = ltrim(rtrim(@value))
	else if @col2 is null set @col2 = ltrim(rtrim(@value))
	else if @col3 is null set @col3 = ltrim(rtrim(@value))
	else set @col4 = ltrim(rtrim(@value))
end
insert @res values (@col1, @col2, @col3, @col4)
return
end
GO

Open in new window

Author

Commented:
No, it will vary. For instance, the string prior to the one shown reads:

Workstation - Base Unit

Whereas there may also be a fourth value in the text. Ideally I'd be looking for generic code which can determine how many columns would be required. Failing that, I'll just hard code it to a set maximum...

Expert of the Quarter 2010
Expert of the Year 2010

Commented:
If you want an arbitrary number of return columns, it will be a lot more tricky and will involve dynamic SQL to generate the columns, either as shown here (code in the question itself)

http://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/Q_26558288.html

Or using split-column-to-rows + dynamic pivot

Code box will split a string into 2 columns of multiple rows (ID = column position, Item)

usage:
select t.*, m.id, m.one
from mytable t
outer apply dbo.multiRow(full_name, '-') m

You can then use the article
http://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/SQL-Server-2005/A_653-Dynamic-Pivot-Procedure-for-SQL-Server.html
Making use of the ID from this table as the PIVOT column.
CREATE function dbo.multiRow(@data varchar(max), @delims varchar(10))
returns table as return
with cte (id,one,rem)
as
(
select
	1,
	rtrim(ltrim(LEFT(@data, charindex(@delims,@data+@delims)-1))),
	substring(@data, charindex(@delims,@data+@delims)+LEN(@delims+'.')-1, LEN(@data))
where nullif(@data,'') is not null
union all
select
	id+1,
	rtrim(ltrim(LEFT(rem, charindex(@delims,rem+@delims)-1))),
	substring(rem, charindex(@delims,rem+@delims)+len(@delims+'.')-1, LEN(rem))
from cte
where nullif(rem,'') is not null
)
select id=row_number() over (order by id), one from cte where nullif(one,'') is not null
GO

Open in new window

Author

Commented:
Thanks Cyberkiwi. The first solution is the easiest and does what I want. If they want more in the future I'll just update it!

Author

Commented:
Used cyberwiki''s first piece of code. Thanks also to bardobrave for getting me thinking!

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial