Link to home
Start Free TrialLog in
Avatar of MariaHalt
MariaHaltFlag for United States of America

asked on

Parsing delimited data into their own fields

I have a table with a text field that contains data that has lot of dots in it.  The number of dots vary.  I parse the field into several fields by altering the table based on the number of dots I find plus 1.  CharacterCount is just a scalar function.  There must be a better, simpler way than the dynamic sql I'm using, if there is, please share it with me.  I've included a sample set of data to test with.  Thanks.
use tempdb

go

CREATE FUNCTION [dbo].[CharacterCount] ( @Input VARCHAR(1000), @Character CHAR(1) )
RETURNS INT
BEGIN

RETURN (LEN(@Input) - LEN(REPLACE(@Input, @Character, '')))

END

go

IF EXISTS (SELECT * FROM tempdb..sysobjects WHERE name = 'mh_table' AND type = 'U' AND uid = 1)
DROP TABLE tempdb.dbo.mh_table

go

CREATE TABLE tempdb.dbo.mh_table (id int not null, descr varchar(255) null, copy_of_descr varchar(255))

go

INSERT INTO tempdb.dbo.mh_table VALUES (1000, 'abc', 'abc')
INSERT INTO tempdb.dbo.mh_table VALUES (1001, 'def', 'def')
INSERT INTO tempdb.dbo.mh_table VALUES (1002, 'ghi', 'ghi')
INSERT INTO tempdb.dbo.mh_table VALUES (9, 'abc.level8', 'abc.level8')
INSERT INTO tempdb.dbo.mh_table VALUES (10, 'abc.level9', 'abc.level9')
INSERT INTO tempdb.dbo.mh_table VALUES (11, 'abc.level10', 'abc.level10')
INSERT INTO tempdb.dbo.mh_table VALUES (12, 'abc.level11', 'abc.level11')
INSERT INTO tempdb.dbo.mh_table VALUES (13, 'abc.level12', 'abc.level12')

INSERT INTO tempdb.dbo.mh_table VALUES (61, 'def.grade1', 'def.grade1')
INSERT INTO tempdb.dbo.mh_table VALUES (62, 'def.grade3', 'def.grade3')
INSERT INTO tempdb.dbo.mh_table VALUES (63, 'def.grade5', 'def.grade5')
INSERT INTO tempdb.dbo.mh_table VALUES (644, 'def.grade5', 'def.grade5')

INSERT INTO tempdb.dbo.mh_table VALUES (789, 'ghi.topicA.level10.a', 'ghi.topicA.level10.a')
INSERT INTO tempdb.dbo.mh_table VALUES (792, 'ghi.topicA.level10.b', 'ghi.topicA.level10.b')
INSERT INTO tempdb.dbo.mh_table VALUES (799, 'ghi.topicA.level10.c', 'ghi.topicA.level10.c')
INSERT INTO tempdb.dbo.mh_table VALUES (800, 'ghi.topicA.level10.x', 'ghi.topicA.level10.x')

go

SET QUOTED_IDENTIFIER OFF

DECLARE @SQL VARCHAR(500), 
		@FIELDS_TO_ADD INT,
		@LOOP INT

--Determine how many fields will need to be added to the table 
SELECT @FIELDS_TO_ADD = MAX(tempdb.dbo.CharacterCount(descr, '.')) + 1 FROM tempdb.dbo.mh_table

--Loop 1/3
--Alter table, add fields to parse complete number out
SELECT @LOOP = 1

WHILE @LOOP <= @FIELDS_TO_ADD
	BEGIN	
		SELECT @SQL = ""	
		SELECT @SQL = "ALTER TABLE tempdb.dbo.mh_table ADD CN" + CONVERT(VARCHAR, @LOOP) + " NVARCHAR(25) NULL " 
		
		EXEC (@SQL)
		
		--Try 1 of 2
		SELECT @SQL = ""
		SELECT @SQL = "UPDATE tempdb.dbo.mh_table " + 
                      "SET copy_of_descr = SUBSTRING(copy_of_descr, CHARINDEX('.', copy_of_descr, 1) + 1, LEN(copy_of_descr) - CHARINDEX('.', copy_of_descr, 1)), " + 
                      "    CN" + CONVERT(VARCHAR, @LOOP) + " = SUBSTRING(copy_of_descr, 1, CHARINDEX('.', copy_of_descr, 1) - 1) " +
                      "WHERE CHARINDEX('.', copy_of_descr, 1) - 1 >= 0 " 
                      
		--PRINT @SQL
        EXEC (@SQL)
        
        --Try 2 0f 2
        SELECT @SQL = ""
		SELECT @SQL = "UPDATE tempdb.dbo.mh_table " + 
                      "SET copy_of_descr = '', " +
                      "    CN" + CONVERT(VARCHAR, @LOOP) + " = copy_of_descr " +
                      "WHERE CHARINDEX('.', copy_of_descr, 1) = 0 " + 
                      "AND CN" + CONVERT(VARCHAR, @LOOP) + " IS NULL "
                      
		--PRINT @SQL
        EXEC (@SQL)
		
		SELECT @LOOP = @LOOP + 1
	END	

SELECT * FROM tempdb.dbo.mh_table

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of cyberkiwi
cyberkiwi
Flag of New Zealand image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of MariaHalt

ASKER

What's the split into rows function?
This question is something similar to what you are doing, see my comment there.

https://www.experts-exchange.com/questions/26559811/Extracting-Data-from-a-String.html#33956075
And how do the rows become fields, sorry, I've never used pivot before.
Is your data to be stored permanently into fields?
Do you want to go through splitting each time data is queried?

Have a read of the article first and let me know if you have specific problems with it.
No...the splitting is really there so that I can fix the pieces of data that contain both letters and numbers, so it can later be sorted alphabetically.  For example, abc.G5level10, comes before abc.G5level8.  So I'm splitting them up into 'abc' and 'G5level8', then using a function, inserting a ~ between the letters and numbers to yield 'G~5~level~8', just so that I can split them up again into their own fields, 'G', '5', 'level', '8' and then find the max character lengh on the field so that I can stick a leading zero in front of the numbers so they will sort correctly.  Once I got the sort right, I insert the records into a table that uses an identity field to keep the original data in order.
For example, abc.G05level010, will then come after, abc.G05level008, alphabetically.
Well, how about using "magic numbers" to delimit dynamically in a single query...


select id, descr, substring(descr,n,charindex('.',descr+'.',n+1)-n) as component
from tempdb.dbo.mh_table
cross join (select number n from master..spt_values where type = 'P') numbers
where substring('.'+descr,n,1) = '.'
and n < len(descr)


have a read of the article : https://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/A_1221-Fun-with-MS-SQL-spt-values-for-delimited-strings-and-virtual-calendars.html 
As for replacing the numerics that will be fun...

How big is the table ? If you can give a good indication of size (ie count of distinct descriptions) then maybe a translation table might be the go.
Time ran out, I just stuck with the dynamic sql I originally had.  Thanks for responding.
Glad I didnt post my number conversion routine then *laughing*
Sorry Mark, If it comes up again, I'll give your solution a try.