Solved

MSSQL Parse XML Values Into SQL Fields

Posted on 2016-08-09
16
98 Views
Last Modified: 2016-08-17
Hello...

We inherited a small project in MSSQL.  The database has a field in an SQL table called "Address", which, literally, has the following sample values stored in it (6 rows follow as an example) :

Wilson Photography<br>333 54th Ave. North<br>Nashville TN 37209
Wyatt Archaelogical Research<br>4313 Green Hills Rd<br>Nashville TN 37174-5132
Ed Perdue<br>311 Hobbs Dr<br>White House TN 37188
NULL
NULL
Maury Wahtera<br>407 Normandy Circle<br>Nashville TN 37209

What we need...is to be able to have these values...parsed out into "proper" fields in a VIEW, as follows:
NAME, ADDRESS, CITY, STATE, ZIPCODE

...wherever the field value is NOT NULL (obviously :-)

We see that those values are separated by a "<br>" string, but we don't know the syntax, to parse this (XML?) value out.  Any help would be very much appreciated.  Thank you!...Mark
0
Comment
Question by:datatechcorp
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 8
  • 7
16 Comments
 
LVL 41

Expert Comment

by:Sharath
ID: 41749612
What is the datatype of this Address  field? XML or VARCHAR?
0
 

Author Comment

by:datatechcorp
ID: 41749621
Hi Sharath...

It's an NVARCHAR(1000), and it allows NULLS.  Thanks!...Mark
0
 
LVL 12

Expert Comment

by:funwithdotnet
ID: 41749637
Parsing strings in T-SQL is kind of a pita, unless you are using a version 13 (SQL Server 2016) database, which you could use just to do the parsing.

Using a v13 database, you can:
select * from STRING_SPLIT(replace('Wilson Photography<br>333 54th Ave. North<br>Nashville TN 37209', '<br>', '~') , '~' )  

Open in new window

That will return three rows:
Wilson Photography
333 54th Ave. North
Nashville TN 37209
0
Edgartown IT Case Study

Learn about Edgartown's quest to ensure the safety and security of the entire town's employee and citizen data. Read the case study!

 
LVL 41

Expert Comment

by:Sharath
ID: 41749802
Can I assume that the max no.of columns expected are 5?
0
 

Author Comment

by:datatechcorp
ID: 41749844
Yes, correct, 5-Columns max.  Unfortunately, we are stuck with the MSSQL 2012 Iteration, and cannot go beyond that (sigh).  Thoughts?  Thanks!...Mark
0
 
LVL 41

Expert Comment

by:Sharath
ID: 41750953
Check this.
declare @table table (col varchar(100))
insert @table values 
('Wilson Photography<br>333 54th Ave. North<br>Nashville TN 37209'),
('Wyatt Archaelogical Research<br>4313 Green Hills Rd<br>Nashville TN 37174-5132'),
('Ed Perdue<br>311 Hobbs Dr<br>White House TN 37188'),
(NULL),
(NULL),
('Maury Wahtera<br>407 Normandy Circle<br>Nashville TN 37209')

;with cte1 as (select *,charindex('<br>',col) idx1, charindex('<br>', col, charindex('<br>',col)+1) idx2 from @table),
      cte2 as (select col,left(col,idx1-1) Name, substring(col,idx1+4,idx2-idx1-4) Address,substring(col,idx2+4,len(col)) str3 from cte1),
      cte3 as (select *,reverse(str3) rev_str3 from cte2),
	  cte4 as (select *,charindex(' ',rev_str3) idx1, charindex(' ',rev_str3,charindex(' ',rev_str3)+1) idx2 from cte3)
select Name, Address,
       reverse(rtrim(ltrim(substring(rev_str3,idx2,len(rev_str3))))) City,
	   reverse(rtrim(ltrim(substring(rev_str3,idx1,idx2-idx1)))) State,
	   reverse(left(rev_str3,idx1-1)) ZipCode
  from cte4
/*
Name	Address	City	State	ZipCode
Wilson Photography	333 54th Ave. North	Nashville	TN	37209
Wyatt Archaelogical Research	4313 Green Hills Rd	Nashville	TN	37174-5132
Ed Perdue	311 Hobbs Dr	White House	TN	37188
NULL	NULL	NULL	NULL	NULL
NULL	NULL	NULL	NULL	NULL
Maury Wahtera	407 Normandy Circle	Nashville	TN	37209
*/
 

Open in new window

0
 

Author Comment

by:datatechcorp
ID: 41750982
Hi Sharath...

Thank you for responding.  If my Existing TableName is "NSCustomer"...and that FieldName that we want to parse into our select statement is "NSCustomer.Address"...how should this query syntax look?  Please help...and I apologize for our ignorance :-)

Thanks!...Mark
0
 
LVL 41

Accepted Solution

by:
Sharath earned 500 total points
ID: 41751094
;with cte1 as (select *,charindex('<br>',Address) idx1, charindex('<br>', Address, charindex('<br>',Address)+1) idx2 from NSCustomer),
      cte2 as (select Address original_Address,left(Address,idx1-1) Name, substring(Address,idx1+4,idx2-idx1-4) Address,substring(Address,idx2+4,len(Address)) str3 from cte1),
      cte3 as (select *,reverse(str3) rev_str3 from cte2),
      cte4 as (select *,charindex(' ',rev_str3) idx1, charindex(' ',rev_str3,charindex(' ',rev_str3)+1) idx2 from cte3)
select Name, Address,
       reverse(rtrim(ltrim(substring(rev_str3,idx2,len(rev_str3))))) City,
       reverse(rtrim(ltrim(substring(rev_str3,idx1,idx2-idx1)))) State,
       reverse(left(rev_str3,idx1-1)) ZipCode
  from cte4

Open in new window

0
 

Author Comment

by:datatechcorp
ID: 41751296
Hi Sharath...

That seemed to work beautifully...until it got to the "umpteenth" record in the NSCustomer table...and then we got the following error:

Invalid length parameter passed to the LEFT or SUBSTRING function.

How can we avoid this?  Any thoughts here?  The value of the FIELD is as follows, in the "next" record in the Table...and we didn't see anything "squirrely" with the data:

ArNells Art Photography<br>864 Joseph Ave.<br>Nashville TN
0
 
LVL 41

Expert Comment

by:Sharath
ID: 41751324
You don't have ZipCode for this value. Do you have any other part (City/State) missing for any row?
Also, what is your SQL version?
0
 

Author Comment

by:datatechcorp
ID: 41751405
Hi Sharath...

MSSQL 2012.  And, yes, it's very conceivable...that ANY part of the parsed field values...may be "missing".  It depends what the doggone Salespersons keyed in to the Customer's account record.  There are over 45,000 records, so, yes, certainly, I'm sure "parts" may be missing.  UUGH...I know,

Thoughts?  And your assistance has been *very much* appreciated...Thanks!...Mark
0
 
LVL 41

Expert Comment

by:Sharath
ID: 41752661
Is it safe assumption that the STATE is always a 2-character string?
Are these assumptions fair?
  • If a 2-character string is missing, that means STATE part is missing.
  • If numerical string is missing, ZipCode is missing.
0
 

Author Comment

by:datatechcorp
ID: 41752676
Hi Sharath...

Perhaps.  Let me review the data...I should have an answer within the hour, OK?  Thanks!...Mark
0
 

Author Comment

by:datatechcorp
ID: 41753170
Hi Sharath...

OK...sooooo...the data is kind of...all over the place...where "CITY", "STATE", and/or "ZIPCODE" are concerned.  That's because (a) in some cases the STATE was left out of the data entry, (b) in some cases there are International Addresses and the value of 'Canada' or 'Australia' exists, (c) in some cases there are NO ZipCodes at all...and so forth.

For the *most* part, they are all "in line" with the structure of the fields we require...but probably a good 15-20% or so of the data is 'whacky'.

I think as long as the data gets parsed, somehow, someway, we can do a "data scrubbing" exercise "after the fact".  Seems to ALWAYS be the case, where "Addresses" are concerned, in a Customer Master table...make sense?

So, how should we proceed?  Please let me know asap...and, again, I very much appreciate the assistance...Thanks!...Mark
0
 
LVL 41

Expert Comment

by:Sharath
ID: 41755202
Can you post some examples for international address and how do you want to parse?
0
 

Author Comment

by:datatechcorp
ID: 41760015
Hi Sharath...

I've been busy doing some "data scrubbing"...and I believe we're all good for now.  Thank you sooooo much for all your assistance.  I'll let you know if I need anything further.  I'm awarding you all the points.

Again, Thanks!...Mark
0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Composite queries are used to retrieve the results from joining multiple queries after applying any filters. UNION, INTERSECT, MINUS, and UNION ALL are some of the operators used to get certain desired results.‚Äč
JSON is being used more and more, besides XML, and you surely wanted to parse the data out into SQL instead of doing it in some Javascript. The below function in SQL Server can do the job for you, returning a quick table with the parsed data.
Via a live example, show how to extract information from SQL Server on Database, Connection and Server properties
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function

728 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question