Strip HTML tags

In a field (say htmlfield varchar(200)) in a table (mytable) I have some data in the format shown below.

<b>Label A</b><br>Some Label<br><b>Label B</b><br>Need This value<br><b>End Look</b><br>The value - between this is what i want<br>

I need a query to extract the value between the second pair of br tags, that is
"Need This value"
and the value between the third pair of br tags, that is
"The value - between this is what i want"

What I am actually trying to achieve is to display only the relevant data to the users in an Excel sheet (I am using the new database query option available in Excel to connect to SQL Server 2000).
I cannot change the way the data is inserted, nor can I make any datatype changes.

Thanks for any help.

I have seen http://www.planet-source-code.com/vb/scripts/ShowCode.asp?txtCodeId=197&lngWId=5
but I don't think this will work when I use it from Excel.
sajuks Asked:
 
Lowfatspread Commented:
sorry for delay

 select charindex('<BR>',htmlfield,charindex('<BR>',htmlfield,charindex('<BR>',htmlfield)+4)+4) + 4 as pos1,htmlfield
   from #mytable


The charindex function looks for the first parameter within the second parameter, and has an optional third parameter which specifies the
position in the second parameter at which to start the search...

so
charindex('<BR>',Htmlfield)  
searches Htmlfield for a <BR> and returns the position of the first occurrence
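
As a quick illustration of the two forms, here is a minimal sketch on a made-up literal (not data from the question). It assumes T-SQL on SQL Server:

```sql
-- Hypothetical literal: 'a' is at position 1, the first <BR> at 2, the second <BR> at 7.
SELECT CHARINDEX('<BR>', 'a<BR>b<BR>c')        AS first_br,   -- 2
       CHARINDEX('<BR>', 'a<BR>b<BR>c', 2 + 4) AS second_br   -- 7 (search starts at position 6)
```

CHARINDEX returns 0 when the search string is not found, which matters later when the result is used to compute a length.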

and charindex('<BR>',HTMLFIELD,charindex('<BR>',HTMLFIELD)+4)

The nested charindex returns the position at which the first '<BR>' is found.
Since '<BR>' is 4 characters long,
we add 4 to that position,
and that's where the outer charindex begins its search for a '<BR>'.
It returns the position of the second '<BR>' in the HTMLFIELD column.

so
charindex('<BR>',htmlfield,charindex('<BR>',htmlfield,charindex('<BR>',htmlfield)+4)+4) + 4
finds the third <BR> and adds 4 to give the start position of the text that follows the third <BR> in HTMLFIELD
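
The nesting can be unrolled into separate steps to see the intermediate positions. This is only a sketch: the variable names (@s, @br1, @br2, @br3) are made up for illustration, the string is the sample from the question, and matching '<BR>' against the lowercase '<br>' in the data assumes a case-insensitive collation.

```sql
DECLARE @s varchar(200)
DECLARE @br1 int, @br2 int, @br3 int

-- Sample row from the question
SET @s = '<b>Label A</b><br>Some Label<br><b>Label B</b><br>Need This value<br><b>End Look</b><br>The value - between this is what i want<br>'

SET @br1 = CHARINDEX('<BR>', @s)             -- first <BR>
SET @br2 = CHARINDEX('<BR>', @s, @br1 + 4)   -- second <BR>, searching just past the first
SET @br3 = CHARINDEX('<BR>', @s, @br2 + 4)   -- third <BR>, searching just past the second

-- With a case-insensitive collation this gives 15, 29, 47 and pos1 = 51
SELECT @br1 AS br1, @br2 AS br2, @br3 AS br3, @br3 + 4 AS pos1
```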



the outer query
Select case when htmlfield like '%<BR>%<BR>%<BR>%<BR>%' then substring(htmlfield,pos1,charindex('<BR>',htmlfield,pos1)-pos1) else HTMLFIELD END
  from (


The LIKE clause confirms that at least four <BR> tags are present in HTMLFIELD,
so that SUBSTRING can use the position of the fourth <BR> to determine the length of the text between the third and fourth <BR> tags.
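
To see why the guard matters, consider a hypothetical value with only three <BR> tags (this row is invented for illustration and is not from the question). The search for a fourth <BR> returns 0, so the computed length goes negative, which is presumably what produces the "Invalid length parameter" error without the CASE. A sketch:

```sql
DECLARE @s varchar(200), @pos1 int

SET @s = 'a<BR>b<BR>c<BR>d'   -- hypothetical row: only three <BR> tags

-- Same nested expression as the inner query: third <BR> is at 12, so pos1 = 16
SET @pos1 = CHARINDEX('<BR>', @s, CHARINDEX('<BR>', @s, CHARINDEX('<BR>', @s) + 4) + 4) + 4

-- No fourth <BR>, so CHARINDEX returns 0 and the length is 0 - 16 = -16
SELECT @pos1 AS pos1, CHARINDEX('<BR>', @s, @pos1) - @pos1 AS computed_length

-- With the LIKE guard the SUBSTRING branch is skipped and the value is returned unchanged
SELECT CASE WHEN @s LIKE '%<BR>%<BR>%<BR>%<BR>%'
            THEN SUBSTRING(@s, @pos1, CHARINDEX('<BR>', @s, @pos1) - @pos1)
            ELSE @s END AS result
```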

SO  substring(htmlfield,pos1,charindex('<BR>',htmlfield,pos1)-pos1)

gets the text starting at POS1, which was determined in the inner subquery,
and then uses charindex to calculate the position of the 4th <BR> in HTMLFIELD (by starting at the pos1 position).
Subtracting pos1 from this position gives the length of the text.
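
Worked through on the sample row from the question, the arithmetic looks like this. It is a sketch with made-up variable names (@s, @pos1, @br4) and again assumes a case-insensitive collation so that '<BR>' matches '<br>':

```sql
DECLARE @s varchar(200), @pos1 int, @br4 int

SET @s = '<b>Label A</b><br>Some Label<br><b>Label B</b><br>Need This value<br><b>End Look</b><br>The value - between this is what i want<br>'

-- Start of the text after the third <BR> (position 51 for this row)
SET @pos1 = CHARINDEX('<BR>', @s, CHARINDEX('<BR>', @s, CHARINDEX('<BR>', @s) + 4) + 4) + 4

-- Position of the fourth <BR>, found by searching from pos1 (position 66 for this row)
SET @br4 = CHARINDEX('<BR>', @s, @pos1)

-- Length is 66 - 51 = 15, so this returns 'Need This value'
SELECT @br4 - @pos1 AS text_length,
       SUBSTRING(@s, @pos1, @br4 - @pos1) AS extracted
```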

hth
 



 
sajuks (Author) Commented:
And please, no links; only a valid and working suggestion.
 
Lowfatspread Commented:
like this

Select substring(htmlfield,pos1,charindex('<BR>',htmlfield,pos1)-pos1)
  from (

   select charindex('<BR>',htmlfield,charindex('<BR>',htmlfield,charindex('<BR>',htmlfield)+4)+4) + 4 as pos1,htmlfield
   from #mytable

) as y
 
sajuks (Author) Commented:
I am getting the error:
Invalid length parameter passed to the substring function.
 
Lowfatspread Commented:
Select case when htmlfield like '%<BR>%<BR>%<BR>%<BR>%' then substring(htmlfield,pos1,charindex('<BR>',htmlfield,pos1)-pos1) else HTMLFIELD END
  from (

   select charindex('<BR>',htmlfield,charindex('<BR>',htmlfield,charindex('<BR>',htmlfield)+4)+4) + 4 as pos1,htmlfield
   from #mytable

) as y
 
Lowfatspread Commented:
Does all the data have at least 3 sets of <BR>?

What do you want to happen when it doesn't?

Is <BR> valid without a </BR>?
(My solution would change slightly if a </BR> was valid.)

hth

 
 
sajuks (Author) Commented:
Your query
Select case when htmlfield like '%<BR>%<BR>%<BR>%<BR>%' then substring(htmlfield,pos1,charindex('<BR>',htmlfield,pos1)-pos1) else HTMLFIELD END
  from (

   select charindex('<BR>',htmlfield,charindex('<BR>',htmlfield,charindex('<BR>',htmlfield)+4)+4) + 4 as pos1,htmlfield
   from #mytable

) as y

works perfectly in the respect that it retrieves the first set of data that I want.
How do I get the second part of the data?
In the first example that I gave, it is able to retrieve "Need This value".
Now I want the data "The value - between this is what i want".

>>>does all the data have at least 3 sets of <BR>?
Yes, that validation occurs, so there will always be three sets.

>>is <BR> valid without a </BR>? (my solution would change slightly if a </BR> was valid)
You can take the format that I gave in my first post as final. There will be no changes to it.

Sorry for not replying earlier, but I had no net access. Many thanks for your help so far.
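
The thread does not show the query that was eventually used for the second value. One possible extension of Lowfatspread's approach, offered only as a sketch, is to nest two more charindex calls so that the start position falls after the fifth <BR>, and to require six <BR> tags in the LIKE guard. The names pos2 and second_value are made up here, and the temp table is created just to make the example self-contained.

```sql
-- Throwaway test table holding the sample row from the question
CREATE TABLE #mytable (htmlfield varchar(200))
INSERT INTO #mytable (htmlfield)
VALUES ('<b>Label A</b><br>Some Label<br><b>Label B</b><br>Need This value<br><b>End Look</b><br>The value - between this is what i want<br>')

SELECT CASE WHEN htmlfield LIKE '%<BR>%<BR>%<BR>%<BR>%<BR>%<BR>%'
            THEN SUBSTRING(htmlfield, pos2, CHARINDEX('<BR>', htmlfield, pos2) - pos2)
            ELSE htmlfield END AS second_value
  FROM (
        -- Five nested CHARINDEX calls locate the fifth <BR>; + 4 skips past it
        SELECT CHARINDEX('<BR>', htmlfield,
                 CHARINDEX('<BR>', htmlfield,
                   CHARINDEX('<BR>', htmlfield,
                     CHARINDEX('<BR>', htmlfield,
                       CHARINDEX('<BR>', htmlfield) + 4) + 4) + 4) + 4) + 4 AS pos2,
               htmlfield
          FROM #mytable
       ) AS y

DROP TABLE #mytable
```

For the sample row, pos2 works out to 89 and the sixth <BR> is at 128, so the expression should return the 39 characters "The value - between this is what i want", assuming a case-insensitive collation.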


 
sajuks (Author) Commented:
Lowfatspread, did you get any time to look at the query that I'd posted? Thanks for your time, and my apologies for intruding on your work.
 
sajuks (Author) Commented:
I solved the query using your example, but can you help me understand what you've written, in English/pseudocode?
Question has a verified solution.
