Link to home
Start Free TrialLog in
Avatar of roddios
roddios

asked on

TSQL Removing HTML tags from a field

Hello Experts,
I'm not sure if this is possible but i'm trying to read a text field into a temporary table and get rid of all html tags in it.

select substring(descriprion,1,200) into #temp  from myTable

I want to get rid of html tags that might exist in that column. the problem is I can not simply use Replace() and have it hardcoded since it may contain different HTML (e.g. <font color=red size=20> my description </font>)


I'm pretty much lookin for suggestions to achieve this

Thanks in advance
Rod
Avatar of Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3]
Flag of Luxembourg image

>I'm not sure if this is possible but i'm trying to read a text field into a temporary table and get rid of all html tags in it.
TSQL is not really the best idea to do this job, I would rather use some other programming/scripting language, using the MSXML documents objects to get rid of the html tags...
ASKER CERTIFIED SOLUTION
Avatar of g_johnson
g_johnson
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of roddios
roddios

ASKER

i would rather do it myself in vb.net but that's what i was asked to do.

Thanks g_johnson, didn't know those functions exist! (CHARINDEX and PATINDEX)

Here's the code if somebody's interested. it's a dumb code but enough for my purposes

CREATE FUNCTION dbo.removeHTMLTag
(
@myStr VARCHAR(2000) =''
)
RETURNS varchar(2000)
AS  
BEGIN
DECLARE @result VARCHAR(2000)
DECLARE @index INT
DECLARE @leftPos INT
DECLARE @rightPos INT
DECLARE @toBeRemovedStr VARCHAR(2000)
SET @index = patindex('%<[A-Z,/]%>%', @myStr)
WHILE (@index <> 0)
BEGIN
set @leftPos = @index
set @rightPos = charindex('>',subString(@myStr,@index, len(@myStr)))
set @toBeRemovedStr = substring(@myStr,@leftPos,@rightPos)
set @myStr = replace(@myStr,@toBeRemovedStr,'')
SET @index = patindex('%<[A-Z,/]%>%', @myStr)
END
select @result = @myStr
return ltrim(rtrim(@result))
END
glad I could help