Solved

SQL: remove HTML formatting

Posted on 2011-09-14
Last Modified: 2012-05-12
I need to remove HTML formatting from text, for example bold, italics, etc. How do I do that in SQL Server 2008?
Or is it better to do it in C#?
Question by:johnkainn
3 Comments
 
LVL 39

Assisted Solution

by:Pratima Pharande
Pratima Pharande earned 83 total points
ID: 36534942
CREATE FUNCTION [dbo].[udf_StripHTML]
    (@HTMLText VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
    DECLARE @Start INT
    DECLARE @End INT
    DECLARE @Length INT

    -- Locate the first '<' and the first '>' that follows it
    SET @Start = CHARINDEX('<', @HTMLText)
    SET @End = CHARINDEX('>', @HTMLText, CHARINDEX('<', @HTMLText))
    SET @Length = (@End - @Start) + 1

    -- Delete one tag per pass until no complete <...> pair remains
    WHILE @Start > 0
        AND @End > 0
        AND @Length > 0
    BEGIN
        SET @HTMLText = STUFF(@HTMLText, @Start, @Length, '')
        SET @Start = CHARINDEX('<', @HTMLText)
        SET @End = CHARINDEX('>', @HTMLText, CHARINDEX('<', @HTMLText))
        SET @Length = (@End - @Start) + 1
    END

    -- Trim the leading and trailing spaces left behind
    RETURN LTRIM(RTRIM(@HTMLText))
END
GO


Test the function above like this:

SELECT dbo.udf_StripHTML('<b>UDF at SQLAuthority.com </b><br><br><a href="http://www.SQLAuthority.com">SQLAuthority.com</a>')

Result Set:

UDF at SQLAuthority.com SQLAuthority.com


Reference:
http://blog.sqlauthority.com/2007/06/16/sql-server-udf-user-defined-function-to-strip-html-parse-html-no-regular-expression/
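
Since the question also asks about C#, here is a rough sketch of the same find-and-delete loop in C#, for comparison only. The class and method names (TagLoopStripper, StripTags) are made up for this illustration and are not part of the referenced article. Note that Trim() removes all leading and trailing whitespace, while LTRIM/RTRIM remove only spaces.

```csharp
using System;

public static class TagLoopStripper
{
    // Hypothetical helper that mirrors the T-SQL loop above:
    // find '<', find the next '>', delete that span, repeat.
    public static string StripTags(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return string.Empty;
        }

        string text = html;
        int start = text.IndexOf('<');
        int end = start >= 0 ? text.IndexOf('>', start) : -1;

        // Stop as soon as there is no complete <...> pair left.
        while (start >= 0 && end >= 0)
        {
            text = text.Remove(start, end - start + 1);
            start = text.IndexOf('<');
            end = start >= 0 ? text.IndexOf('>', start) : -1;
        }

        // Trim() strips all surrounding whitespace, not only spaces.
        return text.Trim();
    }
}
```

Calling TagLoopStripper.StripTags with the sample string from the test query should give the same text as the SQL function.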
 
LVL 9

Assisted Solution

by:mimran18
mimran18 earned 83 total points
ID: 36535108
Hi,
Here you go:
http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/ccbde8aa-68da-44c0-b9b2-71bd66707eee/
 
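-- Drop the function only if it already exists, so the script also runs on a fresh database
If Object_Id(N'dbo.UDf_HTMLTags', N'FN') Is Not Null
    Drop Function [dbo].[UDf_HTMLTags]
Go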
CREATE Function [dbo].[UDf_HTMLTags]
    (@HTML varchar(Max))
    Returns varchar(Max)
As

Begin
    Declare @Start int,
        @End int,
        @Length int

    While CharIndex('<', @HTML) > 0 And CharIndex('>', @HTML, CharIndex('<', @HTML)) > 0
        Begin
        Select @Start = CharIndex('<', @HTML), 
          @End = CharIndex('>', @HTML, CharIndex('<', @HTML))
        Select @Length = (@End - @Start) + 1
        If @Length > 0
            Begin
            Select @HTML = Stuff(@HTML, @Start, @Length, '')
            End
        End

    return @HTML
End

Go
Select [dbo].[UDf_HTMLTags] ('<b>UDF at SQLAuthority.com </b><br><br><a href="http://www.SQLAuthority.com">SQLAuthority.com</a>')

 
LVL 7

Accepted Solution

by:Kishan Zunjare
Kishan Zunjare earned 84 total points
ID: 36547171
Instead of removing the HTML in SQL, you can remove it in C#.

The solution is quite simple:

1. Retrieve all the HTML tags using this pattern: <(.|\n)*?>
2. Replace them with an empty string and return the result

Here's a C# function that does this:

// Requires: using System.Text.RegularExpressions;
private string StripHTML(string htmlString)
{
    // This pattern matches each HTML tag, from '<' to the nearest '>':
    // (.|\n) -> any character or a newline
    // *?     -> zero or more occurrences, non-greedy, so each match stops at the first '>'
    string pattern = @"<(.|\n)*?>";
    return Regex.Replace(htmlString, pattern, string.Empty);
}


Or with just one line of code:

string stripped = Regex.Replace(textBox1.Text,@"<(.|\n)*?>",string.Empty);

This is a simple and powerful solution.

Hope this will work
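
Stripping tags this way still leaves character entities such as &amp; in the text. Below is a rough sketch, not a definitive implementation, that wraps the regex approach in a self-contained class and decodes entities afterwards. The names HtmlTextCleaner and StripAndDecode are hypothetical, the pattern <[^>]*> is a swapped-in alternative that should match the same spans as <(.|\n)*?>, and WebUtility.HtmlDecode assumes .NET 4.0 or later.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlTextCleaner
{
    // [^>]* also matches newlines, so tags that span several lines are covered.
    private static readonly Regex TagPattern = new Regex("<[^>]*>");

    // Hypothetical helper: remove tags first, then decode entities like &amp;.
    public static string StripAndDecode(string htmlString)
    {
        if (string.IsNullOrEmpty(htmlString))
        {
            return string.Empty;
        }

        string withoutTags = TagPattern.Replace(htmlString, string.Empty);

        // Assumes .NET 4.0+, where System.Net.WebUtility is available.
        return WebUtility.HtmlDecode(withoutTags).Trim();
    }
}
```

Decoding after stripping means an encoded bracket such as &lt; ends up in the output as a literal character instead of being treated as the start of a tag.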