Solved

Stripping RTF characters out of a varchar(max) sql server database field

Posted on 2012-03-14
16
3,361 Views
Last Modified: 2012-03-23
I have a varchar(max) field in my SQL Server database table.  My field has RTF characters that I need to strip out so that I can display the text only.

My SQL Server 2008 is on a windows 2003 server.

I created the function below which I got from a post named RTF to Text conversion on this website.

create function dbo.RTF2TXT(@in varchar(8000)) RETURNS  varchar(8000) AS
BEGIN

DECLARE @object int
DECLARE @hr int
DECLARE @out varchar(8000)

-- Create an object that points to the SQL Server
EXEC @hr = sp_OACreate 'RICHTEXT.RichtextCtrl', @object OUT
EXEC @hr = sp_OASetProperty @object, 'TextRTF', @in
EXEC @hr = sp_OAGetProperty @object, 'Text', @out OUT
EXEC @hr = sp_OADestroy @object
return @out

END
GO


I also downloaded Richtx32.ocx and registered it using
regsvr32 c:\windows\system32\Richtx32.ocx  (got a message that it registered successfully).

However when I test the function using

select dbo.RTF2TXT('{\rtf1\ansi\ansicpg1252\uc1 aaa}')

I get a null.

Can you please tell me how to get this to work.

Thanks,
SISDalia
0
Comment
Question by:SISDalia
  • 7
  • 6
16 Comments
 
LVL 15

Expert Comment

by:Simon Ball
ID: 37723879
can you post a sample of one data item too?
0
 

Author Comment

by:SISDalia
ID: 37725354
{\rtf1\ansi\deff0\deftab720{\fonttbl{\f0\fswiss MS Sans Serif;}{\f1\froman\fcharset2 Symbol;}{\f2\fswiss MS Sans Serif;}{\f3\froman Times New Roman;}{\f4\froman Times New Roman;}{\f5\froman\fprq2 Times New Roman;}} {\colortbl\red0\green0\blue0;} \deflang1033\pard\plain\f5\fs24 I am pleased to offer this Medical Student Performance Evaluation for Erica Lubliner as a candidate for your post-graduate training program. Erica was raised in Anaheim, California. Her father died during her early childhood leaving her mother as the primary provider. Her mother, an immigrant from Mexico, faced many challenges as she tried to overcome financial, cultural and linguistic barriers while working as a factory worker. Erica served as an interpreter for her family and neighbors when they faced barriers to healthcare. Fortunately, her experiences as a first-generation student did not dissuade her from obtaining her educational goals and she was mentored by teachers who provided opportunities for her leadership and personal development. In high school, Erica excelled academically. She also held several leadership roles as a class officer and in athletics as captain of the varsity volleyball team, all while working to contribute to her family's household expenses. She is the first in her family to attend college. \par \par Erica matriculated at the University of California at Los Angeles with a double major in History and Women's Studies. She actively advocated for underserved students and became involved with the Scholarship Association for English as a Second Language Students (SAESL) during college. As a former ESL student, she identified with the challenges faced by students learning English for the first time. Erica was a College Advisor at Wilson High School in East Los Angeles, where she assisted and mentored students. She held several leadership positions and was the co-chair of the Chicanos for Creative Medicine organization and the undergraduate representative for the Latino Medical Student Association. \par \par Erica has continued to demonstrate excellent leadership abilities during medical school. She has actively participated in and organized activities for the Latino Medical Student Association, including being a Co-Chair, developing a Latino Health Lecture series, being a member of the board for the SAESL, and currently organizing a Latina women's professional group. One of her most cherished activities has been as editor of the BEAT Art and Literary Magazine. Erica has also continued to devote herself to community service by volunteering at health fairs, mentoring high school and premedical students and raising funds for scholarships. \par \par Erica's senior research thesis is an investigation of female medical student perspectives on how gendered interactions affect their medical education, specifically women of color. \par \plain\f3\fs24 \par }
0
 
LVL 96

Expert Comment

by:Bob Learned
ID: 37734839
If you are using this data in ASP.NET, that would be a better place to handle this, before displaying.  You might be able to use a System.Windows.Forms.RichTextBox, as an unsited worker.  Set the Rtf property to the rich-text from the database, and then get the plain text from the Text property.
0
 

Author Comment

by:SISDalia
ID: 37738340
Unfortunately that won't work as this is a web application not a windows application.

Is their an equivalent control on an asp.net page? If there is what is it?

Thanks,
SISDalia
0
 
LVL 96

Expert Comment

by:Bob Learned
ID: 37739153
You can add a reference to System.Windows.Forms to the web site, and use the RichTextBox control, since it is not being added to any page.  It is basically a worker class at that point.  I have used the System.Windows.Forms.TextBox control in a similar manner in the past.
0
 

Author Comment

by:SISDalia
ID: 37739502
Thanks, I added a reference to the system.windows.forms and an imports system.windows.forms to the top of my asp.net page.  How do I get the RichTextBox control to show up in my toolbox? or how do i reference it?
0
 
LVL 96

Accepted Solution

by:
Bob Learned earned 500 total points
ID: 37739523
You can add a Imports System.Windows.Forms or using System.Windows.Forms; to your class, and add code like this C# example (untested code typed into comment block):

using System.Windows.Forms;

namespace My.WebSite
{

   public class RichTextService
   {

       public static string StripOutRichText(string richText)
       {
             RichTextBox richTextBox = new RichTextBox();
             richTextBox.Rtf = richText;
             return richTextBox.Text;
       }

    }

}

Open in new window

0
How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

 

Author Comment

by:SISDalia
ID: 37750010
Just another followup question to this:  After the rtf gets stripped out and the new clean field is saved to the database, the next time the load function runs I get this error,
"ERROR: File format is not valid".  I suspect that when it is passed a field with no rtfs that it returns this error. How can I check with an if statement to see if the string being passed has rtf characters in it or not?

Thanks,
SISDalia
0
 
LVL 96

Expert Comment

by:Bob Learned
ID: 37752130
" next time the load function runs I get this error"

Can you explain what that means in more detail?

There is a format to RTF, and you should be able to look for patterns, like {\rtf1\.
0
 

Author Comment

by:SISDalia
ID: 37753345
The RTF characters get stripped out of the field.

So next time my application tries to load that same record,  the new data has no RTF characters in it.

The next time the form loads I automatically call the StripOutRichText function again because other records will have RTF characters...but for the ones that already got cleaned up the function returns that error.  

So the question is: how can I add an if statement that checks if there is RTF formatting or not? or how do I deal with the records where the RTF characters have been already stripped out.

Thanks,
SISDalia

Function StripOutRichText(ByVal strRTF As String)

   Dim RichTextBox As New RichTextBox


   RichTextBox.Rtf = strRTF  -- errors on this line for data that has no RTF.

   Return RichTextBox.Text


End Function
0
 
LVL 96

Expert Comment

by:Bob Learned
ID: 37753960
That would be something like this, I believe:

Public Shared Function StripOutRichText(ByVal inputText As String) As String
   If inputText.StartsWith("{\rtf") Then
      Dim worker As New RichTextBox()
      worker.Rtf = inputText
      Return worker.Text
   End If
   Return inputText
End Function

Open in new window

0
 

Author Comment

by:SISDalia
ID: 37757839
Thank you that fixed the problem.  

I have another related question: is there a way to just keep the \par markings?
or is it all or none?

Should I enter this as a new question?
0
 
LVL 96

Expert Comment

by:Bob Learned
ID: 37757967
That would normally be a new question, but it is all or nothing in this case.
0
 

Author Comment

by:SISDalia
ID: 37758081
I just added the new question. Would you please take a look at it.  I appreciate your help.
Thanks,
Dalia
0

Featured Post

IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Join & Write a Comment

In this article we will get to know that how can we recover deleted data if it happens accidently. We really can recover deleted rows if we know the time when data is deleted by using the transaction log.
It was really hard time for me to get the understanding of Delegates in C#. I went through many websites and articles but I found them very clumsy. After going through those sites, I noted down the points in a easy way so here I am sharing that unde…
Familiarize people with the process of retrieving data from SQL Server using an Access pass-thru query. Microsoft Access is a very powerful client/server development tool. One of the ways that you can retrieve data from a SQL Server is by using a pa…
Viewers will learn how the fundamental information of how to create a table.

760 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

24 Experts available now in Live!

Get 1:1 Help Now