Solved

extract text from rtf

Posted on 2014-01-10
21
912 Views
Last Modified: 2014-04-08
Hi experts

Is there a way of extracting the actual text component from rtf format as shown below:

{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil Times New Roman;}{\f1\fnil\fcharset0 Times New Roman;}}
{\colortbl ;\red0\green0\blue0;}
\viewkind4\uc1\pard\cf1\f0\fs22\par
\cf0\lang2057\b\f1\fs24\par
\par
2nd Period - (5 minutes)\cf1\lang1033\b0\f0\fs22\par
}

Open in new window


in this example, I would like to extract '2nd Period - (5 Minutes)'
0
Comment
Question by:andymacf
  • 9
  • 6
  • 3
  • +2
21 Comments
 
LVL 47

Expert Comment

by:Dale Fye (Access MVP)
ID: 39771001
You might want to try the PlainText() function.

It has been a while since I've used the Rich Text format in Access database, but that does not look like the format I remember.  Where is this data coming from?  Was it created in Access?
0
 
LVL 7

Author Comment

by:andymacf
ID: 39771072
Hi Dale

The data is stored in a SQL DB, but I use access Linked Table Manager to run queries/reports and i need to extract the text to use as field identifiers in queries and reports

Thanks
Andy
0
 
LVL 47

Expert Comment

by:Dale Fye (Access MVP)
ID: 39771150
I'm just trying to figure out where those codes came from?

Is that cut from some Word or Internet page?  It does not look similar to the Rich Text markup that Access creates when you create an RTF memo field and format data.  Because of that, I seriously doubt that the PlainText( ) function in Access will strip out all of the extraneous codes.
0
 
LVL 49

Expert Comment

by:Gustav Brock
ID: 39771176
0
 
LVL 34

Expert Comment

by:PatHartman
ID: 39771188
You are going to need to figure out how to interpret the tags.  Microsoft publishes the RTF spec.  I don't have the document handy but I downloaded it a few years ago when I had to create a RTF formatted document and I actually saw a reference to it recently but I don't have time to go searching.  

Reading an RTF document will be more difficult but essentially, you need to use instr() to find the tag that starts the text and then instr() again to find the tag that ends the text.  Then you would have to find any tags in that part of the document that were used to apply formatting and get rid of them.  Good Luck.
0
 
LVL 47

Expert Comment

by:Dale Fye (Access MVP)
ID: 39771196
Gustav,  

Great info; added that to my personal knowledgebase.

What reference is required for that Rich Text Control?
0
 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 39773074
I have no idea whether this is truly RTF compliant.

If Gustav's suggestion does not work, please let us know, and describe in full how the encoding is supposed to work.  I may be able to gin up a regular expression for it.
0
 
LVL 34

Expert Comment

by:PatHartman
ID: 39773669
Hi,
I found a copy of the RTF specification if you need it.
Word-2002-RTF-Specification-Fina.doc
0
 
LVL 7

Author Comment

by:andymacf
ID: 39773916
Thank you everyone for your comments, other events took priority on Friday, will get back to this next week.

Andy
0
 
LVL 7

Author Comment

by:andymacf
ID: 39778870
Hi Gustav

How do I know if I have access to the RichText control?

Thanks
Andy
0
Highfive Gives IT Their Time Back

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

 
LVL 49

Expert Comment

by:Gustav Brock
ID: 39778901
Try running the code.

You probably needs to install and register the ActiveX control:

http://stackoverflow.com/questions/6335546/register-richtx32-ocx-in-windows-7-64-bit

I believe you can download it here:

http://www.ocxdump.com/download-ocx-files_new.php/ocxfiles/R/RICHTX32.OCX/6.01.9782/download.html

/gustav
0
 
LVL 7

Author Comment

by:andymacf
ID: 39778932
Thanks gustav

Do I need to call the code from within my query, can you enlighten me please?

Andy
0
 
LVL 49

Expert Comment

by:Gustav Brock
ID: 39778969
Yes. You could use the function below in an expression:

SomeFieldTxt: ConvVarRTFtoText([SomeFieldRtf])

After running the query you can clear the OCX from memory:

        ' Reset call of Rich Text ActiveXBox in query.
        Call ConvRTFtoText(vbNullString, True)

Public Function ConvVarRTFtoText( _
  ByVal varRTF As Variant, _
  Optional booSingleRun As Boolean) _
  As String

' Converts, if possible, RTF formatted variant to plain text using
' RTF ActiveX control.
' Null as varRTF returns zero length string.
' When finished, Rich Text object is removed if booSingleRun is True.

  Dim strRTF  As String
  Dim strTmp  As String
  
  On Error GoTo Err_ConvVarRTFtoText
  
  If Not IsNull(varRTF) Then
    strTmp = CStr(varRTF)
    strRTF = ConvRTFtoText(strTmp, booSingleRun)
  End If

  ConvVarRTFtoText = strRTF

Exit_ConvVarRTFtoText:
  Exit Function

Err_ConvVarRTFtoText:
  Resume Exit_ConvVarRTFtoText

End Function

Open in new window

/gustav
0
 
LVL 7

Author Comment

by:andymacf
ID: 39778973
Tried that, get a 'Undefined function 'ConvVarRTFtoText' in expression' error

Andy
0
 
LVL 7

Author Comment

by:andymacf
ID: 39778974
Realised that you had added Var into the title, removed and got the following error:

Error 429. ActiveX compenent can't create object

Andy
0
 
LVL 49

Expert Comment

by:Gustav Brock
ID: 39778981
ConvVarRTFtoText is the helper function in my message above.

The error tells that the component isn't registered. Study the previously posted link or bing/google on: Register RichTx32.ocx

/gustav
0
 
LVL 7

Author Comment

by:andymacf
ID: 39779002
Hi gustav

I will need IT dept to download file so will let you knwo how I get on.

Thanks
Andy
0
 
LVL 49

Accepted Solution

by:
Gustav Brock earned 300 total points
ID: 39779032
Thanks. Good luck!

/gustav
0
 
LVL 7

Author Closing Comment

by:andymacf
ID: 39782410
Hi gustav

Job done and works really well, thank you.

Save me a lot of pain :)

Cheers
Andy
0
 
LVL 7

Author Comment

by:andymacf
ID: 39986037
Hi gustav

I am trying to run this script against a different field in my database, and it is not bringing anything back, the only thing I notice is that the rtf data is slightly different will that be what is causing the issue:
{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0 Times New Roman;}}
\viewkind4\uc1\pard\lang2057\b\f0\fs16\par
\fs24 Please record Long Case:\b0\fs20\par
\fs16\par
\b\fs24 1st Period - Candidate & Patient (10 minutes)\lang1033\b0\fs20\par
}

Open in new window

From this I want to extract 'Please record Long Case:' and '1st period - Candidate & Patient (10 minutes)'

Can you help please?
Andy
0
 
LVL 49

Expert Comment

by:Gustav Brock
ID: 39986098
When I use WordPad it returns:
Please record Long Case:

1st Period - Candidate & Patient (10 minutes)
So I cannot tell why it shouldn't work.

/gustav
0

Featured Post

Why You Should Analyze Threat Actor TTPs

After years of analyzing threat actor behavior, it’s become clear that at any given time there are specific tactics, techniques, and procedures (TTPs) that are particularly prevalent. By analyzing and understanding these TTPs, you can dramatically enhance your security program.

Join & Write a Comment

Experts-Exchange is a great place to come for help with solutions for your database issues, and many problems are resolved within minutes of being posted.  Others take a little more time and effort and often providing a sample database is very helpf…
Everyone has problem when going to load data into Data warehouse (EDW). They all need to confirm that data quality is good but they don't no how to proceed. Microsoft has provided new task within SSIS 2008 called "Data Profiler Task". It solve th…
Using examples as well as descriptions, and references to Books Online, show the documentation available for datatypes, explain the available data types and show how data can be passed into and out of variables.
In Microsoft Access, learn different ways of passing a string value within a string argument. Also learn what a “Type Mis-match” error is about.

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

19 Experts available now in Live!

Get 1:1 Help Now