Solved

extract text from rtf

Posted on 2014-01-10
21
933 Views
Last Modified: 2014-04-08
Hi experts

Is there a way of extracting the actual text component from rtf format as shown below:

{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil Times New Roman;}{\f1\fnil\fcharset0 Times New Roman;}}
{\colortbl ;\red0\green0\blue0;}
\viewkind4\uc1\pard\cf1\f0\fs22\par
\cf0\lang2057\b\f1\fs24\par
\par
2nd Period - (5 minutes)\cf1\lang1033\b0\f0\fs22\par
}

Open in new window


in this example, I would like to extract '2nd Period - (5 Minutes)'
0
Comment
Question by:andymacf
  • 9
  • 6
  • 3
  • +2
21 Comments
 
LVL 47

Expert Comment

by:Dale Fye (Access MVP)
ID: 39771001
You might want to try the PlainText() function.

It has been a while since I've used the Rich Text format in Access database, but that does not look like the format I remember.  Where is this data coming from?  Was it created in Access?
0
 
LVL 7

Author Comment

by:andymacf
ID: 39771072
Hi Dale

The data is stored in a SQL DB, but I use access Linked Table Manager to run queries/reports and i need to extract the text to use as field identifiers in queries and reports

Thanks
Andy
0
 
LVL 47

Expert Comment

by:Dale Fye (Access MVP)
ID: 39771150
I'm just trying to figure out where those codes came from?

Is that cut from some Word or Internet page?  It does not look similar to the Rich Text markup that Access creates when you create an RTF memo field and format data.  Because of that, I seriously doubt that the PlainText( ) function in Access will strip out all of the extraneous codes.
0
PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

 
LVL 49

Expert Comment

by:Gustav Brock
ID: 39771176
0
 
LVL 35

Expert Comment

by:PatHartman
ID: 39771188
You are going to need to figure out how to interpret the tags.  Microsoft publishes the RTF spec.  I don't have the document handy but I downloaded it a few years ago when I had to create a RTF formatted document and I actually saw a reference to it recently but I don't have time to go searching.  

Reading an RTF document will be more difficult but essentially, you need to use instr() to find the tag that starts the text and then instr() again to find the tag that ends the text.  Then you would have to find any tags in that part of the document that were used to apply formatting and get rid of them.  Good Luck.
0
 
LVL 47

Expert Comment

by:Dale Fye (Access MVP)
ID: 39771196
Gustav,  

Great info; added that to my personal knowledgebase.

What reference is required for that Rich Text Control?
0
 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 39773074
I have no idea whether this is truly RTF compliant.

If Gustav's suggestion does not work, please let us know, and describe in full how the encoding is supposed to work.  I may be able to gin up a regular expression for it.
0
 
LVL 35

Expert Comment

by:PatHartman
ID: 39773669
Hi,
I found a copy of the RTF specification if you need it.
Word-2002-RTF-Specification-Fina.doc
0
 
LVL 7

Author Comment

by:andymacf
ID: 39773916
Thank you everyone for your comments, other events took priority on Friday, will get back to this next week.

Andy
0
 
LVL 7

Author Comment

by:andymacf
ID: 39778870
Hi Gustav

How do I know if I have access to the RichText control?

Thanks
Andy
0
 
LVL 49

Expert Comment

by:Gustav Brock
ID: 39778901
Try running the code.

You probably needs to install and register the ActiveX control:

http://stackoverflow.com/questions/6335546/register-richtx32-ocx-in-windows-7-64-bit

I believe you can download it here:

http://www.ocxdump.com/download-ocx-files_new.php/ocxfiles/R/RICHTX32.OCX/6.01.9782/download.html

/gustav
0
 
LVL 7

Author Comment

by:andymacf
ID: 39778932
Thanks gustav

Do I need to call the code from within my query, can you enlighten me please?

Andy
0
 
LVL 49

Expert Comment

by:Gustav Brock
ID: 39778969
Yes. You could use the function below in an expression:

SomeFieldTxt: ConvVarRTFtoText([SomeFieldRtf])

After running the query you can clear the OCX from memory:

        ' Reset call of Rich Text ActiveXBox in query.
        Call ConvRTFtoText(vbNullString, True)

Public Function ConvVarRTFtoText( _
  ByVal varRTF As Variant, _
  Optional booSingleRun As Boolean) _
  As String

' Converts, if possible, RTF formatted variant to plain text using
' RTF ActiveX control.
' Null as varRTF returns zero length string.
' When finished, Rich Text object is removed if booSingleRun is True.

  Dim strRTF  As String
  Dim strTmp  As String
  
  On Error GoTo Err_ConvVarRTFtoText
  
  If Not IsNull(varRTF) Then
    strTmp = CStr(varRTF)
    strRTF = ConvRTFtoText(strTmp, booSingleRun)
  End If

  ConvVarRTFtoText = strRTF

Exit_ConvVarRTFtoText:
  Exit Function

Err_ConvVarRTFtoText:
  Resume Exit_ConvVarRTFtoText

End Function

Open in new window

/gustav
0
 
LVL 7

Author Comment

by:andymacf
ID: 39778973
Tried that, get a 'Undefined function 'ConvVarRTFtoText' in expression' error

Andy
0
 
LVL 7

Author Comment

by:andymacf
ID: 39778974
Realised that you had added Var into the title, removed and got the following error:

Error 429. ActiveX compenent can't create object

Andy
0
 
LVL 49

Expert Comment

by:Gustav Brock
ID: 39778981
ConvVarRTFtoText is the helper function in my message above.

The error tells that the component isn't registered. Study the previously posted link or bing/google on: Register RichTx32.ocx

/gustav
0
 
LVL 7

Author Comment

by:andymacf
ID: 39779002
Hi gustav

I will need IT dept to download file so will let you knwo how I get on.

Thanks
Andy
0
 
LVL 49

Accepted Solution

by:
Gustav Brock earned 300 total points
ID: 39779032
Thanks. Good luck!

/gustav
0
 
LVL 7

Author Closing Comment

by:andymacf
ID: 39782410
Hi gustav

Job done and works really well, thank you.

Save me a lot of pain :)

Cheers
Andy
0
 
LVL 7

Author Comment

by:andymacf
ID: 39986037
Hi gustav

I am trying to run this script against a different field in my database, and it is not bringing anything back, the only thing I notice is that the rtf data is slightly different will that be what is causing the issue:
{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0 Times New Roman;}}
\viewkind4\uc1\pard\lang2057\b\f0\fs16\par
\fs24 Please record Long Case:\b0\fs20\par
\fs16\par
\b\fs24 1st Period - Candidate & Patient (10 minutes)\lang1033\b0\fs20\par
}

Open in new window

From this I want to extract 'Please record Long Case:' and '1st period - Candidate & Patient (10 minutes)'

Can you help please?
Andy
0
 
LVL 49

Expert Comment

by:Gustav Brock
ID: 39986098
When I use WordPad it returns:
Please record Long Case:

1st Period - Candidate & Patient (10 minutes)
So I cannot tell why it shouldn't work.

/gustav
0

Featured Post

The Eight Noble Truths of Backup and Recovery

How can IT departments tackle the challenges of a Big Data world? This white paper provides a roadmap to success and helps companies ensure that all their data is safe and secure, no matter if it resides on-premise with physical or virtual machines or in the cloud.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

A simple tool to export all objects of two Access files as text and compare it with Meld, a free diff tool.
Phishing attempts can come in all forms, shapes and sizes. No matter how familiar you think you are with them, always remember to take extra precaution when opening an email with attachments or links.
Learn how to number pages in an Access report over each group. Activate two pass printing by referencing the pages property: Add code to the Page Footers OnFormat event to capture the pages as there occur for each group. Use the pages property to …
Basics of query design. Shows you how to construct a simple query by adding tables, perform joins, defining output columns, perform sorting, and apply criteria.

813 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now