How to remove RTF code from a string?

Posted on 2011-02-17
Last Modified: 2012-05-11
Hello Experts,

I've been trying for some time now to remove all RTF codes from a file using Java. I've tried different approaches, none to my satisfaction.

Thank You,
AshDash
Question by:AshDash
 
LVL 86

Expert Comment

by:CEHJ
ID: 34916103
Why do you want to - why not just ignore them?
 

Author Comment

by:AshDash
ID: 34916149
Well, I do have the need. We are generating reports for the elements that make up the files, and those files contain RTF codes. My generated report now includes these unwanted RTF codes, which make no sense in the report. So I thought of writing Java code to parse the files and remove all the RTF codes. Can you help?

What do you mean by ignoring them? I guess I cannot in this case.
 
LVL 86

Accepted Solution

by:CEHJ (earned 500 total points)
ID: 34916921
This is a rough and ready way of ignoring it:


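import java.io.FileInputStream;
import java.io.InputStream;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.Document;
import javax.swing.text.rtf.RTFEditorKit;

// Parses the RTF file at the given path and returns only its text content.
public static String getPlain(String path) throws Exception {
    RTFEditorKit kit = new RTFEditorKit();
    Document doc = new DefaultStyledDocument();
    // try-with-resources closes the stream even if parsing fails
    try (InputStream in = new FileInputStream(path)) {
        kit.read(in, doc, 0);
    }
    return doc.getText(0, doc.getLength());
}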

 
LVL 92

Expert Comment

by:objects
ID: 34920094
try this:

http://helpdesk.objects.com.au/java/how-do-i-extract-just-the-text-form-a-html-document-ie-strip-out-all-the-html-tags

just replace

EditorKit editorKit = new HTMLEditorKit();

with:

EditorKit editorKit = new RTFEditorKit();
 
LVL 92

Expert Comment

by:objects
ID: 34920103
Why don't you just generate the reports in a different format?

Author Comment

by:AshDash
ID: 34922699
Thank you for your comments, experts.

@CEHJ
I can see that your code works fine for RTF files, but my case is different. I do not have RTF files. The files are in a different format (consider .txt) and contain RTF codes.

An ideal solution in this case would remove all RTF codes from an ASCII string or file. If we can do this, I think my problem will be solved. Do we need to use regular expressions, or can we still achieve this using the RTFEditorKit?

@objects
Though I have not tried executing your solution yet, I believe the same constraint discussed above would apply, since I have plain-text files with RTF codes in them.

Even if I try to generate the report in RTF format, a bug in the tool causes the RTF codes to appear as-is in my generated report. Hence I am forced to think of a workaround.

Thank you both in advance for further guidance and advice. Please help.
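
On the RTFEditorKit half of that question: the file extension should not matter to the kit, since it reads from any InputStream. Below is a minimal sketch of feeding it a String instead of a file. The class name RtfStrings and the method name toPlainText are made up for illustration, and the sketch assumes the string holds one complete RTF document (starting with {\rtf1 and ending with the matching closing brace). Loose RTF fragments mixed into ordinary text would probably not parse this way.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.Document;
import javax.swing.text.rtf.RTFEditorKit;

// Hypothetical helper, not part of the original thread.
final class RtfStrings {

    // Assumes 'rtf' is a complete RTF document such as "{\rtf1\ansi ...}".
    static String toPlainText(String rtf) throws IOException, BadLocationException {
        RTFEditorKit kit = new RTFEditorKit();
        Document doc = new DefaultStyledDocument();
        // RTF is an 8-bit format, so a single-byte charset is used for the bytes.
        byte[] bytes = rtf.getBytes(StandardCharsets.ISO_8859_1);
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            kit.read(in, doc, 0);
        }
        // The document now holds only the text; the control codes were consumed by the parser.
        return doc.getText(0, doc.getLength());
    }
}
```

Calling RtfStrings.toPlainText("{\\rtf1\\ansi Hello \\b World\\b0 }") should give back the text without the \b markers. Whether this fits depends on what the files really look like, which is why sample input would help.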
 
LVL 92

Expert Comment

by:objects
ID: 34922854
 

Author Comment

by:AshDash
ID: 35239245
If a better regular expression solution in Java can be provided, I would appreciate it.
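
For reference, a rough regular-expression sketch of the kind asked for here. It is a heuristic under stated assumptions, not a full RTF parser: the class name RtfRegexStripper, the pattern RTF_TOKEN and the method strip are all hypothetical names. It drops control words (such as \b or \fs24), control symbols, \'hh hex escapes and group braces, and keeps the escaped literals \\, \{ and \}. It does not decode hex or Unicode escapes, and it does not skip destination groups such as font tables, so font names and similar text inside those groups would leak into the output.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original thread.
final class RtfRegexStripper {

    private static final Pattern RTF_TOKEN = Pattern.compile(
          "\\\\([\\\\{}])"             // group 1: escaped literal \\, \{ or \}
        + "|\\\\'[0-9a-fA-F]{2}"       // hex escape such as \'e9 (dropped, not decoded)
        + "|\\\\[a-zA-Z]+-?[0-9]* ?"   // control word, optional numeric parameter, one delimiting space
        + "|\\\\[^a-zA-Z]"             // any other control symbol such as \~ or \*
        + "|[{}]");                    // unescaped group braces

    static String strip(String input) {
        Matcher m = RTF_TOKEN.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Keep the character behind an escaped literal; remove every other token.
            String literal = m.group(1);
            m.appendReplacement(out, literal == null ? "" : Matcher.quoteReplacement(literal));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For example, RtfRegexStripper.strip("Name: {\\b John} \\i Smith\\i0") returns "Name: John Smith". A single pass with one alternation is used rather than a chain of replaceAll calls so that escaped braces can be told apart from group braces. Note that ordinary text containing backslashes or braces (Windows paths, for instance) would be damaged by this, so it only suits input where those characters always belong to RTF codes.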
 
LVL 86

Expert Comment

by:CEHJ
ID: 35239257
>>The files are in a different format (consider .txt) which contains RTF codes.


Please post some examples
 
LVL 86

Expert Comment

by:CEHJ
ID: 35399034
:)