Solved

How to remove RTF code from a string?

Posted on 2011-02-17
Last Modified: 2012-05-11
Hello Experts,

I've been trying for some time now to remove all RTF codes from a file using Java. I've tried different approaches, none to my satisfaction.

Thank You,
AshDash
 
LVL 86

Expert Comment

by:CEHJ
ID: 34916103
Why do you want to - why not just ignore them?
 

Author Comment

by:AshDash
ID: 34916149
Well, I do need to. We generate reports from the elements that make up these files, and those elements contain RTF codes. The generated report therefore contains the unwanted RTF codes, which make no sense there. So I thought of writing Java code to parse the files and remove all the RTF codes. Can you help?

What do you mean by ignoring them? I don't think I can in this case.
 
LVL 86

Accepted Solution

by:
CEHJ earned 500 total points
ID: 34916921
This is a rough and ready way of ignoring it:


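    import java.io.FileInputStream;
    import java.io.InputStream;
    import javax.swing.text.DefaultStyledDocument;
    import javax.swing.text.Document;
    import javax.swing.text.rtf.RTFEditorKit;

    // Parses the RTF file at the given path and returns only its plain text.
    public static String getPlain(String path) throws Exception {
        RTFEditorKit kit = new RTFEditorKit();
        Document doc = new DefaultStyledDocument();
        // try-with-resources closes the stream even if parsing fails
        try (InputStream in = new FileInputStream(path)) {
            kit.read(in, doc, 0);
        }
        return doc.getText(0, doc.getLength());
    }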

 
LVL 92

Expert Comment

by:objects
ID: 34920094
try this:

http://helpdesk.objects.com.au/java/how-do-i-extract-just-the-text-form-a-html-document-ie-strip-out-all-the-html-tags

just replace

EditorKit editorKit = new HTMLEditorKit();

with:

EditorKit editorKit = new RTFEditorKit();
 
LVL 92

Expert Comment

by:objects
ID: 34920103
Why don't you just generate the reports in a different format?
 

Author Comment

by:AshDash
ID: 34922699
Thank you for your comments experts.

@CEHJ
I can see that your code works fine for RTF files, but my case is different. I do not have RTF files. The files are in a different format (consider .txt) and contain RTF codes.

Ideally, the solution would remove all RTF codes from an ASCII string or file. If we can do that, I think my problem will be solved. Do we need to use regular expressions, or can we still achieve this using RTFEditorKit?

@objects
Though I have not tried your solution yet, I believe the same constraint applies, since I have plain text files with RTF codes in them.

Even if I try to generate the report in RTF format, a bug in the tool causes the RTF codes to appear as-is in the generated report. Hence I am forced to think of a workaround.

Thank you both in advance for further guidance and advice. Please help.
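
For text that is already in memory rather than in a .rtf file, the same RTFEditorKit approach can be fed from a string instead of a FileInputStream. The following is only a minimal sketch: the class name RtfStrings and the method toPlainText are hypothetical, and it assumes each string is a complete RTF document (starting with {\rtf1 ...}) whose characters fit a single-byte encoding. Isolated RTF fragments embedded in otherwise plain text may not parse this way.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.Document;
import javax.swing.text.rtf.RTFEditorKit;

// Hypothetical helper: runs an in-memory RTF string through RTFEditorKit.
class RtfStrings {

    // Returns the plain text of an RTF document held in a String.
    // Assumption: the input is a whole RTF document, not a loose fragment.
    static String toPlainText(String rtf) {
        RTFEditorKit kit = new RTFEditorKit();
        Document doc = new DefaultStyledDocument();
        // RTF is normally 7-bit ASCII, so a single-byte charset is assumed here.
        byte[] bytes = rtf.getBytes(StandardCharsets.ISO_8859_1);
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            kit.read(in, doc, 0);
            return doc.getText(0, doc.getLength());
        } catch (IOException | BadLocationException e) {
            throw new IllegalArgumentException("Input could not be parsed as RTF", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toPlainText("{\\rtf1\\ansi Hello World}"));
    }
}
```

The returned text may carry trailing whitespace from the parser, so trimming the result before putting it in a report is probably sensible.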
 
 

Author Comment

by:AshDash
ID: 35239245
If a better regular expression solution in Java can be provided, I would appreciate it.
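
No regular expression was posted in this thread, so the following is only a rough sketch of what one could look like. The class name RtfRegexStripper and the method strip are hypothetical. It handles the common token shapes (control words such as \b0, hex escapes such as \'e9, control symbols, and group braces), maps \par and \line to newlines, and keeps the escaped literals \\, \{ and \}. It does not understand RTF groups, so text inside header groups such as a font table would leak into the output, and hex or Unicode escapes are dropped rather than decoded.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: a regex-based approximation, not a real RTF parser.
class RtfRegexStripper {

    // One alternation scanned left to right, so an escaped backslash is
    // consumed before the letters after it can be mistaken for a control word.
    private static final Pattern TOKEN = Pattern.compile(
          "\\\\([\\\\{}])"                  // group 1: escaped literal \\ \{ \}
        + "|\\\\'[0-9a-fA-F]{2}"            // hex escape such as \'e9 (dropped)
        + "|\\\\([a-zA-Z]+)(?:-?\\d+)? ?"   // group 2: control word, optional number, one delimiter space
        + "|\\\\[^a-zA-Z]"                  // other control symbols such as \~ or \*
        + "|[{}]");                         // group braces

    static String strip(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = TOKEN.matcher(input);
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start());   // copy plain text before the token
            if (m.group(1) != null) {
                out.append(m.group(1));           // keep the escaped character itself
            } else if ("par".equals(m.group(2)) || "line".equals(m.group(2))) {
                out.append('\n');                 // paragraph or line break
            }
            last = m.end();
        }
        out.append(input, last, input.length());  // copy the remaining tail
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("{\\rtf1\\ansi Hello \\b bold\\b0  text\\par Second line}"));
    }
}
```

Because the pattern works token by token, it can be applied to plain text that merely contains scattered RTF codes, which is the case described above. Any legitimate backslashes or braces in that text would be treated as RTF as well.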
 
LVL 86

Expert Comment

by:CEHJ
ID: 35239257
>>The files are in a different format (consider .txt) which contains RTF codes.


Please post some examples
 
LVL 86

Expert Comment

by:CEHJ
ID: 35399034
:)