Solved

Java method/function to html encode/decode a string

Posted on 2002-06-09
14
1,739 Views
Last Modified: 2013-11-23
Can anyone please tell me which class/method/function HTML-encodes or decodes a string in Java (JDK)?

e.g. every > symbol would become &gt;


Thanks in advance.

Question by:srinusimhadri
14 Comments
 
LVL 27

Expert Comment

by:rrz
ID: 7066152
Is this what you want?
URLDecoder - class java.net.URLDecoder.
Utility class for HTML form decoding.
URLDecoder() - Constructor for class java.net.URLDecoder
 
URLEncoder - class java.net.URLEncoder.
Utility class for HTML form encoding.
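
For contrast, here is a minimal sketch of what URLEncoder produces: percent (form) encoding for URLs, not HTML entities. The class name FormEncodingDemo and the helper formEncode are made up for illustration, and the sketch assumes JDK 1.4 or later for the two-argument encode method.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class FormEncodingDemo {
    // Percent-encodes a value for use in a URL query string or form body.
    static String formEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(formEncode(">= 0.5")); // prints %3E%3D+0.5
    }
}
```

The output contains %3E rather than &gt;, so this class is the wrong tool for text that will be embedded in an HTML page.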
0
 

Author Comment

by:srinusimhadri
ID: 7066192
No,

I was referring to the following:


symbol    converted to
------    ------------
>         &gt;
<         &lt;

Let's say you have the value ">= 0.5";
it should be converted to "&gt;= 0.5".

The same goes for the other characters that are not allowed directly inside HTML.


0
 
LVL 9

Expert Comment

by:Ovi
ID: 7066240
You can use HTMLEditorKit and HTMLDocument to parse the string as HTML and retrieve the "encoded" content. The Swing HTML parser itself converts special characters to HTML entities.
0
 

Author Comment

by:srinusimhadri
ID: 7066301
Please let me know in which package of the jdk I can find these classes.
0
 
LVL 9

Expert Comment

by:Ovi
ID: 7066572
For simplicity, you can use a JEditorPane (javax.swing.JEditorPane) as the component that displays the HTML (set a default HTMLDocument and a default HTMLEditorKit on it) and just call getText(), which, if the component is set up correctly, returns the HTML code for your string.

The package for handling HTML content is:
javax.swing.text.html
0
 
LVL 9

Expert Comment

by:Ovi
ID: 7066573
I will write you a short example soon.
0
 
LVL 9

Expert Comment

by:Ovi
ID: 7066605
import java.awt.*;
import java.io.*;
import javax.swing.*;
import javax.swing.event.*;
import javax.swing.text.html.*;


public class HTMLEncoder extends JFrame {
  protected JEditorPane editor;
  protected JTextArea source;
  protected HTMLEditorKit kit;
  protected HTMLDocument doc;

  public HTMLEncoder() {
    super("HTMLEncoder");
    init();
  }

  private void init() {
    Container c = getContentPane();
    c.setLayout(new GridLayout(2, 1, 5, 5));

    editor = new JEditorPane();
    kit = new HTMLEditorKit();
    doc = (HTMLDocument) kit.createDefaultDocument();
    editor.setEditorKit(kit);
    editor.setDocument(doc);
    kit.install(editor);

    source = new JTextArea();
    c.add(new JScrollPane(editor));
    c.add(new JScrollPane(source));

    doc.addDocumentListener(new DocumentAdapter());

    setSize(500, 500);
    setLocation(200, 200);
    setVisible(true);
  }
 
  public void read(String text) {
    editor.setText("");
    try {
      kit.read(new StringReader(text), doc, 0);
    } catch(Exception e) {
      System.out.println("Error while encoding!");
      e.printStackTrace();
    }
  }
 
  public String getEncodedContent() {
    return(editor.getText());
  }
 
  class DocumentAdapter implements DocumentListener {
    public void removeUpdate(DocumentEvent ce) {
      source.setText(editor.getText());
    }
    public void changedUpdate(DocumentEvent ce) {
     source.setText(editor.getText());
    }
    public void insertUpdate(DocumentEvent ce) {
      source.setText(editor.getText());
    }
  }

  public static void main(String[] args) {
    new HTMLEncoder();
  }
}
0
 
LVL 9

Accepted Solution

by:
Ovi earned 50 total points
ID: 7066632
Another option is to build the converter yourself, as in the example below. There the characters "&", "<", ">" and "\n" are replaced by their corresponding HTML: "&amp;", "&lt;", "&gt;" and "<br>".


public class test {
  public static void main(String[] args) {
    // "&" must come first so the "&" in the other entities is not escaped again;
    // "\n" comes last so the "<br>" it produces is not escaped either
    String[] tokens = new String[] {"&", "<", ">", "\n"};
    String[] replacement = new String[] {"&amp;", "&lt;", "&gt;", "<br>"};

    String text = "This> is <a test\n text >for< a simple toHTML convertor.";
    System.out.println("Original : " + text);
    StringBuffer sb = new StringBuffer(text);
    for(int i = 0; i<tokens.length; i++) {
      int idx = 0;
      while((idx = sb.indexOf(tokens[i], idx)) != -1) {
        sb.replace(idx, idx + tokens[i].length(), replacement[i]);
        // skip past the inserted text so it is never rescanned
        idx += replacement[i].length();
      }
    }
    text = sb.toString();
    System.out.println("Processed : " + text);
  }
}
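
The same idea can also be done in one pass over the input, which avoids any dependence on the order of the replacements. This is only a sketch: the class HtmlEscapeSketch and the method escapeHtml are hypothetical names (nothing from the JDK), it assumes the input is plain text rather than HTML source, and it uses StringBuilder, which needs Java 5 or later.

```java
class HtmlEscapeSketch {
    // Escapes the characters that are significant in HTML text and
    // double-quoted attribute values. Each input character is looked at
    // exactly once, so already-written entities are never escaped again.
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;");  break;
                case '<': out.append("&lt;");   break;
                case '>': out.append("&gt;");   break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeHtml(">= 0.5")); // prints &gt;= 0.5
    }
}
```

If line breaks should show up in the rendered page, a case for '\n' that appends "<br>" could be added to the switch in the same way.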
0
 
LVL 92

Expert Comment

by:objects
ID: 7068619
Ovi,

Have to remember that you don't want to convert *all* '<', '>' etc.
0
 

Author Comment

by:srinusimhadri
ID: 7068821
Ovi,

Thank you very much for the suggestions.

I could write my own program to do that,
and I know the logic for it,
but I don't want to write it on my own.
Since URLEncoder already exists, I thought an HTMLEncoder would exist too.

Extending JFrame is not possible for me, because I want a plain non-interactive method.

Please let me know of any other way.
0
 
LVL 9

Expert Comment

by:Ovi
ID: 7069129
For objects : "Any one please let me know what is the class/method/function to html encode or decode a string in java (jdk).

eg: all > symbols would become &gt " - this is the question content, so I wrote the second example with a source string that is not HTML in mind. I am aware of the problem with that algorithm when handling HTML text, but again, implementing an HTML parser was not the intention.

For srinusimhadri :
1. You can use the second example with an extended array of tokens and their replacements, but only if your original text is not HTML source.
2. You can use the first example too. If you read it with care, you will see that it has two methods, read(String) and getEncodedContent(), which are provided as a starting point for a non-graphical application. For tests, comment out the setVisible(boolean) call in the init() method, so the window never pops up on your screen. After that you can use read(), passing it your string, and getEncodedContent() to retrieve the encoded HTML. To be clear: comment out the setVisible call and put this in your main method :

  HTMLEncoder e = new HTMLEncoder();
  e.read("test< encoding");
  System.out.println("Encoded : " + e.getEncodedContent());


You could have a problem if the Swing parser has not finished its job, since the parsing is asynchronous, and you retrieve invalid content. You should solve that with synchronization.
0
 

Expert Comment

by:wh111
ID: 7074916
The only 4 characters you need to care about are "&", "\"", ">" and "<",
so:

 public static String HTMLEscape(String str) {
      String t = replaceSubString(str,"&","&amp;");
          t = replaceSubString(t,"\"","&quot;");
          t = replaceSubString(t,">","&gt;");
          t = replaceSubString(t,"<","&lt;");
          return t;
 }

 public static String replaceSubString(String str,String pattern,String replace) {
     int slen = str.length();
     int plen = pattern.length();
     int s = 0, e = 0;
     StringBuffer result = new StringBuffer(slen * 2);
     char[] chars = new char[slen];

     while ((e = str.indexOf(pattern, s)) >= 0) {
          str.getChars(s, e, chars, 0);
          result.append(chars, 0, e - s).append(replace);
          s = e + plen;
     }
     str.getChars(s, slen, chars, 0);
     result.append(chars, 0, slen - s);
     return result.toString();
 }
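
The question also asked about decoding, which nobody covered. A rough sketch of the reverse direction for the same four entities is below. HtmlUnescapeSketch and unescapeHtml are hypothetical names, and the sketch deliberately ignores every other named or numeric entity, so treat it as a starting point only.

```java
class HtmlUnescapeSketch {
    // Entity / replacement pairs handled by this sketch (an assumption:
    // only the four entities produced by the escape method above).
    private static final String[][] ENTITIES = {
        {"&lt;", "<"}, {"&gt;", ">"}, {"&quot;", "\""}, {"&amp;", "&"}
    };

    // Decodes in a single left-to-right pass, so decoded output is never
    // scanned again and "&amp;lt;" becomes "&lt;" rather than "<".
    static String unescapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            boolean matched = false;
            if (text.charAt(i) == '&') {
                for (String[] entity : ENTITIES) {
                    if (text.startsWith(entity[0], i)) {
                        out.append(entity[1]);
                        i += entity[0].length();
                        matched = true;
                        break;
                    }
                }
            }
            if (!matched) {
                // Not a known entity: copy the character through unchanged.
                out.append(text.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescapeHtml("&gt;= 0.5")); // prints >= 0.5
    }
}
```

Doing the decode with four chained replaceSubString calls would also work, but then "&amp;" has to be replaced last, otherwise "&amp;lt;" would wrongly end up as "<".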

0
 

Author Comment

by:srinusimhadri
ID: 8443201
Sorry for the delay.
Thanks for the help.
0
 
LVL 9

Expert Comment

by:Ovi
ID: 8462027
me too for the points :)
0


Question has a verified solution.
