Java method/function to HTML encode/decode a string

Can anyone tell me which class, method, or function HTML-encodes or decodes a string in Java (JDK)?

E.g., every > symbol would become &gt;


Thanks in advance.

srinusimhadri Asked:
 
Ovi Commented:
Another option is to build the converter yourself, as in the example below. It replaces the characters "<", ">", and "\n" with their HTML equivalents: "&lt;", "&gt;", and "<br>".


public class test {
  public static void main(String[] args) {
    // Each token is replaced by the entry at the same index in replacement.
    String[] tokens = new String[] {"<", ">", "\n"};
    String[] replacement = new String[] {"&lt;", "&gt;", "<br>"};

    String text = "This> is <a test\n text >for< a simple toHTML convertor.";
    System.out.println("Original : " + text);
    StringBuffer sb = new StringBuffer(text);
    for (int i = 0; i < tokens.length; i++) {
      int idx = 0;
      while ((idx = sb.indexOf(tokens[i], idx)) != -1) {
        sb.replace(idx, idx + tokens[i].length(), replacement[i]);
        // Continue after the inserted text so it is never scanned again.
        idx += replacement[i].length();
      }
    }
    text = sb.toString();
    System.out.println("Processed : " + text);
  }
}
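
The same idea can also be done in a single pass over the string, which avoids rescanning text that has already been replaced and makes it easy to cover "&" and the double quote as well. The following is only a minimal sketch: the class name HtmlEscapeSketch and the method name escapeHtml are made up for illustration and are not part of the JDK, and unlike the example above it does not turn "\n" into "<br>".

```java
public class HtmlEscapeSketch {

    /** Escapes &, <, > and " so the text can be placed inside HTML. */
    public static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;");  break;
                case '<': out.append("&lt;");   break;
                case '>': out.append("&gt;");   break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);        // everything else is copied unchanged
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeHtml(">= 0.5"));         // prints: &gt;= 0.5
        System.out.println(escapeHtml("a < b && c > d")); // prints: a &lt; b &amp;&amp; c &gt; d
    }
}
```

Because each input character is looked at exactly once, the order of the replacements does not matter here.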
 
rrz Commented:
Is this what you want?
URLDecoder - class java.net.URLDecoder.
Utility class for HTML form decoding.
URLDecoder() - Constructor for class java.net.URLDecoder
 
URLEncoder - class java.net.URLEncoder.
Utility class for HTML form encoding.
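
For comparison, here is a small sketch of what these two classes produce. They perform percent-encoding for form data and URLs, so ">" becomes "%3E" rather than an HTML entity. The class name UrlEncodingDemo and the helper methods encode and decode are made up for this illustration; only URLEncoder.encode and URLDecoder.decode come from the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class UrlEncodingDemo {

    /** Percent-encodes a string as application/x-www-form-urlencoded data. */
    public static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encode(). */
    public static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = encode(">= 0.5");
        System.out.println(encoded);         // prints: %3E%3D+0.5
        System.out.println(decode(encoded)); // prints: >= 0.5
    }
}
```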
 
srinusimhadri (Author) Commented:
No,

I was referring to the following:


symbol    converted to
------    ------------
>         &gt;
<         &lt;

Say you have the value ">= 0.5".
It should be converted to "&gt;= 0.5",

and likewise for any other characters that are not allowed directly inside HTML.



 
Ovi Commented:
You can use HTMLEditorKit and HTMLDocument to parse the string as HTML and then retrieve the "encoded" content. The Swing HTML parser itself converts special characters to HTML entities.
 
srinusimhadri (Author) Commented:
Please let me know in which package of the jdk I can find these classes.
 
Ovi Commented:
For simplicity, use a JEditorPane (javax.swing.JEditorPane) as the component that displays the HTML: give it a default HTMLDocument and a default HTMLEditorKit, then call getText(). If the component is set up correctly, getText() returns the HTML code for your string.

The package for handling HTML content is:
javax.swing.text.html
 
Ovi Commented:
I will write you a short example soon.
 
Ovi Commented:
import java.awt.*;
import java.io.*;
import javax.swing.*;
import javax.swing.event.*;
import javax.swing.text.html.*;


public class HTMLEncoder extends JFrame {
  protected JEditorPane editor;
  protected JTextArea source;
  protected HTMLEditorKit kit;
  protected HTMLDocument doc;

  public HTMLEncoder() {
    super("HTMLEncoder");
    init();
  }

  private void init() {
    Container c = getContentPane();
    c.setLayout(new GridLayout(2, 1, 5, 5));

    editor = new JEditorPane();
    kit = new HTMLEditorKit();
    doc = (HTMLDocument) kit.createDefaultDocument();
    editor.setEditorKit(kit);
    editor.setDocument(doc);
    kit.install(editor);

    source = new JTextArea();
    c.add(new JScrollPane(editor));
    c.add(new JScrollPane(source));

    doc.addDocumentListener(new DocumentAdapter());

    setSize(500, 500);
    setLocation(200, 200);
    setVisible(true);
  }
 
  public void read(String text) {
    editor.setText("");
    try {
      kit.read(new StringReader(text), doc, 0);
    } catch(Exception e) {
      System.out.println("Error while encoding!");
      e.printStackTrace();
    }
  }
 
  public String getEncodedContent() {
    return(editor.getText());
  }
 
  class DocumentAdapter implements DocumentListener {
    public void removeUpdate(DocumentEvent ce) {
      source.setText(editor.getText());
    }
    public void changedUpdate(DocumentEvent ce) {
      source.setText(editor.getText());
    }
    public void insertUpdate(DocumentEvent ce) {
      source.setText(editor.getText());
    }
  }

  public static void main(String[] args) {
    new HTMLEncoder();
  }
}
 
objects Commented:
Ovi,

Remember that you don't want to convert *all* '<', '>', etc.
 
srinusimhadri (Author) Commented:
Ovi,

Thank you very much for the suggestions.

I could write my own program to do this, and I know the logic for it,
but I don't want to write it myself.
Since URLEncoder already exists, I thought an HTML encoder would exist as well.

Extending JFrame is not possible for me, because I want a plain, non-interactive method.

Please let me know if there is any other way.
 
Ovi Commented:
For objects: "Any one please let me know what is the class/method/function to html encode or decode a string in java (jdk).

eg: all > symbols would become &gt " - this is the content of the question, so I wrote the second example (the token-replacement one) on the assumption that the source string is not HTML. I am aware of the problem this algorithm has with HTML text, but, again, I did not intend to implement an HTML parser.

For srinusimhadri:
1. You can use the second example with an extended array of tokens and replacements, but only if your original text is not HTML source.
2. You can also use the first example, the HTMLEncoder class. If you read it carefully, you will see that it has two methods, read(String) and getEncodedContent(), which are provided as a starting point for a non-graphical application. For testing, comment out the setVisible(boolean) call in the init() method so that the window no longer pops up on your screen. After that, pass your string to read() and call getEncodedContent() to retrieve the encoded HTML. To be clear: comment out the setVisible call and put this in your main method:

  HTMLEncoder e = new HTMLEncoder();
  e.read("test< encoding");
  System.out.println("Encoded : " + e.getEncodedContent());


You could have a problem if the Swing parser has not finished its job: the parsing is asynchronous, so you may retrieve invalid content. You should solve that with synchronization.
 
wh111 Commented:
The only four characters you need to care about are "&", "\"", ">", and "<".
So...

 public static String HTMLEscape(String str) {
     // "&" must be replaced first; otherwise the "&" of the entities
     // inserted by the later calls would be escaped a second time.
     String t = replaceSubString(str, "&", "&amp;");
     t = replaceSubString(t, "\"", "&quot;");
     t = replaceSubString(t, ">", "&gt;");
     t = replaceSubString(t, "<", "&lt;");
     return t;
 }

 // Returns str with every occurrence of pattern replaced by replace.
 public static String replaceSubString(String str, String pattern, String replace) {
     int slen = str.length();
     int plen = pattern.length();
     int s = 0, e = 0;
     StringBuffer result = new StringBuffer(slen * 2);
     char[] chars = new char[slen];

     // Copy the text between matches, then append the replacement.
     while ((e = str.indexOf(pattern, s)) >= 0) {
         str.getChars(s, e, chars, 0);
         result.append(chars, 0, e - s).append(replace);
         s = e + plen;
     }
     // Copy whatever follows the last match.
     str.getChars(s, slen, chars, 0);
     result.append(chars, 0, slen - s);
     return result.toString();
 }
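
The question also asked about decoding, which the code above does not cover. A minimal sketch of the reverse direction might look like the following. The class name HtmlUnescapeSketch and the method name unescapeHtml are made up for illustration, and the sketch assumes that only the four entities produced above need to be handled: it knows nothing about numeric references such as &#62; or about other named entities.

```java
public class HtmlUnescapeSketch {

    /** Reverses the four replacements: &lt; &gt; &quot; and &amp;. */
    public static String unescapeHtml(String text) {
        return text.replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   // "&amp;" goes last so that "&amp;lt;" decodes to the
                   // literal text "&lt;" and not all the way to "<".
                   .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        System.out.println(unescapeHtml("&gt;= 0.5"));  // prints: >= 0.5
        System.out.println(unescapeHtml("&amp;lt;"));   // prints: &lt;
    }
}
```

The order is the mirror image of the encoding side: "&" is escaped first when encoding and restored last when decoding.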

 
srinusimhadri (Author) Commented:
Sorry for the delay.
Thanks for the help.
 
Ovi Commented:
Thanks from me too, for the points :)
Question has a verified solution.
