Solved

Java method/function to html encode/decode a string

Posted on 2002-06-09
14
1,632 Views
Last Modified: 2013-11-23
Could anyone please let me know which class/method/function HTML-encodes or decodes a string in Java (JDK)?

e.g. all > symbols would become &gt;


Thanks in advance.

Question by:srinusimhadri
14 Comments
 
LVL 27

Expert Comment

by:rrz
ID: 7066152
Is this what you want?
URLDecoder - class java.net.URLDecoder.
Utility class for HTML form decoding.
URLDecoder() - Constructor for class java.net.URLDecoder
 
URLEncoder - class java.net.URLEncoder.
Utility class for HTML form encoding.
0
 

Author Comment

by:srinusimhadri
ID: 7066192
No,

I was referring to the following:


symbol    converted to
------    ------------
>         &gt;
<         &lt;

Let's say you have the value ">= 0.5".
It should be converted to "&gt;= 0.5",

and likewise for the other characters that are not allowed directly inside HTML.
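
To make the difference concrete, here is a small sketch of what URLEncoder actually produces for that value. It percent-encodes for URLs and form data rather than emitting HTML entities. The class and method names (UrlVersusHtml, urlEncode) are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class: shows that URLEncoder does URL (form) encoding,
// not HTML entity encoding.
class UrlVersusHtml {

    // Wraps URLEncoder.encode so callers need not handle the checked exception.
    static String urlEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every JVM is required to support.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Prints %3E%3D+0.5, which is not the wanted "&gt;= 0.5"
        System.out.println(urlEncode(">= 0.5"));
    }
}
```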


0
 
LVL 9

Expert Comment

by:Ovi
ID: 7066240
You can use HTMLEditorKit and HTMLDocument to parse the string as HTML and retrieve the "encoded" content. The Swing HTML support itself converts special characters to HTML entities.
0
 

Author Comment

by:srinusimhadri
ID: 7066301
Please let me know in which package of the jdk I can find these classes.
0
 
LVL 9

Expert Comment

by:Ovi
ID: 7066572
For simplicity, you should use a JEditorPane (javax.swing.JEditorPane) as the component that displays the HTML (set a default HTMLDocument and a default HTMLEditorKit on it) and just call getText(), which returns the HTML code for your string if the component is set up correctly.

The package for handling html content is :
javax.swing.text.html;
0
 
LVL 9

Expert Comment

by:Ovi
ID: 7066573
I will write you a short example soon.
0
 
LVL 9

Expert Comment

by:Ovi
ID: 7066605
import java.awt.*;
import java.io.*;
import javax.swing.*;
import javax.swing.event.*;
import javax.swing.text.html.*;


public class HTMLEncoder extends JFrame {
  protected JEditorPane editor;
  protected JTextArea source;
  protected HTMLEditorKit kit;
  protected HTMLDocument doc;

  public HTMLEncoder() {
    super("HTMLEncoder");
    init();
  }

  private void init() {
    Container c = getContentPane();
    c.setLayout(new GridLayout(2, 1, 5, 5));

    editor = new JEditorPane();
    kit = new HTMLEditorKit();
    doc = (HTMLDocument) kit.createDefaultDocument();
    editor.setEditorKit(kit);
    editor.setDocument(doc);
    kit.install(editor);

    source = new JTextArea();
    c.add(new JScrollPane(editor));
    c.add(new JScrollPane(source));

    doc.addDocumentListener(new DocumentAdapter());

    setSize(500, 500);
    setLocation(200, 200);
    setVisible(true);
  }
 
  public void read(String text) {
    editor.setText("");
    try {
      kit.read(new StringReader(text), doc, 0);
    } catch(Exception e) {
      System.out.println("Error while encoding!");
      e.printStackTrace();
    }
  }
 
  public String getEncodedContent() {
    return(editor.getText());
  }
 
  class DocumentAdapter implements DocumentListener {
    public void removeUpdate(DocumentEvent ce) {
      source.setText(editor.getText());
    }
    public void changedUpdate(DocumentEvent ce) {
     source.setText(editor.getText());
    }
    public void insertUpdate(DocumentEvent ce) {
      source.setText(editor.getText());
    }
  }

  public static void main(String[] args) {
    new HTMLEncoder();
  }
}
0
 
LVL 9

Accepted Solution

by:
Ovi earned 50 total points
ID: 7066632
Another option is to build the parser yourself, as in the example below. There the characters "<", ">" and "\n" are replaced by their corresponding HTML: "&lt;", "&gt;" and "<br>".


public class test {
  public static void main(String[] args) {
    // Each token is replaced by the entry at the same index in replacement.
    String[] tokens = new String[] {"<", ">", "\n"};
    String[] replacement = new String[] {"&lt;", "&gt;", "<br>"};

    String text = "This> is <a test\n text >for< a simple toHTML convertor.";
    System.out.println("Original : " + text);
    StringBuffer sb = new StringBuffer(text);
    for(int i = 0; i<tokens.length; i++) {
      int idx = 0;
      while((idx = sb.indexOf(tokens[i], idx)) != -1) {
        sb.replace(idx, idx + tokens[i].length(), replacement[i]);
        // Continue after the inserted text so a replacement is never rescanned.
        idx += replacement[i].length();
      }
    }
    text = sb.toString();
    System.out.println("Processed : " + text);
  }
}
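
For comparison, the same idea can be written as a single pass over the characters, so the order of replacements no longer matters and nothing is scanned twice. This is only a sketch: the class name HtmlEscaper is made up, and it also escapes "&" and the double quote, which the loop above does not handle.

```java
// Hypothetical helper: escapes the four characters that are special in
// HTML text and attribute values, in one pass over the input.
class HtmlEscaper {

    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;");  break;
                case '<': out.append("&lt;");   break;
                case '>': out.append("&gt;");   break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints &gt;= 0.5
        System.out.println(escape(">= 0.5"));
    }
}
```

Because every input character is looked at exactly once, an "&" that belongs to an entity just written can never be escaped a second time.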
0
 
LVL 92

Expert Comment

by:objects
ID: 7068619
Ovi,

Have to remember that you don't want to convert *all* '<', '>' etc.
0
 

Author Comment

by:srinusimhadri
ID: 7068821
Ovi,

Thank you very much for the suggestions.

Actually, I could also write my own program to do that,
and I know the logic for it,
but I don't want to write it on my own.
Since URLEncoder is already there, I thought an HTML encoder would be there too.

Extending JFrame is not possible for me, because I want a plain non-interactive method.

Please let me know of any other way.
0
 
LVL 9

Expert Comment

by:Ovi
ID: 7069129
For objects : "Any one please let me know what is the class/method/function to html encode or decode a string in java (jdk).

eg: all > symbols would become &gt " - this is the question content, so I wrote the second example with the assumption that the source string is not HTML. I am aware of the algorithm's problem when handling HTML text, but again, implementing an HTML parser was not the intention.

For srinusimhadri :
1. You can use the second example with an extended token array and the matching replacements, but only if your original text is not HTML source.
2. You can use the first example too. If you read it with care, you will see that it has two methods, read(String) and getEncodedContent(), which are provided as a starting point for a non-graphical application. For tests, you should comment out the setVisible(boolean) call in the init() method, so the window never pops up on your screen. After that you can use the read() method, passing it your string, and getEncodedContent() to retrieve the encoded HTML. To be clearer: comment out the setVisible call and put this in your main method :

  HTMLEncoder e = new HTMLEncoder();
  e.read("test< encoding");
  System.out.println("Encoded : " + e.getEncodedContent());


You could have a problem if the Swing parser has not finished its job, since the parsing is asynchronous, in which case you would retrieve invalid content. You can solve that with synchronization.
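
If the window is not needed at all, the kit and document can also be used without any JFrame or JEditorPane. The following is only a rough sketch under that assumption: the class name SwingHtmlEscape and the method toHtmlDocument are made up, and the result is a complete HTML document (html/head/body wrapper included) rather than just the escaped string, so the part you need still has to be extracted.

```java
import java.io.IOException;
import java.io.StringWriter;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical helper: lets the Swing HTML writer do the escaping,
// without creating any visible component.
class SwingHtmlEscape {

    static {
        // No display is needed; ask AWT to run in headless mode.
        System.setProperty("java.awt.headless", "true");
    }

    // Inserts plain text into an empty HTML document and writes it back out.
    static String toHtmlDocument(String plainText) {
        HTMLEditorKit kit = new HTMLEditorKit();
        Document doc = kit.createDefaultDocument();
        StringWriter out = new StringWriter();
        try {
            doc.insertString(0, plainText, null);
            kit.write(out, doc, 0, doc.getLength());
        } catch (BadLocationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The text appears escaped inside the generated <body>.
        System.out.println(toHtmlDocument(">= 0.5"));
    }
}
```

Everything here runs on the calling thread, so no extra synchronization is involved in this variant.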
0
 

Expert Comment

by:wh111
ID: 7074916
The only 4 characters you need to care about are "&", "\"", ">" and "<",
so...

 public static String HTMLEscape(String str) {
      String t = replaceSubString(str,"&","&amp;");
          t = replaceSubString(t,"\"","&quot;");
          t = replaceSubString(t,">","&gt;");
          t = replaceSubString(t,"<","&lt;");
          return t;
 }

 public static String replaceSubString(String str,String pattern,String replace) {
     int slen = str.length();
     int plen = pattern.length();
     int s = 0, e = 0;
     StringBuffer result = new StringBuffer(slen * 2);
     char[] chars = new char[slen];

     while ((e = str.indexOf(pattern, s)) >= 0) {
          str.getChars(s, e, chars, 0);
          result.append(chars, 0, e - s).append(replace);
          s = e + plen;
     }
     str.getChars(s, slen, chars, 0);
     result.append(chars, 0, slen - s);
     return result.toString();
 }
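
The question also asks about decoding, which none of the replies cover. A minimal sketch for the reverse direction, limited to the same four entities, might look like the following. The class name HtmlUnescaper is made up, and numeric or other named entities are deliberately left untouched.

```java
// Hypothetical helper: reverses HTMLEscape for the four entities above.
class HtmlUnescaper {

    static String unescape(String html) {
        // "&amp;" is decoded last so that input such as "&amp;lt;"
        // becomes "&lt;" and is not decoded a second time to "<".
        return html.replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        // Prints >= 0.5
        System.out.println(unescape("&gt;= 0.5"));
    }
}
```

This mirrors HTMLEscape, which encodes "&" first. Decoding has to undo that step last for the round trip to be exact.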

0
 

Author Comment

by:srinusimhadri
ID: 8443201
Sorry for the delay.
Thanks for the help.
0
 
LVL 9

Expert Comment

by:Ovi
ID: 8462027
me too for the points :)
0
