
  • Status: Solved
  • Priority: Medium
  • Security: Public

Search for a word in HTML

Hi all again! I need to write a function that searches for a given word on a specific web page. For example, something like this:

public boolean search (String word,String WebAddress)--->and returns the word is in the page.

I need to use "javax.swing.text.html".
Is there a way to do that, or do I need to do something different? I already have some code that retrieves all the links on a web page:

// Needs imports from java.io, java.net, java.util, javax.swing.text and javax.swing.text.html
public String[] getLinks(String webaddress) {
    List<String> result = new ArrayList<String>();

    try {
        URL url = new URL(webaddress);
        URLConnection conn = url.openConnection();
        Reader rd = new InputStreamReader(conn.getInputStream());

        // Parse the page into an HTMLDocument
        EditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        kit.read(rd, doc, 0);

        // Walk every <a> tag and collect its href attribute
        HTMLDocument.Iterator it = doc.getIterator(HTML.Tag.A);
        while (it.isValid()) {
            AttributeSet s = it.getAttributes();

            String link = (String) s.getAttribute(HTML.Attribute.HREF);
            if (link != null) {
                result.add(link);
                //gw.Insert_new_link (link);
                System.out.println(link);
            }
            it.next();
        }
    } catch (MalformedURLException e) {
        System.out.println(e);
    } catch (IOException e) {
        System.out.println(e);
    } catch (BadLocationException e) {
        System.out.println(e);
    }
    return result.toArray(new String[result.size()]);
}

Is it possible to change this code so that it searches for specific words in an HTML page? Thanks!
Asked by: ticoldam12
 
ticoldam12 (Author) commented:
<public boolean search (String word,String WebAddress)--->and returns the word is in the page.>

Sorry, I meant that it returns true if the word is in the page.
 
kiranhk commented:
You can check out this link:

http://www.edm2.com/0508/grinding.html
 
sudhakar_koundinya commented:
// This method takes a URI which can be either a file URI (e.g. file:///c:/dir/file.html)
    // or a URL (e.g. http://host.com/page.html). It returns searchString if that string
    // occurs in the document's text, and an empty string otherwise.
    public static String findText(String uriStr,String searchString) {
        final StringBuffer buf = new StringBuffer(1000);
   
        try {
            // Create an HTML document that appends all text to buf
            HTMLDocument doc = new HTMLDocument() {
                public HTMLEditorKit.ParserCallback getReader(int pos) {
                    return new HTMLEditorKit.ParserCallback() {
                        // This method is called whenever text is encountered in the HTML file
                        public void handleText(char[] data, int pos) {
                            buf.append(data);
                            buf.append('\n');
                        }
                    };
                }
            };
   
            // Create a reader on the HTML content
            URL url = new URI(uriStr).toURL();
            URLConnection conn = url.openConnection();
            Reader rd = new InputStreamReader(conn.getInputStream());
   
            // Parse the HTML
            EditorKit kit = new HTMLEditorKit();
            kit.read(rd, doc, 0);
        } catch (MalformedURLException e) {
        } catch (URISyntaxException e) {
        } catch (BadLocationException e) {
        } catch (IOException e) {
        }
   
        // Return the search string if it was found in the text, otherwise an empty string
        return buf.toString().indexOf(searchString) != -1 ? searchString : "";
    }
 
sudhakar_koundinya commented:
public static String findText(String uriStr,String searchString) {
        final StringBuffer buf = new StringBuffer(1000);
   
        try {
            // Create an HTML document that appends all text to buf
            HTMLDocument doc = new HTMLDocument() {
                public HTMLEditorKit.ParserCallback getReader(int pos) {
                    return new HTMLEditorKit.ParserCallback() {
                        // This method is called whenever text is encountered in the HTML file
                        public void handleText(char[] data, int pos) {
                            buf.append(data);
                            buf.append('\n');
                        }
                    };
                }
            };
   
            // Create a reader on the HTML content
            URL url = new URI(uriStr).toURL();
            URLConnection conn = url.openConnection();
            Reader rd = new InputStreamReader(conn.getInputStream());
   
            // Parse the HTML
            EditorKit kit = new HTMLEditorKit();
            kit.read(rd, doc, 0);
        } catch (Exception e) {
            e.printStackTrace();
        }
   
        // Return the search string if it was found in the text, otherwise an empty string
        return buf.toString().indexOf(searchString) != -1 ? searchString : "";
    }
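
The original question asked for a method that returns a boolean rather than a String. A minimal sketch of that variant follows, assuming it is acceptable to use ParserDelegator from the javax.swing.text.html.parser subpackage instead of building a full HTMLDocument. The names PageSearch, extractText and containsWord are hypothetical and only there for illustration. The match is a plain case-sensitive substring test, the same as the indexOf check above, so it is not a whole-word search.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URI;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

class PageSearch {

    // Collects every text run the parser reports, one run per line.
    static String extractText(Reader reader) throws IOException {
        final StringBuilder buf = new StringBuilder();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                buf.append(data).append('\n');
            }
        };
        // The third argument tells the parser to ignore charset directives in the page.
        new ParserDelegator().parse(reader, callback, true);
        return buf.toString();
    }

    // Case-sensitive substring test against the extracted text (hypothetical helper).
    static boolean containsWord(Reader reader, String word) throws IOException {
        return extractText(reader).contains(word);
    }

    // Returns true if the word occurs in the page text, false if it does not
    // or if the page could not be read.
    public static boolean search(String word, String webAddress) {
        try (Reader rd = new InputStreamReader(
                URI.create(webAddress).toURL().openStream())) {
            return containsWord(rd, word);
        } catch (IOException | IllegalArgumentException e) {
            e.printStackTrace();
            return false;
        }
    }
}
```

Splitting the work so that containsWord takes a Reader means the matching logic can be tried against a StringReader holding some HTML, without any network access. A call such as PageSearch.search("java", "http://host.com/page.html") would then cover the URL case.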
 
ticoldam12 (Author) commented:
Great! I'll try the code as soon as I get home.
kiranhk, I found some very interesting things on that website, thanks for the info!
I will let you guys know if I still have problems with my method.
 
sudhakar_koundinya commented:
(:-)