Solved

How to remove Script tags from HTML

Posted on 2004-10-12
Last Modified: 2008-01-09
I want to remove all <script> tags, together with their contents, from an HTML document.
I tried to replace each one with a blank space, but the following exception occurs:
java.util.regex.PatternSyntaxException: Illegal repetition near index 86

Can anyone help?
Here is the code.


  private String removeTagsWithContents(String tagName, String data)
  {
    String cleanData = "";
    boolean hasMoreTags;
   
    if(data.indexOf(tagName) > 0)
      hasMoreTags = true;
    else
      hasMoreTags = false;
     
    while(hasMoreTags)
    {
      System.out.println(data);
      String strFound = data.substring(data.indexOf("<" + tagName ), data.indexOf("</" + tagName + ">") + (3 + tagName.length()));
      System.out.println(strFound);
      strFound = "";
      System.out.println(data.indexOf(strFound));
      data = data.replaceAll(strFound, " ");
      System.out.println(data);

      if(data.indexOf(tagName) > 0)
        hasMoreTags = true;
      else
        hasMoreTags = false;    
    }

    return cleanData;
  }
Question by:Naeemg
4 Comments
 
Accepted Solution

by:
zzynx earned 200 total points
ID: 12295360
So you want every occurrence of "<script>blah blah blah </script>" to be removed. Right?

System.out.println("abc <script>sf fgk,#@qsdf qdfg</script> def".replaceAll( "<script>([\\W\\w\\s])*</script>", "") );

So:

 private String removeTagsWithContents(String tagName, String data) {
     String regExp = "<" + tagName + ">([\\W\\w\\s])*</" + tagName + ">";
     return data.replaceAll(regExp, "");
 }
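
One caveat on the pattern above: the `*` is greedy, so with two script blocks in the document everything between the first <script> and the last </script> is removed, and a tag with attributes such as <script type="text/javascript"> is not matched at all. A minimal sketch of a variant that handles both, assuming the tag name is plain text and the HTML has no nested or malformed script tags (the class name ScriptStripper is only for illustration):

```java
import java.util.regex.Pattern;

class ScriptStripper {
    // Illustrative helper, not part of the original answer.
    static String removeTagsWithContents(String tagName, String data) {
        // Quote the tag name so it is always treated as literal text.
        String tag = Pattern.quote(tagName);
        // (?is): case-insensitive, and '.' also matches line breaks.
        // [^>]* allows attributes on the opening tag.
        // .*? is reluctant, so each block is matched on its own.
        String regExp = "(?is)<" + tag + "\\b[^>]*>.*?</" + tag + "\\s*>";
        return data.replaceAll(regExp, "");
    }

    public static void main(String[] args) {
        String html = "a<script type=\"text/javascript\">if (x) { y(); }</script>b"
                    + "<SCRIPT>z()</SCRIPT>c";
        System.out.println(removeTagsWithContents("script", html)); // prints "abc"
    }
}
```

For anything beyond simple, well-formed input, a real HTML parser is likely a safer choice than a regular expression.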

Assisted Solution

by:tomboshell
tomboshell earned 200 total points
ID: 12295377
Here you assign the discovered string to strFound >>    String strFound = data.substring(data.indexOf("<" + tagName ), data.indexOf("</" + tagName + ">") + (3 + tagName.length()));
   
Here the discovered string is reset to an empty string. This is the beginning of the problem >>      strFound = "";
   
Here is the problem itself. You are telling it to replace every empty string with a blank space >>      data = data.replaceAll(strFound, " ");

You should either not set the discovered string to an empty string, or do the removal before that. Best is not to reassign it at all.
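
As for the exception itself: replaceAll treats its first argument as a regular expression, so once strFound holds real script text, any `{` in that text can trigger "Illegal repetition". Below is a rough sketch of the original loop with the reassignment removed and no regex involved. It also checks indexOf against -1 (a match at position 0 is valid) and returns the cleaned text instead of the empty cleanData. The class name TagRemover and the handling of an unclosed tag are my own assumptions, and this version drops the blocks rather than substituting a space.

```java
class TagRemover {
    // Illustrative rewrite of the question's method using plain index searches.
    static String removeTagsWithContents(String tagName, String data) {
        String open = "<" + tagName;
        String close = "</" + tagName + ">";
        StringBuilder clean = new StringBuilder();
        int pos = 0;
        while (true) {
            int start = data.indexOf(open, pos);
            // indexOf returns -1 when nothing is found.
            if (start < 0) break;
            int end = data.indexOf(close, start);
            // Unclosed tag: keep the remainder untouched (an arbitrary choice).
            if (end < 0) break;
            clean.append(data, pos, start); // text before the tag
            pos = end + close.length();     // skip past the closing tag
        }
        clean.append(data.substring(pos));  // whatever follows the last block
        return clean.toString();
    }

    public static void main(String[] args) {
        String html = "<script>a{1}</script>x<script src=\"s.js\">b</script>y";
        System.out.println(removeTagsWithContents("script", html)); // prints "xy"
    }
}
```

Note that the search is case-sensitive and that "<" + tagName would also match a longer tag name starting with the same letters, just as in the original code.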