  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 3384

java regular expressions - stripping html tags

OK I am trying to strip off all html tags but this doesn't work...why not?
lines[i].replaceAll("\\<.*\\>", "");

Assuming I have a string called htmlPage, how do I convert the <p> and <br> tags to new lines? htmlPage is a multiline string containing the whole HTML page.

Asked by: rukiman
 
cmalakarCommented:
htmlString.replaceAll("<p>", "\n");

This replaces every <p> tag with a new line.

You can do the same for the <BR> tag.
 
cmalakarCommented:
You can also replace all tags by using:

htmlString = htmlString.replaceAll("<.*>", "");

Don't forget that replaceAll returns the resulting string; it does not change the string it is called on.
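To make the point about the return value concrete, here is a minimal sketch. The class name ReturnValueDemo and the helper toNewlines are made up for illustration; only String.replaceAll comes from the thread.

```java
class ReturnValueDemo {
    // Hypothetical helper: returns a copy of the input with <p> turned into newlines.
    static String toNewlines(String html) {
        // Java strings are immutable, so replaceAll builds and returns a new String.
        return html.replaceAll("<p>", "\n");
    }

    public static void main(String[] args) {
        String line = "one<p>two";

        line.replaceAll("<p>", "\n");   // result is thrown away, 'line' is unchanged
        System.out.println(line);       // prints "one<p>two"

        line = toNewlines(line);        // keep the returned value
        System.out.println(line);       // prints "one" and "two" on separate lines
    }
}
```

This is most likely why the original call on lines[i] appeared to do nothing: its result was never assigned back.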
 
objectsCommented:
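You're using a greedy quantifier, so .* matches everything up to the last > on the line. Try the reluctant form, and assign the result:

lines[i] = lines[i].replaceAll("\\<.*?\\>", "");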

> how do I convert the <p> and <br> to new lines?

line = line.replaceAll("\\<p\\>", "\n").replaceAll("\\<br\\>", "\n");
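The difference between the greedy and the reluctant quantifier can be seen on a single line with more than one tag. This is only a sketch; the class and method names are invented for the example.

```java
class GreedyVsReluctantDemo {
    // Greedy: .* grabs as much as possible, so the match runs from the
    // first '<' to the last '>' on the line.
    static String stripGreedy(String s) {
        return s.replaceAll("<.*>", "");
    }

    // Reluctant: .*? grabs as little as possible, so each tag is matched
    // on its own.
    static String stripReluctant(String s) {
        return s.replaceAll("<.*?>", "");
    }

    public static void main(String[] args) {
        String line = "<b>bold</b> and <i>italic</i>";
        System.out.println("[" + stripGreedy(line) + "]");     // prints "[]"
        System.out.println("[" + stripReluctant(line) + "]");  // prints "[bold and italic]"
    }
}
```

With the greedy pattern the text between the tags is deleted along with them, which matches the "doesn't work" symptom in the question.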
 
cmalakarCommented:
Sorry, that was a typo.

htmlString = htmlString.replaceAll("<.*>", "");

should be

htmlString = htmlString.replaceAll("<[a-z]*>", "");
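One thing to watch with the [a-z]* version: it only matches lowercase tags with no attributes and no slash, so closing tags, <BR>, and tags like <a href="..."> are left in place. A commonly used alternative is <[^>]*>, which is not from this thread and is shown here only as a sketch. The class and method names are invented for the comparison.

```java
class TagPatternComparison {
    // Pattern from the comment above: lowercase letters only between < and >.
    static String stripLowercaseOnly(String s) {
        return s.replaceAll("<[a-z]*>", "");
    }

    // Alternative (an assumption, not from the thread): anything except '>'
    // between < and >.
    static String stripAnyTag(String s) {
        return s.replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) {
        String html = "<p>Hi</p><BR><a href=\"x\">link</a>";
        // Only <p> is removed; </p>, <BR>, <a href="x"> and </a> survive.
        System.out.println(stripLowercaseOnly(html));
        // All five tags are removed, leaving "Hilink".
        System.out.println(stripAnyTag(html));
    }
}
```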
 
ysnkyCommented:
What you are looking for is:
lines[i].replaceAll("</.*?>", "").replaceAll("<.*?>", "\n");
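For the whole-page case from the question (one multiline htmlPage string), a possible variation is sketched below. It is not what any poster wrote: the class name, the method name and both patterns are assumptions. It turns <p> and <br> in any letter case, with or without attributes, into newlines and then strips every remaining tag. It uses [^>] instead of . because . does not match line terminators by default, so <.*?> would miss a tag that is split across two lines.

```java
class HtmlToTextSketch {
    // Hypothetical helper: crude tag stripping, not a real HTML parser.
    static String toPlainText(String htmlPage) {
        return htmlPage
                // (?i) = case-insensitive; \b stops <p from matching <pre>.
                .replaceAll("(?i)<(?:p|br)\\b[^>]*>", "\n")
                // Remove whatever tags are left, including closing tags.
                .replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) {
        String htmlPage = "<P>First</p><br/>Second <a\n href=\"x\">link</a>";
        System.out.println(toPlainText(htmlPage));
    }
}
```

Regular expressions like these will still trip over comments, scripts, or a > inside an attribute value, so for arbitrary pages an HTML parser is the safer choice.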
 
rukimanAuthor Commented:
I accepted cmalakar's answer as the solution because I was completely unaware that replaceAll returns the resulting string.