Solved

java regular expressions - stripping html tags

Posted on 2007-11-18
Last Modified: 2012-08-13
OK, I am trying to strip all HTML tags, but this doesn't work. Why not?
lines[i].replaceAll("\\<.*\\>", "");

Assuming I have a string called htmlPage, how do I convert the <p> and <br> tags to new lines? htmlPage is a string containing the whole HTML page and is multiline.

Question by:rukiman
6 Comments
 
LVL 23

Expert Comment

by:cmalakar
htmlString = htmlString.replaceAll("<p>", "\n");

will replace all <p> tags with new lines.

Similarly, you can do the same for the <BR> tag.
 
LVL 23

Assisted Solution

by:cmalakar
cmalakar earned 60 total points
You can also replace all tags by using:

htmlString = htmlString.replaceAll("<.*>", "");

Don't forget that replaceAll returns the resulting string; it does not modify the original.
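To illustrate the point about the return value, here is a minimal sketch. The class and method names are made up for this example; only String.replaceAll comes from the JDK.

```java
final class ReplaceAllResult {
    // Hypothetical helper: returns a copy of html with each <p> turned into a newline.
    static String paragraphsToNewlines(String html) {
        // Strings are immutable, so replaceAll cannot change 'html' in place;
        // it builds and returns a new String.
        return html.replaceAll("<p>", "\n");
    }

    public static void main(String[] args) {
        String html = "<p>one<p>two";

        html.replaceAll("<p>", "\n");        // result is thrown away
        System.out.println(html.equals("<p>one<p>two"));   // true, html is unchanged

        html = html.replaceAll("<p>", "\n"); // result is kept
        System.out.println(html.equals("\none\ntwo"));     // true
    }
}
```

This is why the statement in the question appears to do nothing: the stripped string is computed and then discarded.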
 
LVL 92

Accepted Solution

by:objects
objects earned 65 total points
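You're using a greedy quantifier, so .* runs from the first < to the last > on the line. Try the reluctant form, and keep the returned string:

lines[i] = lines[i].replaceAll("\\<.*?\\>", "");

> how do I convert the <p> and <br> to new lines?

line = line.replaceAll("\\<p\\>", "\n").replaceAll("\\<br\\>", "\n");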
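The difference between the greedy and the reluctant quantifier can be seen on a single line with two tag pairs. This is only a sketch; the class and method names are invented for the example.

```java
final class GreedyVsReluctant {
    // Greedy: .* grabs as much as possible, so the match spans
    // from the first '<' to the last '>' in the line.
    static String stripGreedy(String line) {
        return line.replaceAll("<.*>", "");
    }

    // Reluctant: .*? grabs as little as possible, so each match
    // stops at the nearest '>'.
    static String stripReluctant(String line) {
        return line.replaceAll("<.*?>", "");
    }

    public static void main(String[] args) {
        String line = "<b>bold</b> and <i>italic</i>";
        System.out.println("[" + stripGreedy(line) + "]");    // prints []
        System.out.println("[" + stripReluctant(line) + "]"); // prints [bold and italic]
    }
}
```

The greedy version deletes the text between the tags as well, because the whole line starts with < and ends with >.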
 
LVL 23

Expert Comment

by:cmalakar
Sorry, that was a typo.

htmlString = htmlString.replaceAll("<.*>", "");

should be htmlString = htmlString.replaceAll("<[a-z]*>", "");
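One caveat worth checking before relying on <[a-z]*>: it only matches tags made purely of lowercase letters, so closing tags, uppercase tags and tags with attributes are left behind. The sketch below compares it with a negated character class, <[^>]*>, which is a common alternative and not something proposed in this thread. Because [^>] also matches line breaks, it copes with a tag split across lines in a multiline htmlPage, whereas . does not match a line terminator by default. The class and method names are made up for the example.

```java
final class TagPatterns {
    // Matches only tags such as <p> or <br>: lowercase letters, nothing else.
    static String stripLowercaseOnly(String html) {
        return html.replaceAll("<[a-z]*>", "");
    }

    // Matches '<', then anything that is not '>', then '>'.
    // Covers closing tags, uppercase tags, attributes and embedded newlines.
    static String stripAnyTag(String html) {
        return html.replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) {
        String html = "<p>Hi</p><BR><a href=\"x\">link</a>";
        System.out.println(stripLowercaseOnly(html)); // prints Hi</p><BR><a href="x">link</a>
        System.out.println(stripAnyTag(html));        // prints Hilink

        // A tag broken over two lines: '.' stops at the newline, [^>] does not.
        String split = "<a\nhref=\"x\">t</a>";
        System.out.println(split.replaceAll("<.*?>", "").equals("<a\nhref=\"x\">t")); // true
        System.out.println(stripAnyTag(split));       // prints t
    }
}
```

Neither pattern understands HTML comments, scripts or a literal > inside an attribute value, so for arbitrary pages a real HTML parser is the safer route.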
 
LVL 9

Expert Comment

by:ysnky
What you are looking for is:
lines[i] = lines[i].replaceAll("</.*?>", "").replaceAll("<.*?>", "\n");
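Note that this two-step replacement turns every opening tag into a new line, not just <p> and <br>. The sketch below shows that behaviour next to a narrower variant that converts only <p> and <br> (any case, with or without attributes) and deletes everything else. The narrower pattern is my own assumption about what the question is after, and the class and method names are hypothetical.

```java
final class TagsToNewlines {
    // The approach from the comment above: drop closing tags,
    // then replace every remaining tag with a newline.
    static String everyOpeningTag(String html) {
        return html.replaceAll("</.*?>", "").replaceAll("<.*?>", "\n");
    }

    // Narrower variant (an assumption, not from the thread):
    // (?i) makes the match case-insensitive, \b stops <p from matching <pre>,
    // and [^>]* allows attributes or a trailing slash as in <br/>.
    static String onlyParagraphsAndBreaks(String html) {
        return html.replaceAll("(?i)<(?:p|br)\\b[^>]*>", "\n")
                   .replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) {
        String html = "<p>one</p><b>two</b>";
        // <b> also becomes a newline here.
        System.out.println(everyOpeningTag(html).equals("\none\ntwo"));        // true
        // Only <p> becomes a newline; <b> and the closing tags are removed.
        System.out.println(onlyParagraphsAndBreaks(html).equals("\nonetwo"));  // true

        String mixed = "<P class=\"x\">Hello<br/>world</p><pre>x</pre>";
        System.out.println(onlyParagraphsAndBreaks(mixed).equals("\nHello\nworldx")); // true
    }
}
```

Which of the two is right depends on whether inline tags such as <b> should break the line in the output.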
 

Author Comment

by:rukiman
I accepted cmalakar as a solution as I was completely unaware that replaceAll returned the resultant string.