Solved

Regular Expression in Java (html tags)

Posted on 2007-08-05
Last Modified: 2012-05-05
I need a regular expression that works in Java that will take every <img src="whatever" .... > and insert this code inside the tag:

onmousedown="cancelBubbles();__cloneCallback(this.tagName, getElement()); return false;"

So the new string will be like:

OLD:  <img src="whatever" .... > 

NEW: <img onmousedown="cancelBubbles();__cloneCallback(this.tagName, getElement()); return false;" src="whatever" .... > 
Question by:dignified
13 Comments
 
LVL 2

Expert Comment

by:freeexpert
ID: 19636757
String.replaceAll(Pattern.quote("<img src="), "<img onmousedown=\"cancelBubbles();__cloneCallback(this.tagName, getElement()); return false;\" src="");

But you do understand all the cases where this will not quite do what you want, right?
 

Author Comment

by:dignified
ID: 19636793
Maybe, can you explain further?
 
LVL 16

Expert Comment

by:ellandrd
ID: 19636807
Is your page coded in JSP? If so, this will help you:

String oldstring = "<img src=\"whatever\" .... >";
String newstring = "<img onmousedown=\"cancelBubbles();__cloneCallback(this.tagName, getElement()); return false;\" src=\"whatever\" .... >";

oldstring = oldstring.replaceAll(oldstring, newstring);

Other info:

http://java.sun.com/developer/technicalArticles/releases/1.4regex/
http://www.sitepoint.com/article/java-regex-api-explained

ellandrd
 
LVL 86

Expert Comment

by:CEHJ
ID: 19637061
tag = tag.replaceAll("(<img [^>]*?)>", "$1 onmousedown=\"cancelBubbles();__cloneCallback(this.tagName, getElement()); return false;\"");
 
LVL 2

Expert Comment

by:freeexpert
ID: 19637080
> Maybe, can you explain further?

Are you going to be able to ensure that the replacement is only applied to real tags? There are some cases where you might end up applying it unintentionally, e.g. when the pattern appears:

- in comments
- in CDATA
- in any string which is going to be XML-encoded before it appears in the HTML.
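
To make the first case concrete, here is a minimal sketch built on the literal `<img src=` replacement from earlier in the thread. The class name, method name and sample markup are made up for illustration. Running it should show that the copy of the tag inside the comment is rewritten as well as the real one.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of any library.
final class CommentCaveatDemo {
    // Handler text taken from the question.
    static final String HANDLER =
        "onmousedown=\"cancelBubbles();__cloneCallback(this.tagName, getElement()); return false;\"";

    // Replaces every literal "<img src=" prefix, wherever it occurs in the text.
    static String addHandler(String html) {
        return html.replaceAll(
            Pattern.quote("<img src="),
            Matcher.quoteReplacement("<img " + HANDLER + " src="));
    }

    public static void main(String[] args) {
        // Sample input: one commented-out tag and one real tag.
        String html = "<!-- <img src=\"old.png\"> -->\n<img src=\"new.png\">";
        // Both occurrences are rewritten, because the replacement has no notion of comments.
        System.out.println(addHandler(html));
    }
}
```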
 
LVL 86

Expert Comment

by:CEHJ
ID: 19637087
Sorry - typo on my last. Should be

tag = tag.replaceAll("(<img [^>]*?)>", "$1 onmousedown=\"cancelBubbles();__cloneCallback(this.tagName, getElement()); return false;\">");
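
For anyone who wants to try this, the sketch below wraps the same idea in a small class. It is only a sketch: the class and method names are hypothetical, and the pattern is a slightly extended variant of the one above (case-insensitive, and it keeps a trailing `/` of a self-closing tag after the new attribute). Like the original, it will break on attribute values that contain `>`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; not part of any library.
final class ImgTagAppend {
    // Handler text taken from the question.
    static final String HANDLER =
        "onmousedown=\"cancelBubbles();__cloneCallback(this.tagName, getElement()); return false;\"";

    // Group 1: everything from "<img" up to the end of the attributes.
    // Group 2: an optional "/" of a self-closing tag.
    private static final Pattern IMG =
        Pattern.compile("(<img\\b[^>]*?)\\s*(/?)>", Pattern.CASE_INSENSITIVE);

    // Appends the handler as the last attribute of every <img ...> tag.
    static String addHandler(String html) {
        // quoteReplacement guards against '$' or '\' ever appearing in the handler text.
        String replacement = "$1 " + Matcher.quoteReplacement(HANDLER) + "$2>";
        return IMG.matcher(html).replaceAll(replacement);
    }

    public static void main(String[] args) {
        System.out.println(addHandler("<img alt=\"x\" src=\"whatever\">"));
        System.out.println(addHandler("<img src=\"whatever\" />"));
    }
}
```

Because the attribute is appended at the end of the tag, the position of src no longer matters.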
 
LVL 2

Expert Comment

by:freeexpert
ID: 19637559
> CEHJ:
>      Sorry - typo on my last. Should be

Is there something wrong with the solution I gave, except that there is probably an extra double quote?
 
LVL 86

Expert Comment

by:CEHJ
ID: 19637836
>>Is there something wrong with the solution I gave, except that there is probably an extra double quote?

I mustn't be quite awake this morning - what I gave doesn't address the problem. There's no obligation for the src attribute to appear first.

A better way would be to use an HTML parsing library. Try the Neko parser.

 
LVL 16

Expert Comment

by:ellandrd
ID: 19637856
What is wrong with my suggestion of inserting the new attributes before the src attribute?
 
LVL 2

Expert Comment

by:freeexpert
ID: 19637894
> I mustn't be quite awake this morning - what I gave doesn't address the problem. There's no obligation for the src attribute to appear first.


Oh, of course. I was only providing a very limited solution for replacing <img src= with the OP's replacement.
 

Author Comment

by:dignified
ID: 19643451
The onmousedown can appear anywhere. I guess I could just use a simple string replace and replace <img with all that I posted.

The only other trick I can think of is if there is already an onmousedown="" inside the tag. Then I'd have to append my code to the end of the original, separated by semicolons, but it must be inserted before any return value.

I won't get that much into it for now.
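
For later reference, the merge step on its own could be sketched roughly like this. It works only on the JavaScript text of the attribute value, not on the HTML around it. The class and method names are hypothetical, and the search for `return` is deliberately naive: it assumes the word does not occur inside a string literal or a nested function in the existing handler.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; not part of any library.
final class HandlerMerge {
    // Naive search for the first "return" keyword in the existing handler.
    private static final Pattern RETURN = Pattern.compile("\\breturn\\b");

    // Inserts `added` after the existing statements but before the first "return".
    static String merge(String existing, String added) {
        String addedStmt = terminate(added.trim());
        Matcher m = RETURN.matcher(existing);
        // Without a "return", the new code simply goes at the end.
        int split = m.find() ? m.start() : existing.length();
        String head = terminate(existing.substring(0, split).trim());
        String tail = existing.substring(split).trim();

        StringBuilder sb = new StringBuilder();
        if (!head.isEmpty()) sb.append(head).append(' ');
        sb.append(addedStmt);
        if (!tail.isEmpty()) sb.append(' ').append(tail);
        return sb.toString();
    }

    // Makes sure a non-empty statement list ends with a semicolon.
    private static String terminate(String js) {
        return (js.isEmpty() || js.endsWith(";")) ? js : js + ";";
    }

    public static void main(String[] args) {
        System.out.println(merge("doX(); return true;", "cancelBubbles()"));
    }
}
```

Finding the existing onmousedown attribute in the tag and writing the merged value back is a separate step that this sketch leaves out.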
 
LVL 2

Accepted Solution

by:
freeexpert earned 800 total points
ID: 19643498
> The onmousedown can appear anywhere.

For a robust solution you will have to use an HTML parser. This is not easy, because HTML is often non-standard. CEHJ's suggestion to use the Neko parser should be useful.

If the output is meant for human consumption and a small error here or there is acceptable, then replacing "<img " seems like a good compromise.
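
A minimal sketch of that compromise, assuming the markup is already in a String (the class and method names are hypothetical). Note that `String.replace` is literal and case-sensitive, so `<IMG ` or `<img` followed by a newline would be missed, and the caveats about comments and CDATA still apply.

```java
// Hypothetical helper class; not part of any library.
final class ImgPrefixReplace {
    // Handler text taken from the question.
    static final String HANDLER =
        "onmousedown=\"cancelBubbles();__cloneCallback(this.tagName, getElement()); return false;\"";

    // Inserts the handler as the first attribute of every "<img " occurrence.
    static String addHandler(String html) {
        // String.replace does a literal (non-regex) replacement of every occurrence.
        return html.replace("<img ", "<img " + HANDLER + " ");
    }

    public static void main(String[] args) {
        System.out.println(addHandler("<img alt=\"x\" src=\"whatever\">"));
    }
}
```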
 
LVL 86

Assisted Solution

by:CEHJ
CEHJ earned 200 total points
ID: 19644098
>>This is not easy, because HTML is often non-standard.

The Neko parser uses JTidy to clean up the HTML first, which normally does a pretty good job.
Question has a verified solution.