sharingsunshine

What regex will remove duplicate rel="nofollow" tags?

I had this question after viewing Python error - Need Help.

I created this regex to remove the duplicate rel="nofollow" attributes using grep in TextWrangler, but I am not clear on how to add it to the Python regex code.

rel="nofollow"(\s|\n|\n\r)rel="nofollow"


replace with
rel="nofollow"
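
For reference, one way the search-and-replace above might be carried over to Python's re module is sketched below. This is a minimal sketch, not the accepted solution: the function name dedupe_nofollow and the sample string are hypothetical, and the pattern is slightly adjusted from the one above. In Python, \s already matches \n and \r, so the alternation is replaced by \s+, which also covers \r\n line endings, and the repeated group collapses three or more copies in a row.

```python
import re

# One rel="nofollow" followed by one or more whitespace-separated repeats.
# \s+ covers spaces, tabs, \n and \r\n (an adjustment to the TextWrangler pattern).
DUPLICATE_NOFOLLOW = re.compile(r'rel="nofollow"(?:\s+rel="nofollow")+')


def dedupe_nofollow(html):
    """Collapse runs of repeated rel="nofollow" attributes into a single one."""
    return DUPLICATE_NOFOLLOW.sub('rel="nofollow"', html)


# Hypothetical sample with the duplicate split across a line break.
sample = '<a href="http://www.theherbsplace.com/" rel="nofollow"\nrel="nofollow">'
print(dedupe_nofollow(sample))
# prints: <a href="http://www.theherbsplace.com/" rel="nofollow">
```

Text that contains only one rel="nofollow" is returned unchanged, so the function should be safe to run over pages that have already been cleaned.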

ASKER CERTIFIED SOLUTION
pepr

pepr

I have noticed a bug in the original page:
<a 1="" href="http://www.theherbsplace.com/" imageanchor=" rel="nofollow" style="...


Notice the stray 1="" attribute and the imageanchor=" value, which is missing its closing double quote.
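
Malformed anchors like this one could be located before any replacement is attempted. The following is only a rough sketch under stated assumptions: the function name find_suspect_anchors is hypothetical, the two checks (an odd number of double quotes, an attribute name starting with a digit) are heuristics chosen to match the two problems noted above, and the tag pattern assumes no literal > appears inside an attribute value.

```python
import re

# Opening <a ...> tag; assumes attribute values contain no literal ">".
ANCHOR_TAG = re.compile(r'<a\b[^>]*>', re.IGNORECASE)
# Attribute names that begin with a digit, such as 1="".
NUMERIC_ATTR = re.compile(r'\s(\d\w*)\s*=')


def find_suspect_anchors(html):
    """Return (tag, reasons) pairs for anchor tags that look malformed."""
    suspects = []
    for match in ANCHOR_TAG.finditer(html):
        tag = match.group(0)
        reasons = []
        # An odd number of double quotes means some value is not closed.
        if tag.count('"') % 2 != 0:
            reasons.append('unbalanced double quotes')
        numeric = NUMERIC_ATTR.findall(tag)
        if numeric:
            reasons.append('numeric attribute name: ' + ', '.join(numeric))
        if reasons:
            suspects.append((tag, reasons))
    return suspects


# Hypothetical page fragment; the style value is made up for illustration.
page = ('<a 1="" href="http://www.theherbsplace.com/" imageanchor=" '
        'rel="nofollow" style="color: red;">home</a>')
for tag, reasons in find_suspect_anchors(page):
    print(reasons, tag)
```

The checks only report candidates for manual review; they do not attempt to repair the markup.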
sharingsunshine

ASKER

Thanks for the help. As for the other exceptions you pointed out, I will just have to fix them as I find them.