CAS-IT

RegEx to match style="something" ??

Hi guys,

I'm trying to make a regex that removes in-line styles from HTML. I.e. <div style="color: red"> becomes <div>.

I also want to make sure that this matches any kind of messed-up HTML it receives, like whitespace in the middle of the thing, single quotes or double quotes, uppercase STYLE or lowercase, or something like sTyLe or whatnot. It needs to be fool-proof.

Here's what I've got:

/style\s*=\s*('|").*('|")/gi

The i takes care of the case, so we can start with /style/i

Then, I need to make sure we match whitespace between style and the = sign. So I add \s*.

Then I match the = sign.

Then I match more whitespace with \s*

Then I need to match the first quote, so I've used ('|") to match single or double quotes.

Then, we match anything, for which I use .*

Then we match the closing quote with another ('|").

Ok. So the problem here is the "anything" part. When this runs, it not only matches

style="color: red"

But it also matches

style="color: red">Hey guys! what'

So, it can grab HTML that it's not supposed to.
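For anyone wanting to reproduce the over-match, here is a minimal sketch. It assumes JavaScript (going by the /.../gi literal syntax), and the sample HTML string is made up for illustration:

```javascript
// Hypothetical sample input; the pattern is the one from the question.
const html = `<div style="color: red">Hey guys! what's up</div>`;
const greedy = /style\s*=\s*('|").*('|")/gi;

// .* is greedy: it runs to the end of the line and backtracks only as far
// as the LAST quote character, here the apostrophe in "what's".
const greedyMatch = html.match(greedy)[0];
console.log(greedyMatch);
```

The closing ('|") accepts either kind of quote, so the match ends at whichever quote comes last on the line rather than at the one that closes the attribute.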

But I can't just say "stop at the first quote", because you can have valid in-line styles that contain quotes. Like:

style="background: url('mygraphic.jpg') top left no-repeat;"

This one has a ' inside the value, so a pattern that stops at the first quote of either kind would end the match there, too early.
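One common way around this (not necessarily what the hidden answers in this thread propose) is to treat the two quote styles as separate alternatives, so each branch stops only at its own kind of quote. A rough sketch in JavaScript; the name stripInlineStyles is just a placeholder, and unquoted attribute values are assumed not to occur:

```javascript
// Match style = "..." or style = '...'. Each branch stops only at its own
// kind of quote, so the other kind may appear inside the value.
// The leading \s+ removes the whitespace before the attribute and keeps
// the pattern from matching names like data-style.
const inlineStyle = /\s+style\s*=\s*(?:"[^"]*"|'[^']*')/gi;

// Hypothetical helper: returns the HTML with quoted style attributes removed.
function stripInlineStyles(html) {
  return html.replace(inlineStyle, "");
}

console.log(stripInlineStyles(`<div style="background: url('mygraphic.jpg') top left no-repeat;">Hi</div>`));
console.log(stripInlineStyles(`<DIV sTyLe = 'color: red' id="x">Hi</DIV>`));
```

Note that this still works on raw text, so it would also strip something that merely looks like a style attribute in the page content.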

I'm stuck!!

ASKER CERTIFIED SOLUTION
Marco Gasi
This solution is only available to members.
SOLUTION
This solution is only available to members.

This appears to be working fine


str.replace(/<(div)\s[^>]*>/gi, "<$1>")

SOLUTION
This solution is only available to members.
CAS-IT

ASKER

Accepted my own post because I ended up using a combination of the two RegExs presented.
The problem with using dot-star (or even dot-star-question) is that if you don't have a closing quotation mark, then your pattern is going to consume up to the next available quotation mark. If your HTML is properly structured, then this should be inconsequential; if it's not, then you are going to have problems in your replacement.
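To make that failure mode concrete, here is a small JavaScript sketch. The broken markup is invented for illustration, and the "guarded" pattern is only one possible mitigation: it assumes style values never contain < or >, which is not guaranteed.

```javascript
// Hypothetical malformed input: the style value has no closing quote.
const broken = `<div style="color: red>One</div><p class="note">Two</p>`;

// Lazy "dot-star-question" version: consumes up to the NEXT quote of the
// same kind, which here is the opening quote of the class attribute.
const lazy = /\s+style\s*=\s*(["'])[\s\S]*?\1/gi;
const lazyResult = broken.replace(lazy, "");
console.log(lazyResult); // <divnote">Two</p>

// Guarded version: the value may not contain < or >, so an unclosed quote
// makes the match fail instead of running into the following markup.
const guarded = /\s+style\s*=\s*(?:"[^"<>]*"|'[^'<>]*')/gi;
const guardedResult = broken.replace(guarded, "");
console.log(guardedResult === broken); // true: malformed attribute left alone
```

The trade-off is that the guarded pattern leaves malformed attributes in place rather than damaging the surrounding HTML.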
CAS-IT

ASKER

You're right.

Hm...

I dunno. I just need to trust it. If I run into that problem I'll open another question!