RegEx replacement of nested HTML Tag

MattMarx asked
Last Modified: 2012-08-13
I am currently using some out-of-the-box forum software. A user has the option to quote an existing message and add their own message to it.

The quoted message is wrapped in a tag as follows:

<DIV class=msgQuoteWrap>
  here is the quoted message ...
</DIV>

Here is the text that the user can enter in....

Here is the issue. I would like to write a regular expression that STRIPS the entire <DIV class=msgQuoteWrap> tag, including all of its contents. The problem is that the quoted message can include its own DIV tags, which makes the matching difficult.

For example, this is a typical message:

<DIV class=msgQuoteWrap>
  This is a message
<DIV class=msgInner>
  <DIV class=msgQuoteWrap>  this is a nested quote from a previous message </DIV>
<DIV class=class2>this is another style for this text</DIV>

You get my point.  There could be ten or twenty nested DIV tags within the outer tag.  I would like to take the outer tag, and strip all inner contents.

I'm using VB.Net. Can this be done with a regular expression?

I'm not sure I understand. What do you want the regular expression to match?


I'd like to match the outermost tag, including everything inside it, so that I can replace it with an empty string (""). Is there a better way to do this?

OK, tell me if I'm wrong.

What you want to do is have text like:

First Message blablalblablalblabala

                  The answer to this message blablalballbal

Like here?: http://www.dotnet247.com/247reference/msgs/32/163740.aspx

Higor:  Did you post the wrong link?  That link's about viewstate.

I believe MattMarx's problem is that a simple pattern with "<div>" at the beginning and "</div>" at the end will match only up to the first closing tag and will omit the last ....</div>
In other words it will match the first <div> with the first </div> that follows it instead of matching with the properly nested </div>.
The problem is the nesting.
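
To make the nesting problem concrete, here is a minimal sketch in Python. The sample markup and the names `html` and `naive` are made up for illustration and are not from the forum software. A lazy pattern of the same shape in VB.Net (with RegexOptions.IgnoreCase and RegexOptions.Singleline) should stop at the same place.

```python
import re

# Hypothetical sample: an outer quote wrapper that contains one nested DIV,
# followed by the text the user typed after the quote.
html = (
    "<DIV class=msgQuoteWrap>outer "
    "<DIV class=msgInner>inner</DIV>"
    " tail</DIV>reply text"
)

# Naive pattern: the wrapper's opening tag, lazily anything, then the
# first closing </div> that follows.
naive = re.compile(
    r"<div\s+class=msgQuoteWrap>.*?</div>",
    re.IGNORECASE | re.DOTALL,
)

# The match ends at the INNER closing tag, not the wrapper's own.
naive_match = naive.search(html).group(0)

# Removing the match leaves part of the quote and a stray </DIV> behind.
leftover = naive.sub("", html)

print(naive_match)
print(leftover)
```

Switching to a greedy `.*` does not help either: it would run to the last closing tag in the whole message, which can swallow DIVs that belong to the reply.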

This is not too difficult in regex with only a few nesting levels (I think), but handling an arbitrary number of levels, 10 to 20 or more, is difficult.

MattMarx:  Is that about right?


That is correct!  I have no way to know how many nesting levels there will be.  It doesn't have to use RegEx.  Is there some way this can be done in VB.Net even without RegEx?
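
One way to do this without relying on a single regular expression is to scan the DIV tags and keep a depth counter: add one for every opening DIV, subtract one for every closing DIV, and cut everything from the wrapper's opening tag to the point where the counter returns to zero. The sketch below is in Python only to show the logic; the function name `strip_quote_wrappers` and both patterns are assumptions made for illustration, and the same loop should port to VB.Net with Regex.Match and a While loop. It assumes the DIV tags inside a wrapper are balanced and does not handle "<div" appearing inside comments or attribute values, where a real HTML parser would be more robust. (.NET regular expressions also offer balancing groups, which are another option.)

```python
import re

# Any opening or closing DIV tag; group 1 is "/" for a closing tag.
DIV_TAG = re.compile(r"<(/?)div\b[^>]*>", re.IGNORECASE)

# Opening tag of the quote wrapper; quotes around the class are optional.
WRAP_OPEN = re.compile(
    r"<div\b[^>]*\bclass=[\"']?msgQuoteWrap\b[\"']?[^>]*>",
    re.IGNORECASE,
)


def strip_quote_wrappers(html: str) -> str:
    """Remove every outermost msgQuoteWrap DIV together with its contents."""
    kept = []
    pos = 0
    while True:
        start = WRAP_OPEN.search(html, pos)
        if start is None:
            kept.append(html[pos:])  # no more wrappers: keep the rest
            break
        kept.append(html[pos:start.start()])  # keep text before the wrapper

        depth = 1          # we are inside the wrapper's opening tag
        scan = start.end()
        while depth > 0:
            tag = DIV_TAG.search(html, scan)
            if tag is None:
                break      # ran out of tags before the wrapper closed
            depth += -1 if tag.group(1) else 1
            scan = tag.end()

        if depth != 0:
            # Unbalanced markup: leave the remainder untouched rather than guess.
            kept.append(html[start.start():])
            break
        pos = scan         # resume after the wrapper's matching </div>
    return "".join(kept)


sample = (
    "<DIV class=msgQuoteWrap>This is a message"
    "<DIV class=msgInner><DIV class=msgQuoteWrap>nested</DIV></DIV>"
    "<DIV class=class2>styled</DIV></DIV>Here is the reply"
)
print(strip_quote_wrappers(sample))
```

Because the counter only tracks depth, the number of nesting levels does not matter; ten or twenty nested DIVs are handled the same way as one.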
Are you still having a problem with this?

Bob "Cleanup Volunteer"
