Solved

Regular expression to exclude HTML tags from text replacement

Posted on 2004-09-28
537 Views
Last Modified: 2011-09-20
I need to create a regular expression (in Java) to replace sections of text that are NOT inside <span> HTML tags. Basically, I am replacing a set of terms with a block of text wrapped in span tags, but I don't want the terms inside those blocks of text to be matched as well. So I only want to replace terms that aren't inside <span> tags.

My regular expression so far is:

((<span.*/span>)*^(<span)) ($term)

where $term is the term that I want to replace. Basically, the regular expression needs to say "search for all instances of $term that are not between an opening and a closing <span> tag (and replace them with the defined text)".

please ask if i haven't explained this adequately and i can explain the situation more clearly.

any help would be appreciated,

Stafford

p.s. i'm not all that good at Regular Expressions, so please excuse the one above ;)
Question by:staffordvaughan
7 Comments
 
LVL 86

Accepted Solution

by:
CEHJ earned 250 total points
ID: 12177540
Regular expressions are not particularly good at *not* matching things. What i'd suggest is something like the following:

final String RE_ANY_TAG = "(<[A-Za-z\\-]+>)([^<]+)(<[/A-Za-z\\-]+>)";

then check the first group does *not* contain "<span>"
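
A minimal sketch of how that suggestion might look in practice, assuming simple tags without attributes (which is all RE_ANY_TAG matches). The class name TagGroupCheck and the helper replaceInNonSpanElements are hypothetical, introduced only for illustration. Note that text sitting outside any element is left untouched by this approach.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagGroupCheck {
    // Pattern from the answer above: opening tag, text content, closing tag.
    static final String RE_ANY_TAG = "(<[A-Za-z\\-]+>)([^<]+)(<[/A-Za-z\\-]+>)";

    // Hypothetical helper: replaces `term` in the text of every matched
    // element whose opening tag (group 1) is not "<span>".
    static String replaceInNonSpanElements(String html, String term, String replacement) {
        Matcher m = Pattern.compile(RE_ANY_TAG).matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String rebuilt;
            if (m.group(1).equalsIgnoreCase("<span>")) {
                // Inside a span: copy the whole element through unchanged.
                rebuilt = m.group();
            } else {
                // Any other element: replace the term in its text (group 2).
                rebuilt = m.group(1) + m.group(2).replace(term, replacement) + m.group(3);
            }
            // quoteReplacement keeps '$' and '\' in the text from being
            // interpreted as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(rebuilt));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Called as TagGroupCheck.replaceInNonSpanElements(html, term, replacement), this leaves every <span> element as it was and only rewrites the text of other elements.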


Author Comment

by:staffordvaughan
ID: 12177576
thanks for your response, but can you be more specific please?
 
LVL 86

Expert Comment

by:CEHJ
ID: 12177623

Author Comment

by:staffordvaughan
ID: 12187223
the question is still open, i would like for someone who is good at regular expressions to actually post the regular expression solution if possible. i'm familiar with regular expressions in general, but the actual solution to this question is eluding me.

or perhaps suggest a more appropriate category to post this question on? i could not find a better one than this.

thanks
 
LVL 86

Expert Comment

by:CEHJ
ID: 12188488
>>to actually post the regular expression solution if possible

That's quite a bit of work! What are you talking about specifically here:

>>
basically i am replacing a set of terms with a block of text inside span tags, but i don't want the terms inside those blocks of text to be matched as well. so i only want to replace terms that aren't inside <span> tags.
>>
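
For the requirement quoted above, one commonly used regex technique (not the approach from the accepted answer) is to match either a whole <span> element or the term, and to copy the span through unchanged while replacing only bare terms. The following is a rough sketch under stated assumptions: spans are not nested, tags are lower-case, and the class name SpanAwareReplace and the helper replaceOutsideSpans are hypothetical names used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SpanAwareReplace {
    // Hypothetical helper: replaces `term` everywhere except inside
    // <span>...</span> elements (assumes spans are not nested).
    static String replaceOutsideSpans(String html, String term, String replacement) {
        // Alternative 1 (group 1): a complete span element, matched lazily.
        // Alternative 2: the literal term, quoted so regex metacharacters are inert.
        Pattern p = Pattern.compile(
                "(<span\\b[^>]*>.*?</span>)|" + Pattern.quote(term),
                Pattern.DOTALL);
        Matcher m = p.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // If group 1 matched, we are looking at a span block: keep it as is.
            // Otherwise the match is a bare term: substitute the replacement.
            String piece = (m.group(1) != null) ? m.group(1) : replacement;
            m.appendReplacement(out, Matcher.quoteReplacement(piece));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Because the span alternative consumes the whole element in one match, any term inside it is never seen on its own. This also means the replacement text can itself be a <span> block containing the term without being matched again, since the input is scanned only once.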
 

Author Comment

by:staffordvaughan
ID: 12400018
i accepted the first response as the answer even though it wasn't really what i was after. as you say though, the specific regular expression is probably a lot of work, and my explanation might not have been sufficient. thanks for your help anyway. as it turned out, i used a different (non regular expression) method for the task because i just couldn't work out the regular expression.

thanks,
Stafford