regular expression replacements
Posted on 2005-04-24
I have an HTML document in a string, and I want to remove some tags from it using regular expressions. There are two basic cases:
1. Remove a simple tag, leaving any content around it intact. Examples:
example 1.1 <sometag1 attribute="value"> needs to be removed.
example 1.2 </sometag1> needs to be removed (if it exists).
example 1.3 <sometag /> needs to be removed.
2. Remove a pair of tags and everything between them. Example:
example 2.1 <sometag2>blah don't worry, there are no sometag2s here blah</sometag2> needs to be removed entirely.
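For illustration, the two cases could be sketched as follows in Python (the language is an assumption, since no language is named here). The function name `strip_tags` and its parameters are hypothetical. The sketch assumes that a tag is never nested inside another tag of the same name and that attribute values never contain a `>` character.

```python
import re

def strip_tags(html, simple_tags=("sometag1", "sometag"), paired_tags=("sometag2",)):
    """Remove the listed tags from an HTML string (hypothetical helper)."""
    # Case 2 first: drop the opening tag, the closing tag, and everything
    # in between. DOTALL lets '.' cross line breaks; '.*?' is non-greedy so
    # the match stops at the first closing tag.
    for name in paired_tags:
        pattern = r"<{0}\b[^>]*>.*?</{0}\s*>".format(re.escape(name))
        html = re.sub(pattern, "", html, flags=re.IGNORECASE | re.DOTALL)
    # Case 1: drop opening, closing, and self-closing forms of the tag,
    # but keep whatever text sits between them. The \b stops 'sometag'
    # from also matching 'sometag1'.
    for name in simple_tags:
        pattern = r"</?{0}\b[^>]*>".format(re.escape(name))
        html = re.sub(pattern, "", html, flags=re.IGNORECASE)
    return html

sample = ('<head><sometag1 attribute="value">keep</sometag1>'
          '<sometag /><sometag2>blah\nblah</sometag2></head>')
print(strip_tags(sample))
```

Regular expressions only approximate HTML parsing, so this is likely to misbehave on comments, scripts, or unusually quoted attributes.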
In this case all instances of sometag1 and sometag2 can be removed, although it would be better to have a solution that removes only those that appear between the HEAD tags.
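One possible way to limit the removal to the HEAD section is to match that section first and clean only its contents. This is a sketch in Python (an assumed language), with hypothetical names `HEAD_RE`, `PAIRED_RE`, `SIMPLE_RE`, `clean_head`, and `strip_tags_in_head`. It assumes the document has a single, well-formed `<head>...</head>` pair.

```python
import re

# Capture the opening HEAD tag, its contents, and the closing tag separately.
HEAD_RE = re.compile(r"(<head\b[^>]*>)(.*?)(</head\s*>)", re.IGNORECASE | re.DOTALL)
# Case 2: the sometag2 pair and everything between the two tags.
PAIRED_RE = re.compile(r"<sometag2\b[^>]*>.*?</sometag2\s*>", re.IGNORECASE | re.DOTALL)
# Case 1: opening, closing, or self-closing sometag1 / sometag.
SIMPLE_RE = re.compile(r"</?(?:sometag1|sometag)\b[^>]*>", re.IGNORECASE)

def clean_head(match):
    """Return the matched HEAD section with the unwanted tags removed."""
    body = PAIRED_RE.sub("", match.group(2))
    body = SIMPLE_RE.sub("", body)
    return match.group(1) + body + match.group(3)

def strip_tags_in_head(html):
    # Only the first HEAD section is rewritten; the rest is left untouched.
    return HEAD_RE.sub(clean_head, html, count=1)

doc = ('<html><head><sometag1 a="b"><title>t</title>'
       '<sometag2>x</sometag2></head>'
       '<body><sometag1>stay</sometag1></body></html>')
print(strip_tags_in_head(doc))
```

Passing a function as the replacement keeps the HEAD tags themselves in place and avoids having to escape backslashes in the cleaned text.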