Hello all, I hope things are going OK for you and your families at this time. I am working on a project, and I need some help with regular expressions. I am trying to create one regular expression that can remove certain types of HTML tags, another that will ignore them, and lastly one that selects both types. I have made some attempts, but something is always missing. I am learning more about regular expressions in the process.
From here on out, I will refer to regular expressions as regex. I want one regex that catches all words surrounded by three types of tags. I only need these three. The tags I need to make part of my pattern are <em>, <span>, and <strong>. The <span> is the only one with attributes, e.g. <span style="color: #3a9ee3;">. I also need to select the closing tag for each. In the linked example, I have highlighted what I want to match. However, I don’t think I am doing this in the best way.
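To make the first requirement concrete, here is a minimal Python sketch of one pattern that could do it. It is only an illustration under stated assumptions: the name TAGGED and the sample string are made up for this example, the tags are assumed never to be nested inside a tag of the same name, and Python's re module is assumed even though the linked tester may be set to a different flavor.

```python
import re

# Hypothetical pattern: an opening <em>, <span ...> or <strong> tag,
# the wrapped text, and the closing tag with the same name (\1).
# \b keeps "em" from matching longer names such as <embed>.
TAGGED = re.compile(r'<(em|span|strong)\b[^>]*>(.*?)</\1>', re.DOTALL)

# Made-up sample text, not taken from the linked test page.
sample = ('plain <em>alpha</em> and '
          '<span style="color: #3a9ee3;">beta</span> (gamma) '
          '<strong>delta</strong>')

# Whole matches, tags included.
full = [m.group(0) for m in TAGGED.finditer(sample)]

# Only the words between the tags.
words = [m.group(2) for m in TAGGED.finditer(sample)]

print(full)
print(words)  # ['alpha', 'beta', 'delta']
```

The backreference \1 is what ties each closing tag to its opening tag, so <em>...</strong> would not be matched.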
At times I will need an inverted version of the request above: I need to select the words that are not wrapped in any tag. Coming up with the correct regex for this has been more difficult. I always seem to select something I do not want, such as an unexpected tag, a semicolon, or some other character next to the word. I sometimes need to select opening and closing parentheses as well. I have not been able to make the closing one work even once.
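One common way to approach the inverted case is to let the regex consume the things that should be skipped first and capture only what is left. The Python sketch below illustrates that idea; the names SKIP_OR_WORD and untagged_words are hypothetical, the sample text is made up, and HTML entities such as &nbsp; are not handled.

```python
import re

# Hypothetical "skip or capture" pattern. The first two branches swallow
# what should be ignored; only the last branch fills group 2.
SKIP_OR_WORD = re.compile(
    r'<(em|span|strong)\b[^>]*>.*?</\1>'  # a whole tagged run: skipped
    r'|<[^>]*>'                           # any other single tag: skipped
    r'|([\w()]+)',                        # a bare word, parentheses included
    re.DOTALL,
)

def untagged_words(html):
    """Return words that are not wrapped in <em>, <span> or <strong>."""
    return [m.group(2) for m in SKIP_OR_WORD.finditer(html) if m.group(2)]

# Made-up sample text, not taken from the linked test page.
sample = ('plain <em>alpha</em> and '
          '<span style="color: #3a9ee3;">beta</span> (gamma) '
          '<strong>delta</strong>')

print(untagged_words(sample))  # ['plain', 'and', '(gamma)']
```

Because the character class [\w()] lists both parentheses, the closing one is kept together with the word instead of being dropped, and tag attributes never leak through since the whole tagged run is consumed before the word branch is tried.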
I need the last regex to select both the words between the tags and the words with no tag wrapped around them. For the words that are wrapped, the tags themselves should be selected as well.
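A combined version could be as simple as an alternation of the two cases. This is a rough Python sketch, assuming the text contains no tags other than <em>, <span> and <strong>; the name COMBINED and the sample string are made up for illustration.

```python
import re

# Hypothetical combined pattern: either a whole tagged run (tags included)
# or a bare word. The tagged branch comes first so that it wins whenever
# the match starts at "<".
COMBINED = re.compile(
    r'<(em|span|strong)\b[^>]*>.*?</\1>'  # tagged word with its tags
    r'|[\w()]+',                          # untagged word
    re.DOTALL,
)

# Made-up sample text, not taken from the linked test page.
sample = ('plain <em>alpha</em> and '
          '<span style="color: #3a9ee3;">beta</span> (gamma) '
          '<strong>delta</strong>')

pieces = [m.group(0) for m in COMBINED.finditer(sample)]
for piece in pieces:
    print(piece)
```

With the sample above, this prints each untagged word and each tagged run on its own line, in document order.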
Could somebody help construct this type of regex?
Here is a link for testing: https://regex101.com/r/5VUKsi/1