Regex to find location of a recurring string
Posted on 2009-05-07
I have a long text file with some HTML formatting in it. Within this file I need to find the start and end points of a particular recurring string. These strings were endnotes that somehow got treated as regular text when the document was saved as PDF, and they now need to be removed.
The string I'm searching for starts with
<P>US-United States industry only.
and ends with
United States industries are comparable.</P>
The kicker is that the string may or may not have leading or trailing spaces inside the <P> tags. So what I want to do is find the starting and ending locations of each occurrence of this string and remove the text between them.
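
To make the goal concrete, here is a minimal sketch of one possible approach in Python. The post doesn't name a language, so that choice is an assumption. The function names and the sample text are hypothetical. It assumes the optional spaces appear only right after <P> and right before </P>, and that every start marker is followed by its own end marker; if an end marker is missing, the non-greedy match would run on to the next one.

```python
import re

# \s* tolerates optional whitespace just inside the <P> tags.
# .*? is non-greedy so each endnote block is matched separately,
# and re.DOTALL lets it span line breaks.
ENDNOTE = re.compile(
    r"<P>\s*US-United States industry only\."
    r".*?"
    r"United States industries are comparable\.\s*</P>",
    re.DOTALL,
)


def find_endnotes(text):
    """Return (start, end) offsets of every endnote block in text."""
    return [m.span() for m in ENDNOTE.finditer(text)]


def remove_endnotes(text):
    """Return text with every endnote block removed."""
    return ENDNOTE.sub("", text)


# Hypothetical sample input, not taken from the real file.
sample = (
    "<P>Table 1</P>\n"
    "<P>  US-United States industry only. Some note text.\n"
    "United States industries are comparable. </P>\n"
    "<P>Table 2</P>\n"
)

print(find_endnotes(sample))
print(remove_endnotes(sample))
```

find_endnotes gives the start and end locations, and remove_endnotes drops everything from the opening <P> through the closing </P>. If the tags might also appear in lowercase, adding re.IGNORECASE to the flags would be one way to handle that.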