andy_ee
asked on
Regex to find location of a recurring string
Greetings,
I have a long text file with some HTML formatting in it. Within this file I need to find the start and end points of a particular string. These strings were endnotes that somehow got seen as regular text when saved as PDF and now need to be removed.
The string I'm searching for starts with
<P>US-United States industry only.
and ends with
United States industries are comparable.</P>
The kicker is that the string may or may not have leading or trailing spaces within the <P> tags. So what I want to do is find the starting and ending locations of this string and remove the text between them.
ASKER
I am aware of the IndexOf function. My problem is the leading and trailing spaces *within* the paragraph tags.
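A plain substring search needs the exact text, so it misses the variant with a space after the opening tag. A regular expression can treat that whitespace as optional. The following is only a minimal sketch in Python, not the code from this thread; the names `ENDNOTE` and `find_endnotes` are hypothetical, and it assumes the closing sentence appears only at the end of an endnote.

```python
import re

# Hypothetical pattern: \s* allows optional whitespace after <P> and before </P>.
# The non-greedy .*? stops at the first closing sentence, and DOTALL lets the
# endnote span several lines.
ENDNOTE = re.compile(
    r"<P>\s*US-United States industry only\."
    r".*?"
    r"United States industries are comparable\.\s*</P>",
    re.DOTALL,
)

def find_endnotes(text):
    """Return a list of (start, end) offsets, one per endnote block found."""
    return [m.span() for m in ENDNOTE.finditer(text)]

sample = (
    "<P>Keep me.</P>"
    "<P> US-United States industry only. Notes. "
    "United States industries are comparable. </P>"
    "<P>Also keep.</P>"
)

for start, end in find_endnotes(sample):
    print(start, end, repr(sample[start:end]))
```

Each pair of offsets marks where one endnote starts and ends, so `text[:start] + text[end:]` would drop that block.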
ASKER CERTIFIED SOLUTION
ASKER
You are *SO* close!
I want to find and remove a string that starts with:
"<P>US-United States industry only." or "<P> US-United States industry only."
and ends with:
"United States industries are comparable.</P>" or "United States industries are comparable. </P>"
Please note the spaces after the <P> tag and before the </P> tag.
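For illustration, here is one way all four start/end combinations could be removed in a single pass. This is a hedged sketch in Python rather than the accepted solution, which is not shown here; `START`, `END`, `PATTERN` and `remove_endnotes` are hypothetical names, and the tags are assumed to be upper-case exactly as quoted.

```python
import re

# The fixed sentences that open and close each endnote, as quoted above.
START = "US-United States industry only."
END = "United States industries are comparable."

# re.escape makes the literal dots match only dots; \s* covers the optional
# space after <P> and before </P>; .*? keeps each match as short as possible.
PATTERN = re.compile(
    r"<P>\s*" + re.escape(START) + r".*?" + re.escape(END) + r"\s*</P>",
    re.DOTALL,
)

def remove_endnotes(text):
    """Return text with every endnote paragraph deleted."""
    return PATTERN.sub("", text)

if __name__ == "__main__":
    doc = "<P>before</P><P> " + START + " body " + END + " </P><P>after</P>"
    print(remove_endnotes(doc))
```

Because the match is non-greedy, two endnotes in the same file are removed separately and any paragraphs between them are kept.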
The previous code does exactly that.
ASKER
Excellent! Thanks!
http://msdn.microsoft.com/en-us/library/system.string.indexof(VS.71).aspx
and this:
http://www.cs.cf.ac.uk/Dave/C/node19.html