Regex to find location of a recurring string

Greetings,
I have a long text file with some HTML formatting in it. Within this file I need to find the start and end points of a particular string. These strings were endnotes that were somehow treated as regular text when the document was saved as PDF, and they now need to be removed.

The string I'm searching for starts with
US-United States industry only.

and ends with
United States industries are comparable.

The kicker is that the string may or may not have leading or trailing spaces within the tags. So what I want to do is find the starting and ending locations of this string and remove the text between them.
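For readers following along, here is a minimal sketch of the task as described: locate each start marker, locate the next end marker, and drop everything between them. It is written in Python purely for illustration (the thread itself points at .NET), and it assumes the markers sit inside `<p>...</p>` tags, which the post does not show verbatim. The name `cut_endnotes` is hypothetical.

```python
import re

# Assumption: the markers are wrapped in <p>...</p>, with optional padding spaces.
START_RE = re.compile(r"<p>\s*US-United States industry only\.")
END_RE = re.compile(r"United States industries are comparable\.\s*</p>")


def cut_endnotes(text):
    """Remove every block running from a start marker through the next end marker."""
    pieces = []
    pos = 0
    while True:
        start = START_RE.search(text, pos)
        if start is None:
            break
        end = END_RE.search(text, start.end())
        if end is None:
            break  # unmatched start marker: leave the rest untouched
        pieces.append(text[pos:start.start()])  # keep the text before the block
        pos = end.end()                         # resume just after the block
    pieces.append(text[pos:])                   # keep whatever follows the last block
    return "".join(pieces)
```

`start.start()` and `end.end()` are the start and end locations the question asks for, so they can also be reported instead of used for slicing.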

David L. Hansen

Try this:
http://msdn.microsoft.com/en-us/library/system.string.indexof(VS.71).aspx

and this:
http://www.cs.cf.ac.uk/Dave/C/node19.html

andy_ee

ASKER

I am aware of the indexof function. My problem is the leading and trailing spaces *within* the paragraph tags.
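To illustrate why a plain index lookup falls short here, the sketch below compares a literal search with a regular expression that allows optional whitespace after the tag. It is in Python for illustration only, it is not the accepted solution from this thread, and the `<p>` tag and the names `locate_literal` and `locate_flexible` are assumptions.

```python
import re

# Assumption: the start marker follows an opening <p> tag.
LITERAL_START = "<p>US-United States industry only."
FLEXIBLE_START = re.compile(r"<p>\s*US-United States industry only\.")


def locate_literal(text):
    """Offset of the exact literal marker, or -1 if it is absent."""
    return text.find(LITERAL_START)


def locate_flexible(text):
    """Offset of the marker, tolerating whitespace after the tag; -1 if absent."""
    match = FLEXIBLE_START.search(text)
    return match.start() if match else -1
```

On input where a space follows the tag, the literal lookup reports -1 while the `\s*` version still finds the marker, because `\s*` matches zero or more whitespace characters.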

ASKER CERTIFIED SOLUTION

iHadi

This solution is only available to members.

andy_ee

ASKER

You are *SO* close!

I want to find and remove a string that starts with:
"US-United States industry only." or " US-United States industry only."

and ends with:
"United States industries are comparable." or "United States industries are comparable. "

Please note the space after the opening tag and the space before the closing tag.

iHadi

The previous code does exactly that.

andy_ee

ASKER

Excellent! Thanks!