Asked by stakor

reg ex search for companies

I have a large html file that I am trying to scrape the names of companies out of. The company names are always in the following format:

<a href="offsite_quotes.asp?content=http://www.someplace.com">Some Place, Inc.</a>

I would want "Some Place, Inc." as the result here. A company name can be one or more words and may contain special characters (@, -, etc.). The name always comes right after '<a href="offsite_quotes.asp?content=', a URL, and a closing '">'.

There might be more than one company name per line of the file. If so, each name should be printed on its own line. I don't know whether opening the file and reading it in a while loop is the right way to go.
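
One possible approach, offered only as a minimal sketch: the thread does not name a language, so Python is an assumption here, and this is not the accepted solution. The pattern, the function name extract_companies, and the second sample link (example.org) are all hypothetical. The idea is to match the fixed href prefix, skip the URL up to its closing quote, and capture the link text non-greedily up to the closing tag, which also handles several links on one line.

```python
import html
import re

# Assumed pattern: fixed href prefix, any URL up to the closing quote,
# then the link text captured non-greedily up to the closing </a>.
COMPANY_LINK = re.compile(
    r'<a\s+href="offsite_quotes\.asp\?content=[^"]*"\s*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)

def extract_companies(text):
    """Return every company name found in text, in document order."""
    # html.unescape turns entities such as &amp; back into plain characters.
    return [html.unescape(name.strip()) for name in COMPANY_LINK.findall(text)]

# Two links on a single line; the second one is made up for illustration.
sample = (
    '<td><a href="offsite_quotes.asp?content=http://www.someplace.com">Some Place, Inc.</a> '
    '<a href="offsite_quotes.asp?content=http://www.example.org">A-1 @ Home</a></td>'
)

for name in extract_companies(sample):
    print(name)  # one company name per output line
```

Passing the whole file contents to extract_companies at once, rather than looping line by line, would likely be simpler: it prints each match separately however many share a line, and it still matches a link whose text wraps across lines.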
ASKER CERTIFIED SOLUTION
ozo

This solution is only available to members.
stakor (ASKER)

Thank you very much.