I would suggest using an HTML parser instead of regex!
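For what it's worth, here is a rough sketch of what the parser route could look like in Python with the standard-library `html.parser` module. The thread never says which language is in use, so Python is an assumption, and `NameCellParser`, `extract_names` and `sample` are names made up for this illustration.

```python
from html.parser import HTMLParser


class NameCellParser(HTMLParser):
    """Collects the text found inside every <td class="lName"> cell."""

    def __init__(self):
        super().__init__()
        self.in_name_cell = False
        self.names = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "td" and dict(attrs).get("class") == "lName":
            self.in_name_cell = True
            self.names.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_name_cell = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it
        if self.in_name_cell:
            self.names[-1] += data


def extract_names(html):
    parser = NameCellParser()
    parser.feed(html)
    parser.close()
    # strip() removes the tabs and newlines around the name
    return [name.strip() for name in parser.names]


# Sample markup modelled on the snippet in the question
sample = '''<td class="lName" colspan="2">
\t\tCHROME SALON
</td>
<td class="lPhone" align="right">'''

print(extract_names(sample))
```

The parser does not care how the attributes are spaced or ordered, which is the main reason to prefer it over a hand-written pattern.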
I'm trying to strip down this code so the result is CHROME SALON... I need to get rid of everything else, including tabs, whitespace, etc. I've been fooling around with some regex, but I'm really new to this and it's driving me crazy. I figured out how to do everything separately, but I'm not sure how to group those pieces together so that one pass strips everything out. Can you help me out, and please explain how you constructed the regex? Thanks a lot:
<td class="lName" colspan="2">
CHROME SALON
</td>
<td class="lPhone" align="right">
my reg ex: (<td\sclass="lName"\scolsp
This question has been solved and verified by the asker.
by: dbkruger, posted on 2005-09-20 at 21:27:03 (ID: 14925705)
You want to find the tag <td class=....>, skip the whitespace, and remember the text.
n="2">
pan="2">
pan="2">\s*([^<]*)
pan="2">\s*([^<]*)</td>\s*<td class="lPhone" align="right">
Your first part is right, but ditch the parens. They are only necessary if you want to remember that part of the pattern.
<td\sclass="lName"\scolspa
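To illustrate the point about parentheses, here is a small Python sketch (Python is my choice, the thread does not name a language). The pattern above is cut off in the post, so the full pattern below is my own assumed completion based on the sample HTML, not necessarily what the asker wrote.

```python
import re

html = '<td class="lName" colspan="2">\n\tCHROME SALON\n</td>'

# Assumed completion of the truncated pattern, with and without parentheses
with_parens = re.compile(r'(<td\sclass="lName"\scolspan="2">)')
without_parens = re.compile(r'<td\sclass="lName"\scolspan="2">')

m1 = with_parens.search(html)
m2 = without_parens.search(html)

# Both versions match exactly the same text...
same_match = m1.group(0) == m2.group(0)
print(same_match)

# ...the only difference is that the first one also remembers it as group 1
print(with_parens.groups, without_parens.groups)
```

Since nobody needs the opening tag afterwards, the group only adds noise, so save the parentheses for the part you actually want back.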
I would generalize this slightly in case there is more than one space (perfectly legal).
<td\s+class="lName"\s+cols
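A quick way to see why `\s+` is the safer choice, sketched in Python. The three input strings are hypothetical variations I made up, and only the start of the tag is matched to keep it short.

```python
import re

single = re.compile(r'<td\sclass="lName"')     # exactly one whitespace character
flexible = re.compile(r'<td\s+class="lName"')  # one or more whitespace characters

one_space = '<td class="lName" colspan="2">'
two_spaces = '<td  class="lName" colspan="2">'
line_break = '<td\n    class="lName" colspan="2">'

# For each input: does the \s version match, does the \s+ version match?
results = [
    (bool(single.search(text)), bool(flexible.search(text)))
    for text in (one_space, two_spaces, line_break)
]
print(results)
```

The `\s` version only survives the first input, while `\s+` handles all three.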
Then you skip optional spaces. That's where things get nutty. You then want to remember the word, but instead you put in a tab \t, which has already been matched by the \s, guaranteeing failure. I'm going to keep it simple by going until the first angle bracket -- i.e. <
<td\s+class="lName"\s+cols
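Here is roughly how the "skip the whitespace, then grab everything up to the first <" idea plays out in Python. The full pattern is my assumed completion of the truncated line above, pieced together from the fragments earlier in this answer, and the sample string mirrors the asker's HTML.

```python
import re

html = (
    '<td class="lName" colspan="2">\n'
    '\t\tCHROME SALON\n'
    '</td>\n'
    '<td class="lPhone" align="right">'
)

# \s* skips the leading newline and tabs; ([^<]*) remembers everything up to the next <
pattern = re.compile(r'<td\s+class="lName"\s+colspan="2">\s*([^<]*)')

match = pattern.search(html)
raw = match.group(1)   # still carries the newline that sits before </td>
name = raw.strip()     # trim it afterwards

print(repr(raw))
print(name)
```

Note that `[^<]*` also picks up the whitespace between the name and `</td>`, so a final `strip()` (or the equivalent in your language) is still needed.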
This would, frankly, do it, but you can throw on the stuff afterward if you want. Don't put in more than you have to, or you will end up eliminating things you didn't think about. Complicated patterns are confusing, as you've learned!
<td\s+class="lName"\s+cols
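To show what "don't put in more than you have to" means in practice, here is a Python sketch comparing the short pattern with one that also spells out the following cell. Both patterns are my assumed completions of the truncated lines in this answer, and the `variant` input (with `align="left"`) is a hypothetical page I invented for the comparison.

```python
import re

short_pattern = re.compile(r'<td\s+class="lName"\s+colspan="2">\s*([^<]*)')
long_pattern = re.compile(
    r'<td\s+class="lName"\s+colspan="2">\s*([^<]*)'
    r'</td>\s*<td class="lPhone" align="right">'
)

original = (
    '<td class="lName" colspan="2">\nCHROME SALON\n</td>\n'
    '<td class="lPhone" align="right">'
)
# Hypothetical variation: the next cell is aligned differently
variant = original.replace('align="right"', 'align="left"')


def names(pattern, text):
    """Return every captured name with surrounding whitespace trimmed."""
    return [n.strip() for n in pattern.findall(text)]


print(names(short_pattern, original), names(long_pattern, original))
print(names(short_pattern, variant), names(long_pattern, variant))
```

Both patterns work on the original markup, but only the short one still finds the name once the neighbouring cell changes.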