VBRocks
asked on
RegEx Everything up to but not including (extracting hyperlink text)
I'm working on a program that gets a directory listing from a secure WebDAV server (using HttpWebResponse). The listing is returned as HTML. I'm trying to use RegEx to get the text that appears in each hyperlink (which is the name of a folder or file). I'm close, but not quite there...
Here's an example of what is returned on the request:
"<html><head><title>securetransfer.xxx.com - /localuser/</title></head><body><H1>securetransfer.xxx.com - /localuser/</H1><hr> <pre><A HREF="/">[To Parent Directory]</A><br><br> 2/26/2009 2:38 PM <dir> <A HREF="/localuser/Albe/">Albe</A><br> 3/5/2009 4:00 PM <dir> <A HREF="/localuser/Art/">Art</A><br> 3/23/2009 12:31 PM <dir> <A HREF="/localuser/Castle/">Castle</A><br> 2/19/2009 5:25 PM <dir> <A HREF="/localuser/CF/">CF</A><br> 3/16/2009 8:43 PM <dir> <A HREF="/localuser/CHI/">CHI</A><br> 2/19/2009 5:43 PM <dir> <A HREF="/localuser/CSE/">CSE</A><br></pre><hr></body></html>"
I've come up with a RegEx expression that will parse out the complete hyperlink:
"<a.*?>.*</a>"
- returns <A HREF=""/localuser/Albe/""> Albe</A>
With a small adjustment to the expression, I can exclude the opening anchor tag and get just the text plus the closing tag:
"(?<=(<a.*?>)).*</a>"
- returns Albe</A>
I can't figure out how to get rid of the closing tag </A>. I've tried the following, to no avail:
"(?<=(<a.*?>)).*[^</a>]"
- returns Albe</A><br
"(?<=(<a.*?>)).*(?<=</a>)"
- returns Albe</A>
I appreciate anything you can think of.
ASKER CERTIFIED SOLUTION
Not a problem, glad to help. ;=)
ASKER
The big tip was "Groups" (m.Groups(1).Value); I was just using m.Value.
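For anyone reading without access to the accepted solution, here is a minimal sketch of the capture-group idea that tip refers to. It is not the accepted answer's actual code: the pattern, the module name LinkTextDemo, and the shortened sample HTML are assumptions made for illustration. The parentheses in the pattern form group 1, so m.Groups(1).Value holds only the link text, while m.Value is the whole anchor. The second loop shows a lookaround-only variant, which should work here because the .NET regex engine allows variable-length lookbehind.

```vbnet
Imports System
Imports System.Text.RegularExpressions

' Hypothetical demo module; the names and sample data are illustrative only.
Module LinkTextDemo
    Sub Main()
        ' Shortened version of the directory listing from the question.
        Dim html As String = "<pre><A HREF=""/"">[To Parent Directory]</A><br><br> " &
            "2/26/2009 2:38 PM <dir> <A HREF=""/localuser/Albe/"">Albe</A><br> " &
            "3/5/2009 4:00 PM <dir> <A HREF=""/localuser/Art/"">Art</A><br></pre>"

        ' Approach 1: match the whole anchor, capture only the text in group 1.
        ' [^>]* keeps the opening tag from running past its own ">",
        ' and the lazy .*? stops at the first closing </a>.
        Dim groupPattern As String = "<a\b[^>]*>(.*?)</a>"
        For Each m As Match In Regex.Matches(html, groupPattern, RegexOptions.IgnoreCase)
            Console.WriteLine(m.Groups(1).Value)
        Next

        ' Approach 2: lookbehind for the opening tag, lookahead for the closing tag.
        ' Neither lookaround consumes text, so m.Value is the link text itself.
        Dim lookPattern As String = "(?<=<a\b[^>]*>).*?(?=</a>)"
        For Each m As Match In Regex.Matches(html, lookPattern, RegexOptions.IgnoreCase)
            Console.WriteLine(m.Value)
        Next
    End Sub
End Module
```

On the attempts in the question: [^</a>] is a character class, so it matches one character that is not <, /, a, or >, rather than meaning "not followed by </a>". The trailing (?<=</a>) is a lookbehind, so it only checks that the text already matched ends with </a>, which leaves the tag inside the match. A lookahead, (?=</a>), or a capture group is what keeps the closing tag out of the result.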