RegEx Everything up to but not including (extracting hyperlink text)

Posted on 2009-04-02
Last Modified: 2012-05-06
I'm working on a program that gets a directory listing from a secure WebDAV server (using HttpWebResponse). The listing is returned as HTML. I'm trying to use a regular expression to extract the text that appears in each hyperlink (which is the name of a folder or file). I'm close, but not quite there...

Here's an example of what is returned on the request:

"<html><head><title>securetransfer.xxx.com - /localuser/</title></head><body><H1>securetransfer.xxx.com - /localuser/</H1><hr>  <pre><A HREF="/">[To Parent Directory]</A><br><br> 2/26/2009  2:38 PM        &lt;dir&gt; <A HREF="/localuser/Albe/">Albe</A><br>  3/5/2009  4:00 PM        &lt;dir&gt; <A HREF="/localuser/Art/">Art</A><br> 3/23/2009 12:31 PM        &lt;dir&gt; <A HREF="/localuser/Castle/">Castle</A><br> 2/19/2009  5:25 PM        &lt;dir&gt; <A HREF="/localuser/CF/">CF</A><br> 3/16/2009  8:43 PM        &lt;dir&gt; <A HREF="/localuser/CHI/">CHI</A><br> 2/19/2009  5:43 PM        &lt;dir&gt; <A HREF="/localuser/CSE/">CSE</A><br></pre><hr></body></html>"


I've come up with a regular expression that will parse out the complete hyperlink:
"<a.*?>.*</a>"  
     - returns <A HREF=""/localuser/Albe/"">Albe</A>

With a small adjustment to the expression, I can exclude the opening anchor tag and get just the text plus the closing tag:
"(?<=(<a.*?>)).*</a>"
     - returns Albe</A>


I can't figure out how to get rid of the closing tag </A>. I've tried the following, to no avail:

"(?<=(<a.*?>)).*[^</a>]"
     - returns Albe</A><br

"(?<=(<a.*?>)).*(?<=</a>)"
     - returns Albe</A>


I appreciate anything you can think of.
Question by:VBRocks

Accepted Solution

by:
Fernando Soto earned 2000 total points
ID: 24053710
Hi VBRocks;

Here is some sample code to do what you want.

Fernando
Imports System
Imports System.Text.RegularExpressions

Module Program
    Sub Main()
        ' Sample directory listing from the question (inner quotes doubled for VB string syntax).
        Dim htmlData As String = "<html><head><title>securetransfer.xxx.com - /localuser/</title></head><body><H1>securetransfer.xxx.com - /localuser/</H1><hr>  <pre><A HREF=""/"">[To Parent Directory]</A><br><br> 2/26/2009  2:38 PM        &lt;dir&gt; <A HREF=""/localuser/Albe/"">Albe</A><br>  3/5/2009  4:00 PM        &lt;dir&gt; <A HREF=""/localuser/Art/"">Art</A><br> 3/23/2009 12:31 PM        &lt;dir&gt; <A HREF=""/localuser/Castle/"">Castle</A><br> 2/19/2009  5:25 PM        &lt;dir&gt; <A HREF=""/localuser/CF/"">CF</A><br> 3/16/2009  8:43 PM        &lt;dir&gt; <A HREF=""/localuser/CHI/"">CHI</A><br> 2/19/2009  5:43 PM        &lt;dir&gt; <A HREF=""/localuser/CSE/"">CSE</A><br></pre><hr></body></html>"

        ' Group 1 captures only the text between the opening <a ...> tag and the closing </a>.
        Dim mc As MatchCollection = Regex.Matches(htmlData, "<[aA][^>]+>(.*?)</[aA]>")

        For Each m As Match In mc
            ' m.Value would be the whole anchor; m.Groups(1).Value is just the link text.
            Console.WriteLine("hyperlink text = " & m.Groups(1).Value)
        Next
    End Sub
End Module
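
For comparison, the lookbehind approach from the question can also be made to work. The two failed attempts break for different reasons: [^</a>] is a character class that matches one character other than <, /, a, or >, and (?<=</a>) is a lookbehind, so it only asserts that the match ends right after </A> and leaves the tag inside the match. What is needed is a lazy .*? followed by a lookahead. The following is a minimal sketch, not part of the accepted answer. It assumes a shortened, made-up sample of the listing and relies on the .NET regex engine allowing variable-length lookbehind.

```vb
Imports System
Imports System.Text.RegularExpressions

Module LookaroundDemo
    Sub Main()
        ' Shortened, hypothetical sample of the listing from the question.
        Dim html As String = "<pre><A HREF=""/"">[To Parent Directory]</A><br> &lt;dir&gt; <A HREF=""/localuser/Albe/"">Albe</A><br> &lt;dir&gt; <A HREF=""/localuser/Art/"">Art</A><br></pre>"

        ' (?<=<a[^>]*>)  lookbehind: an opening anchor tag must end right here
        ' .*?            lazy: take as little text as possible
        ' (?=</a>)       lookahead: stop just before the closing tag
        Dim pattern As String = "(?<=<a[^>]*>).*?(?=</a>)"

        ' IgnoreCase lets <a ...> also match the upper-case <A ...> in the listing.
        For Each m As Match In Regex.Matches(html, pattern, RegexOptions.IgnoreCase)
            Console.WriteLine(m.Value)
        Next
        ' Prints:
        ' [To Parent Directory]
        ' Albe
        ' Art
    End Sub
End Module
```

Here m.Value is already the bare link text, because neither lookaround consumes characters. The capture-group version is usually easier to read, and it carries over to regex engines that do not support variable-length lookbehind.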


Author Closing Comment

by:VBRocks
ID: 31565946
Sweet!  Thank you so much for the help!  I've been researching this most of the morning.

Expert Comment

by:Fernando Soto
ID: 24053970
Not a problem, glad to help.  ;=)

Author Comment

by:VBRocks
ID: 24054650
The big tip was "Groups" (m.Groups(1).Value). I was just using m.Value.
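
To make the difference concrete, here is a small sketch (an illustration only, using a single anchor taken from the listing in the question) that prints the whole match next to the first capture group:

```vb
Imports System
Imports System.Text.RegularExpressions

Module GroupsDemo
    Sub Main()
        ' One anchor from the listing in the question.
        Dim html As String = "<A HREF=""/localuser/Albe/"">Albe</A><br>"
        Dim m As Match = Regex.Match(html, "<[aA][^>]+>(.*?)</[aA]>")

        If m.Success Then
            Console.WriteLine(m.Value)            ' <A HREF="/localuser/Albe/">Albe</A>
            Console.WriteLine(m.Groups(0).Value)  ' same text as m.Value (group 0 is the whole match)
            Console.WriteLine(m.Groups(1).Value)  ' Albe
        End If
    End Sub
End Module
```

Group 0 always holds the entire match, so the parentheses in the pattern are what make the link text available on its own as group 1.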