Solved

Regular Expression

Posted on 2008-06-26
Last Modified: 2010-04-15
Hi

Does anybody know a good regular expression that matches and extracts <a> tags?

That is, given a document, I would like to extract all the <a> elements, assuming each <a> has a closing tag (</a>).

Thanks

Allan
Question by:acadenilla
LVL 6

Expert Comment

by:Bruce_1975
ID: 21876377
<(?<a>\w*)>(?<text>.*)</\k<a>>

Regards,
Bruce
 

Author Comment

by:acadenilla
ID: 21876622
Bruce,

It failed when I tried a simple link:

<a href='asdfasdf.com'>first tag text</a>

Could you explain the expression to me?

I might need to handle some crazy links, e.g.

<a id='asdfas' onmouseclick='asdfasdf' href='asdfasdf'><font><b>asdfasdfas</b><font></a>

or

<a href='aasdfasdf'><img></img></a>

thanks
 
LVL 63

Accepted Solution

by:
Fernando Soto earned 250 total points
ID: 21876721
Hi acadenilla;

This pattern will give you what you want.

' Requires at the top of the file:
'   Imports System.IO
'   Imports System.Text.RegularExpressions

' Test data in a file
Dim input As String
Using sr As New StreamReader("HtmlData.htm")
    ' Read the data into a string
    input = sr.ReadToEnd()
End Using
' Find all the matches for the pattern "<a.*?/a>"
Dim mc As MatchCollection = Regex.Matches(input, "<a.*?/a>")
For Each m As Match In mc
    ' Display the result in the output window of the IDE
    Console.WriteLine(m.Value)
Next


Fernando
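
A note on the pattern above: "<a.*?/a>" would also start a match at tags such as <abbr> or <area>, and "." does not cross line breaks by default. The following is a minimal sketch of a tighter variant, not part of the accepted answer. The module name and the sample input are made up for illustration, and it assumes attribute values never contain a ">" character.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module AnchorDemo
    Sub Main()
        ' Hypothetical sample input built from the links posted in this thread.
        Dim input As String = "<a href='asdfasdf.com'>first tag text</a>" & Environment.NewLine & _
                              "<abbr>not a link</abbr>" & Environment.NewLine & _
                              "<a href='aasdfasdf'><img></img></a>"

        ' \b     keeps <a from matching <abbr>, <area>, and so on
        ' [^>]*  skips over any attributes up to the closing >
        ' .*?    is lazy, so each match ends at the nearest </a>
        Dim pattern As String = "<a\b[^>]*>.*?</a>"

        ' Singleline lets "." match line breaks; IgnoreCase also accepts <A ...>.
        Dim opts As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline

        For Each m As Match In Regex.Matches(input, pattern, opts)
            Console.WriteLine(m.Value)
        Next
    End Sub
End Module
```

With this input the loop prints the two <a> elements and skips the <abbr> element.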
 
LVL 6

Expert Comment

by:Bruce_1975
ID: 21876781
Just leave out the ?<text> and use

<(?<a>\w*)>(.*)</\k<a>>

<(?<a>\w*)>    matches < followed by any run of word characters (captured in the group named a), and has to close with >
(.*)           any character is allowed, any number of repetitions
</\k<a>>       has to end with the closing tag for whatever group a captured, e.g. </a>
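
The breakdown above also suggests why the simple link failed: there is nothing between the tag name and the > that could match attributes such as href. Here is a rough sketch comparing that pattern with a revised one. The revised pattern, the group names tag and text, and the module name are assumptions for illustration, not something posted in this thread.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module NamedGroupDemo
    Sub Main()
        ' The simple link from the author's comment.
        Dim input As String = "<a href='asdfasdf.com'>first tag text</a>"

        ' Pattern from the comment above: the tag name must be followed directly by >.
        Dim original As String = "<(?<a>\w*)>(.*)</\k<a>>"

        ' Hypothetical revision: only <a> tags, [^>]* allows attributes,
        ' and the lazy (?<text>.*?) stops at the first matching closing tag.
        Dim revised As String = "<(?<tag>a)\b[^>]*>(?<text>.*?)</\k<tag>>"

        Console.WriteLine("original matches: " & Regex.IsMatch(input, original))

        Dim opts As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline
        For Each m As Match In Regex.Matches(input, revised, opts)
            ' Named groups are read back by name from the Groups collection.
            Console.WriteLine("tag:  " & m.Groups("tag").Value)
            Console.WriteLine("text: " & m.Groups("text").Value)
        Next
    End Sub
End Module
```

The \k<tag> backreference repeats whatever the tag group captured, so the closing tag has to carry the same name as the opening one.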
