Solved

Regular Expression

Posted on 2008-06-26
Last Modified: 2010-04-15
Hi

Does anybody know a good regular expression that matches and extracts <a> tags?

I.e., if I have a document, I would like to extract all the <a> elements, assuming that each <a> has an ending tag (</a>).

Thanks

Allan
Question by:acadenilla
4 Comments
 

Expert Comment

by:Bruce_1975
ID: 21876377
<(?<a>\w*)>(?<text>.*)</\k<a>>

Regards,
Bruce
 

Author Comment

by:acadenilla
ID: 21876622
Bruce,

It fails when I try a simple link:

<a href='asdfasdf.com'>first tag text</a>

Could you explain the expression to me?

I might need to handle some more complicated links, e.g.

<a id='asdfas' onmouseclick='asdfasdf' href='asdfasdf'><font><b>asdfasdfas</b><font></a>

or

<a href='aasdfasdf'><img></img></a>

Thanks
 

Accepted Solution

by:
Fernando Soto earned 250 total points
ID: 21876721
Hi acadenilla;

This pattern will give you what you want.

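Imports System.IO
Imports System.Text.RegularExpressions

' Read the test data file into a string
Dim input As String = File.ReadAllText("HtmlData.htm")
' Find every <a ...>...</a> element.
' \b stops tags such as <abbr> from matching, [^>]* allows attributes,
' Singleline lets "." span line breaks, and IgnoreCase also accepts <A>.
Dim mc As MatchCollection = Regex.Matches(input, "<a\b[^>]*>.*?</a>", RegexOptions.IgnoreCase Or RegexOptions.Singleline)
For Each m As Match In mc
    ' Display the result in the output window of the IDE
    Console.WriteLine(m.Value)
Next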
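
If you also need the pieces of each link rather than the whole match, named groups can help. The following is only a minimal sketch, assuming the anchors are not nested and the markup is reasonably regular; the group names "attrs" and "inner" and the module name AnchorDemo are made up for illustration. For arbitrary real-world HTML a proper HTML parser is usually the safer choice.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module AnchorDemo
    Sub Main()
        ' Sample input built from the links posted earlier in this thread
        Dim input As String = "<a href='asdfasdf.com'>first tag text</a>" & Environment.NewLine &
                              "<a href='aasdfasdf'><img></img></a>"

        ' <a\b             the tag name "a" followed by a word boundary (skips <abbr> etc.)
        ' (?<attrs>[^>]*)  everything up to the end of the opening tag
        ' (?<inner>.*?)    lazily captures the content up to the first </a>
        Dim pattern As String = "<a\b(?<attrs>[^>]*)>(?<inner>.*?)</a>"
        Dim options As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline

        For Each m As Match In Regex.Matches(input, pattern, options)
            Console.WriteLine("attributes: " & m.Groups("attrs").Value.Trim())
            Console.WriteLine("inner:      " & m.Groups("inner").Value)
        Next
    End Sub
End Module
```

Because the quantifier is lazy, each match stops at the first </a> it finds, so child tags such as <img> or <b> end up inside the "inner" group unchanged.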


Fernando
 

Expert Comment

by:Bruce_1975
ID: 21876781
Just leave out the ?<text> and use

<(?<a>\w*)>(.*)</\k<a>>

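<(?<a>\w*)>    matches "<", then any run of word characters (captured in the group named "a"), then ">"; attributes inside the tag are not allowed
(.*)           matches any character except a line break, any number of times (greedy)
</\k<a>>       matches the closing tag whose name equals the text captured in group "a", e.g. </a> for <a>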
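
To see why this pattern fails on the simple link from the question, it can be compared with a variation that allows attributes. This is only a sketch: the "relaxed" pattern and the module name BackreferenceDemo are assumptions added for illustration, not something posted in this thread. Note that both patterns match any element whose opening and closing names agree, not only <a>.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module BackreferenceDemo
    Sub Main()
        ' The simple link from the question
        Dim simpleLink As String = "<a href='asdfasdf.com'>first tag text</a>"

        ' Pattern from this comment: nothing but word characters may appear between "<" and ">"
        Dim original As String = "<(?<a>\w*)>(.*)</\k<a>>"
        ' Assumed variation: [^>]* lets the opening tag carry attributes,
        ' and the lazy .*? stops at the first matching closing tag
        Dim relaxed As String = "<(?<a>\w+)[^>]*>(?<text>.*?)</\k<a>>"

        Console.WriteLine(Regex.IsMatch(simpleLink, original))
        Console.WriteLine(Regex.IsMatch(simpleLink, relaxed))

        ' Show the tag name captured by the backreferenced group and the enclosed text
        Dim m As Match = Regex.Match(simpleLink, relaxed)
        Console.WriteLine(m.Groups("a").Value & " -> " & m.Groups("text").Value)
    End Sub
End Module
```

The first check prints False because the space and the href attribute after "<a" cannot be matched by \w*, while the second prints True and the last line shows the captured tag name and text.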
