VB .NET 2008 HTML Source Regex

I have the following HTML fragment and need to extract the text inside the <p> tag. How can I use a regular expression to pick up "text here" and save it to a string?
<div id="abc" class="container container-a">
<div class="desc">
<p>text here</p>
</div>

Asked by: disrupt

käµfm³d 👽Commented:
Try the following. It should handle a plain <p>...</p>, a <p> tag with attributes (for example <p class="desc">...</p>), and either form with other tags nested inside it. The one case it does not handle is a <p> nested inside another <p>.
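Imports System.Text.RegularExpressions

...

' The lookbehind matches the opening <p> tag (with or without attributes);
' the body then consumes everything up to the first closing </p>.
Dim m As Match = Regex.Match(source_text, "(?<=<p(?:>| [^>]*>))(?:[^<]|<(?!/p>))*(?=</p>)")
Dim text As String = String.Empty

' text stays empty when no <p>...</p> is found.
If m.Success Then
    text = m.Value
End If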
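
To see the pattern work against the fragment from the question, here is a minimal self-contained sketch. The module name ParagraphDemo and the helper FirstParagraphText are made up for illustration and are not part of the original answer; the pattern itself is unchanged.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ParagraphDemo

    ' Same pattern as above: the text between <p ...> and the first </p>.
    Const ParagraphPattern As String = "(?<=<p(?:>| [^>]*>))(?:[^<]|<(?!/p>))*(?=</p>)"

    ' Hypothetical helper (an assumption, not from the thread): returns the
    ' first paragraph's text, or an empty string when nothing matches.
    Function FirstParagraphText(ByVal html As String) As String
        Dim m As Match = Regex.Match(html, ParagraphPattern)
        If m.Success Then
            Return m.Value
        End If
        Return String.Empty
    End Function

    Sub Main()
        ' The fragment from the question, joined line by line.
        Dim source_text As String = _
            "<div id=""abc"" class=""container container-a"">" & Environment.NewLine & _
            "<div class=""desc"">" & Environment.NewLine & _
            "<p>text here</p>" & Environment.NewLine & _
            "</div>"

        ' Prints: text here
        Console.WriteLine(FirstParagraphText(source_text))
    End Sub

End Module
```

Returning String.Empty on a failed match is just one choice; returning Nothing would let the caller tell "no <p> found" apart from an empty paragraph.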

käµfm³d 👽Commented:
If the input will only ever contain a plain <p>...</p>, with no attributes on the tag and no nested tags, you can simplify the pattern to:
Imports System.Text.RegularExpressions

...

' Matches only text that sits directly between <p> and </p>.
Dim m As Match = Regex.Match(source_text, "(?<=<p>)[^<]*(?=</p>)")
Dim text As String = String.Empty

If m.Success Then
    text = m.Value
End If

disrupt (Author) Commented:
If I have multiple <p> tags, how could I handle that?

disrupt (Author) Commented:
For example, if I wanted to loop through all the <p> tags. I tried using a MatchCollection but had no luck :/

käµfm³d 👽Commented:
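"If I have multiple <p> tags, how could I handle that?"
Do you mean multiple or nested?

If you do mean "multiple," then MatchCollection is what you need. Regex.Matches returns every non-overlapping match, and you can loop over the result with For Each.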

Imports System.Text.RegularExpressions

Module Module1

    Sub Main()
        Dim source_text As String = "<p>The quick red fox jumps over the lazy dog.</p> What a lazy dog I thought. <p class=""par2"">The lazy dog did not move all day.</p>"

        ' The (?!/?p>) lookahead stops the body at any <p> or </p>,
        ' so a match never runs from one paragraph into the next.
        Dim matches As MatchCollection = Regex.Matches(source_text, "(?<=<p(?:>| [^>]*>))(?:[^<]|<(?!/?p>))*(?=</p>)")

        ' Print the inner text of each <p> element.
        For Each m As Match In matches
            Console.WriteLine(m.Value)
        Next

        ' Keep the console window open until a key is pressed.
        Console.ReadKey()
    End Sub

End Module


Output
The quick red fox jumps over the lazy dog.
The lazy dog did not move all day.
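
If the goal is to keep the paragraphs in strings instead of printing them, the same loop can fill a list. This is only a sketch under assumptions: the module name ParagraphList and the helper GetParagraphTexts are hypothetical names introduced here, and the sample input is invented for the demonstration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module ParagraphList

    ' Hypothetical helper (an assumption, not from the thread): collects the
    ' inner text of every <p> element matched by the pattern used above.
    Function GetParagraphTexts(ByVal html As String) As List(Of String)
        Dim results As New List(Of String)
        Dim pattern As String = "(?<=<p(?:>| [^>]*>))(?:[^<]|<(?!/?p>))*(?=</p>)"

        ' Regex.Matches yields each non-overlapping match in document order.
        For Each m As Match In Regex.Matches(html, pattern)
            results.Add(m.Value)
        Next

        Return results
    End Function

    Sub Main()
        ' Invented sample input with one plain <p> and one <p> with attributes.
        Dim source_text As String = "<p>first</p> in between <p class=""par2"">second</p>"
        Dim paragraphs As List(Of String) = GetParagraphTexts(source_text)

        ' Prints 2, then "first" and "second" on separate lines.
        Console.WriteLine(paragraphs.Count)
        For Each p As String In paragraphs
            Console.WriteLine(p)
        Next
    End Sub

End Module
```

An empty list comes back when nothing matches, so the caller can check Count before using the results.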
