  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 547

Regular expression in VB.NET

Hi,

I need to design a regular expression for my program.
The program retrieves data from a web page, which works :D

But of course only part of the source is needed.

Let's say this is part of the whole source:
<div class="lijst"><div class="kop"><h3>Zoekresultaat</h3></div><div class="oms"><ul>
<li><a href="/uwzaakleiden/graydon_zoek.jsp?btw=855104785">BOLLAERT MAURICE</a>
<div>
Ondernemingsnummer: BE0855104785
<br>Adres: LAARNESTEENWEG 102 - 9230 WETTEREN
<br>Juridische vorm: Eénmanszaak
</div>
</li></ul></div></div>

I would need:

BOLLAERT MAURICE
Ondernemingsnummer: BE0855104785
Adres: LAARNESTEENWEG 102 - 9230 WETTEREN
Juridische vorm: Eénmanszaak

So, as you can see, everything without the <br> tags. Also, as you might have noticed, it's a list, so I only need the results from the first <li>.

Can anyone help me with this?
Asked by: Mutsop
 
Didier Vally (Systems Engineer and Finance Analyst) commented:
If the HTML format is always the same, you can use plain string parsing (which I have used successfully in many projects).

A very good online regex builder is available at:
http://myregexp.com/
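
A rough sketch of what the string-parsing route could look like, assuming the markup always keeps the shape shown in the question. The `Between` helper and the module name are made up for illustration and are not part of any library; only `String.IndexOf` and `String.Substring` come from the framework.

```vbnet
Imports System

Module StringParsingSketch
    ' Hypothetical helper: returns the text between the first startTag and the
    ' next endTag after it, or Nothing when either marker is missing.
    Function Between(ByVal text As String, ByVal startTag As String, ByVal endTag As String) As String
        Dim startIndex As Integer = text.IndexOf(startTag, StringComparison.OrdinalIgnoreCase)
        If startIndex < 0 Then Return Nothing
        startIndex += startTag.Length
        Dim endIndex As Integer = text.IndexOf(endTag, startIndex, StringComparison.OrdinalIgnoreCase)
        If endIndex < 0 Then Return Nothing
        Return text.Substring(startIndex, endIndex - startIndex)
    End Function

    Sub Main()
        Dim html As String = "<ul><li><a href=""/uwzaakleiden/graydon_zoek.jsp?btw=855104785"">BOLLAERT MAURICE</a>" &
                             "<div>Ondernemingsnummer: BE0855104785<br>Adres: LAARNESTEENWEG 102 - 9230 WETTEREN" &
                             "<br>Juridische vorm: Eénmanszaak</div></li></ul>"

        ' Cut out the first list item, then pick the fields out of it.
        Dim firstItem As String = Between(html, "<li>", "</li>")
        If firstItem Is Nothing Then Return

        Console.WriteLine(Between(firstItem, """>", "</a>"))
        Console.WriteLine("Ondernemingsnummer: " & Between(firstItem, "Ondernemingsnummer:", "<br>").Trim())
        Console.WriteLine("Adres: " & Between(firstItem, "Adres:", "<br>").Trim())
        Console.WriteLine("Juridische vorm: " & Between(firstItem, "Juridische vorm:", "</div>").Trim())
    End Sub
End Module
```

This only holds up while the page layout stays the same; if the site changes its markup, the markers have to be adjusted.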
 
Mutsop (Author) commented:
Well, I'm no expert in regular expressions, and I hoped someone could help me with the actual code.
Or at least give me a start on how to get the text between the li tags.
 
tolgaong commented:
' Requires "Imports System.Text.RegularExpressions" at the top of the file.
Try
    ' Groups: 1 = name, 2 = company number, 3 = address, 4 = legal form.
    Dim RegexObj As New Regex("<li>\s*<a href=""[^""]*"">([^<]+)</a>\s*<div>\s*Ondernemingsnummer:\s*([^\s<]+)\s*<br>Adres:\s*([^\r\n<]+)\s*<br>Juridische vorm:\s*([^\r\n<]+)\s*</div>", RegexOptions.IgnoreCase)
    Dim MatchResults As Match = RegexObj.Match(htmlFile)
    ' Loops over every matching <li>; add "Exit While" after the output to keep only the first one.
    While MatchResults.Success
        Console.WriteLine(MatchResults.Groups(1))
        Console.WriteLine("Ondernemingsnummer: {0}", MatchResults.Groups(2))
        Console.WriteLine("Adres: {0}", MatchResults.Groups(3))
        Console.WriteLine("Juridische vorm: {0}", MatchResults.Groups(4))
        MatchResults = MatchResults.NextMatch()
    End While
Catch ex As ArgumentException
    ' Syntax error in the regular expression
End Try
 
käµfm³d 👽 commented:
Try this:
Dim source As String = "<div class=""lijst""><div class=""kop""><h3>Zoekresultaat</h3></div><div class=""oms""><ul>" & _
                        "<li><a href=""/uwzaakleiden/graydon_zoek.jsp?btw=855104785"">BOLLAERT MAURICE</a>" & _
                        "<div>" & _
                        "Ondernemingsnummer: BE0855104785" & _
                        "<br>Adres: LAARNESTEENWEG 102 - 9230 WETTEREN" & _
                        "<br>Juridische vorm: Eénmanszaak" & _
                        "</div>" & _
                        "</li></ul></div></div>"
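' Requires "Imports System.Text.RegularExpressions" at the top of the file.
' Matches every run of text that sits between tags, from the first <li> up to the last </li>.
Dim matches As MatchCollection = Regex.Matches(source, "(?<=(?s)<li>.*?>)[^<]+(?=(?s)<.*?</li>)")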

For Each item As Match In matches
    If item.Value.Trim().Length > 0 Then
        MessageBox.Show(item.Value.Trim())
    End If
Next
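
One thing to keep in mind with the pattern above: on a page with several results it returns the text of every <li>, while the question asks for the first one only. A possible way around that, sketched under the assumption that list items are never nested, is to cut out the first <li>...</li> and then split its contents on tags. The function and module names here are made up for illustration; `Regex.Match` and `Regex.Split` are the standard .NET calls.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module FirstListItemSketch
    ' Hypothetical helper: returns the non-empty text fragments of the first <li>...</li>.
    Function FirstListItemText(ByVal html As String) As List(Of String)
        Dim result As New List(Of String)()

        ' Lazy ".*?" stops at the first closing </li>; Singleline lets "." cross line breaks.
        Dim firstItem As Match = Regex.Match(html, "<li>(.*?)</li>",
                                             RegexOptions.Singleline Or RegexOptions.IgnoreCase)
        If Not firstItem.Success Then Return result

        ' Splitting on anything that looks like a tag leaves only the text in between.
        For Each piece As String In Regex.Split(firstItem.Groups(1).Value, "<[^>]+>")
            Dim text As String = piece.Trim()
            If text.Length > 0 Then result.Add(text)
        Next
        Return result
    End Function

    Sub Main()
        Dim html As String = "<ul><li><a href=""/a"">FIRST</a><div>Nr: 1<br>Adres: X</div></li>" &
                             "<li><a href=""/b"">SECOND</a></li></ul>"
        ' Prints the three fragments of the first item; the second item is ignored.
        For Each line As String In FirstListItemText(html)
            Console.WriteLine(line)
        Next
    End Sub
End Module
```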

 
Mutsop (Author) commented:
That works amazingly well! :D Thanks a lot.
I totally forgot that to get a quote inside a string you have to write two of them in a row :D
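
For anyone else who trips over the same thing, a minimal illustration of the doubled-quote rule in VB.NET string literals. The module and constant names are made up for the example.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module QuoteEscapingSketch
    ' Inside a VB.NET string literal, "" stands for a single " character,
    ' so this constant holds the regex  <a href="[^"]*">
    Public Const HrefPattern As String = "<a href=""[^""]*"">"

    Sub Main()
        Dim tag As String = "<div class=""lijst"">"
        Console.WriteLine(tag)           ' prints <div class="lijst">
        Console.WriteLine(HrefPattern)   ' prints <a href="[^"]*">
        Console.WriteLine(Regex.IsMatch("<a href=""/x"">link</a>", HrefPattern))
    End Sub
End Module
```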
 
Mutsop (Author) commented:
@tolgaong

Sorry, I gave the points to the wrong person... I've asked a moderator to change it!
 
tolgaong commented:
You solved your problem; that is the main point. And the moderator will solve mine. :)
Thanks, Mutsop
 
Mutsop (Author) commented:
Thanks yet again, and sorry for the inconvenience :)