Regex expression in VB.NET

Hi,

I need to design a regular expression for my program.
The program retrieves data from a webpage, which works :D

But of course only part of the source code is needed.

Let's say this is a part of the whole source:
<div class="lijst"><div class="kop"><h3>Zoekresultaat</h3></div><div class="oms"><ul>
<li><a href="/uwzaakleiden/graydon_zoek.jsp?btw=855104785">BOLLAERT MAURICE</a>
<div>
Ondernemingsnummer: BE0855104785
<br>Adres: LAARNESTEENWEG 102 - 9230 WETTEREN
<br>Juridische vorm: Eénmanszaak
</div>
</li></ul></div></div>


I would need:

BOLLAERT MAURICE
Ondernemingsnummer: BE0855104785
Adres: LAARNESTEENWEG 102 - 9230 WETTEREN
Juridische vorm: Eénmanszaak

So as you noticed, all without the <br> tags. Also, as you might have noticed, it's a list, so I just need the results from the first <li>.

Can anyone help me with this?
Asked by: Mutsop
 
Didier Vally (Systems Engineer and Finance Analyst) commented:
If the HTML format is always the same, you can use plain string parsing, which I have used successfully in many projects.

A very good online regex builder is located at:
http://myregexp.com/
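
To illustrate the string-parsing suggestion, here is a minimal sketch using only IndexOf and Substring. It assumes the page always has the layout shown in the question (an <a> followed by a <div> whose lines are separated by <br>). The module and function names (StringParsingSketch, Between, ParseFirstEntry) are made up for this example and do not come from the thread.

```vbnet
Imports System
Imports System.Collections.Generic

Module StringParsingSketch
    ' Returns the text between the first startTag and the next endTag,
    ' or Nothing when either marker is missing.
    Function Between(text As String, startTag As String, endTag As String) As String
        Dim s As Integer = text.IndexOf(startTag, StringComparison.OrdinalIgnoreCase)
        If s < 0 Then Return Nothing
        s += startTag.Length
        Dim e As Integer = text.IndexOf(endTag, s, StringComparison.OrdinalIgnoreCase)
        If e < 0 Then Return Nothing
        Return text.Substring(s, e - s)
    End Function

    ' Extracts the name and detail lines of the first <li> entry.
    ' Assumption: the markup looks like the sample in the question.
    Function ParseFirstEntry(html As String) As String()
        Dim result As New List(Of String)
        Dim li As String = Between(html, "<li>", "</li>")
        If li Is Nothing Then Return result.ToArray()
        ' The name sits between the end of the href attribute ("">) and </a>.
        Dim name As String = Between(li, """>", "</a>")
        Dim details As String = Between(li, "<div>", "</div>")
        If name Is Nothing OrElse details Is Nothing Then Return result.ToArray()
        result.Add(name.Trim())
        For Each part As String In details.Split(New String() {"<br>"}, StringSplitOptions.RemoveEmptyEntries)
            Dim line As String = part.Trim()
            If line.Length > 0 Then result.Add(line)
        Next
        Return result.ToArray()
    End Function

    Sub Main()
        Dim html As String = "<ul><li><a href=""/uwzaakleiden/graydon_zoek.jsp?btw=855104785"">BOLLAERT MAURICE</a>" & _
                             "<div>Ondernemingsnummer: BE0855104785<br>Adres: LAARNESTEENWEG 102 - 9230 WETTEREN" & _
                             "<br>Juridische vorm: Eénmanszaak</div></li></ul>"
        For Each line As String In ParseFirstEntry(html)
            Console.WriteLine(line)
        Next
    End Sub
End Module
```

This approach is brittle if the site changes its markup, which is why it only makes sense when the format really is always the same.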
 
Mutsop (Author) commented:
Well, I'm no expert in regular expressions, and I hoped someone could help me with the actual code.
Or at least give me a start on how to get what's between the li tags.
 
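tolgaong commented:
' Requires: Imports System.Text.RegularExpressions
' htmlFile is the String holding the downloaded page source.
Try
    ' Groups: 1 = name, 2 = company number, 3 = address, 4 = legal form.
    ' Lazy ".*?" and "<"-excluding classes keep a match from running into the next tag or entry.
    Dim RegexObj As New Regex("<li>.*?<a href=""[^""]*"">([^<]+)</a>\s*<div>\s*Ondernemingsnummer:\s*([^\s<]+)\s*<br>Adres:\s*([^\r\n<]+)\s*<br>Juridische vorm:\s*([^\r\n<]*)\s*</div>", RegexOptions.IgnoreCase Or RegexOptions.Multiline)
    Dim MatchResults As Match = RegexObj.Match(htmlFile)
    ' Loops over every matching <li>; replace While with If to keep only the first one.
    While MatchResults.Success
        Console.WriteLine(MatchResults.Groups(1))
        Console.WriteLine("Ondernemingsnummer: {0}", MatchResults.Groups(2))
        Console.WriteLine("Adres: {0}", MatchResults.Groups(3))
        Console.WriteLine("Juridische vorm: {0}", MatchResults.Groups(4))
        MatchResults = MatchResults.NextMatch()
    End While
Catch ex As ArgumentException
    'Syntax error in the regular expression
End Try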
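
A side note on the RegexOptions.Multiline flag passed above: in .NET it only changes how ^ and $ behave, and it is RegexOptions.Singleline that lets "." match a line break. The pattern above copes with the line breaks in the page through \s*, not through the flag. The following is a small sketch to show the difference. The module and function names are hypothetical.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module RegexOptionsSketch
    ' Tests whether "<li>.*</li>" can match when a line feed sits between the tags.
    Function DotCrossesLines(options As RegexOptions) As Boolean
        Dim input As String = "<li>" & vbLf & "</li>"
        Return Regex.IsMatch(input, "<li>.*</li>", options)
    End Function

    Sub Main()
        Console.WriteLine(DotCrossesLines(RegexOptions.Multiline))   ' False: "." stops at the line feed
        Console.WriteLine(DotCrossesLines(RegexOptions.Singleline))  ' True: "." also matches the line feed
    End Sub
End Module
```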
 
käµfm³d 👽 commented:
Try this:
' Requires: Imports System.Text.RegularExpressions (and System.Windows.Forms for MessageBox)
Dim source As String = "<div class=""lijst""><div class=""kop""><h3>Zoekresultaat</h3></div><div class=""oms""><ul>" & _
                        "<li><a href=""/uwzaakleiden/graydon_zoek.jsp?btw=855104785"">BOLLAERT MAURICE</a>" & _
                        "<div>" & _
                        "Ondernemingsnummer: BE0855104785" & _
                        "<br>Adres: LAARNESTEENWEG 102 - 9230 WETTEREN" & _
                        "<br>Juridische vorm: Eénmanszaak" & _
                        "</div>" & _
                        "</li></ul></div></div>"
' Matches each run of text between tags that lies after an <li> and before a </li>.
Dim matches As MatchCollection = System.Text.RegularExpressions.Regex.Matches(source, "(?<=(?s)<li>.*?>)[^<]+(?=(?s)<.*?</li>)")

' Show every non-empty text fragment.
For Each item As Match In matches
    If item.Value.Trim().Length > 0 Then
        MessageBox.Show(item.Value.Trim())
    End If
Next
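
The pattern above returns the text fragments of every <li> on the page. Since the question asks for the first entry only, one possible variation (a sketch, not part of the original answer) is to isolate the first <li>...</li> and then split its content on tags. It assumes list items are not nested. FirstListItemSketch, FirstItemLines and the "SECOND ENTRY" sample data are made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module FirstListItemSketch
    ' Returns the trimmed text fragments found inside the first <li>...</li> only.
    Function FirstItemLines(html As String) As List(Of String)
        Dim lines As New List(Of String)
        ' Singleline lets "." cross line breaks; the lazy ".*?" stops at the first </li>.
        Dim li As Match = Regex.Match(html, "<li>(.*?)</li>", RegexOptions.Singleline Or RegexOptions.IgnoreCase)
        If Not li.Success Then Return lines
        ' Splitting on tags leaves only the text that sits between them.
        For Each fragment As String In Regex.Split(li.Groups(1).Value, "<[^>]+>")
            Dim text As String = fragment.Trim()
            If text.Length > 0 Then lines.Add(text)
        Next
        Return lines
    End Function

    Sub Main()
        Dim html As String = "<ul><li><a href=""/a"">BOLLAERT MAURICE</a><div>Ondernemingsnummer: BE0855104785" & _
                             "<br>Adres: LAARNESTEENWEG 102 - 9230 WETTEREN</div></li>" & _
                             "<li><a href=""/b"">SECOND ENTRY</a><div>not wanted</div></li></ul>"
        For Each line As String In FirstItemLines(html)
            Console.WriteLine(line)
        Next
    End Sub
End Module
```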
 
Mutsop (Author) commented:
That works amazingly well!!!! :D Thanks a lot.
I totally forgot that to get a quote inside a string you have to write two of them in a row :D
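
For readers who hit the same snag, here is a tiny sketch of the quote rule mentioned here: inside a VB.NET string literal, a double quote is written as two double quotes. The module and function names are hypothetical.

```vbnet
Imports System

Module QuoteEscapingSketch
    ' Builds a tag that contains double quotes by doubling them in the literal.
    Function QuotedTag() As String
        Return "<div class=""lijst"">"
    End Function

    Sub Main()
        Console.WriteLine(QuotedTag())   ' prints <div class="lijst">
    End Sub
End Module
```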
 
Mutsop (Author) commented:
@tolgaong

Sorry, I gave the points to the wrong person... I asked a moderator to change it!
 
tolgaong commented:
You solved your problem; that is the main point. And the moderator will solve mine. :)
Thanks, Mutsop
 
Mutsop (Author) commented:
Thanks yet again, and sorry for the inconvenience :)
Question has a verified solution.