Solved

vb.net 2008 html parsing and listbox

Posted on 2009-05-04
Last Modified: 2013-11-26
I'm fairly new to programming. What I want my program to do is visit a webpage that has a bunch of links containing usernames, pull the usernames from the page, and add them to a list box.
Here's a sample of the HTML:

<table style="margin-left:auto;margin-right:auto;"><tr><td style="font-size:8pt;text-align:center;" class="color">
<div><b><a href="http://www.website.com/username1">
chronic</a>
</b></div>

<a href="http://www.website.com/username1">
    <img class="pic1" onmouseover="showInfo5('chronic', '', '');this.className='pic2';" onmouseout="this.className='pic1';return nd();" src="http://www.website.com/file/pic/user/chronic_75.jpg" alt="" height="75" width="56" />
</a>
</td>

</tr>
</table></td>
<td style="text-align:center;vertical-align;middle;">
<table style="margin-left:auto;margin-right:auto;"><tr><td style="font-size:8pt;text-align:center;" class="color">
<div><b><a href="http://www.website.com/username2">
dantheman2108</a>
</b></div>

I would like it to pull the text that follows <a href="http://www.website.com/ and put each username into a listbox. Any ideas or suggestions on how I would go about this? I've been using the WebBrowser control.

Question by:j0eh4x
13 Comments
 
LVL 15

Accepted Solution

by:
oobayly earned 500 total points
Use the HttpWebRequest class to get the HTML of the page, then use a Regex to extract the usernames:
    ' The named group "Username" captures everything up to the closing quote
    Dim re As New Regex("<a href=""http://www.website.com/(?<Username>.+?)""")
    Dim usernames As New List(Of String)()
    For Each m As Match In re.Matches(htmlText)
      usernames.Add(m.Groups("Username"))
    Next


Author Comment

by:j0eh4x
I'm having trouble with this code.

Under For it says "Statement cannot appear outside of a method body", and under usernames it says "Declaration expected".

    Dim re As New Regex("<a href""http://www.website.com/(<?Usernamer/+?)""")
    For Each m As Match In re.Matches(htmlText)
      usernames.Add(m.Groups("Username"))
    Next

LVL 15

Expert Comment

by:oobayly
As the compile error suggests, you need to place the For loop in a method or event handler. Also, you haven't declared the List to be populated. Finally, you misspelt Username in the Regex.
Private Function PopulateUsernames(ByVal htmlText As String) As List(Of String)
  Dim re As New Regex("<a href=""http://www.website.com/(?<Username>.+?)""")
  Dim usernames As New List(Of String)()
  For Each m As Match In re.Matches(htmlText)
    usernames.Add(m.Groups("Username"))
  Next
  Return usernames
End Function


Author Comment

by:j0eh4x
m.Groups("Username") gives the error "Value of type 'System.Text.RegularExpressions.Group' cannot be converted to 'String'".
Imports System.Net
Imports System.Text.RegularExpressions

Public Class Form1
    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        Dim myReq As HttpWebRequest = _
         WebRequest.Create("http://www.website.com/")
    End Sub

    Private Function PopulateUsernames(ByVal htmlText As String) As List(Of String)
        Dim re As New Regex("<a href""http://www.website.com/(<?Username>/+?)""")
        Dim usernames As New List(Of String)()
        For Each m As Match In re.Matches(htmlText)
            usernames.Add(m.Groups("Username"))
        Next
        Return usernames
    End Function
End Class

LVL 15

Expert Comment

by:oobayly
Sorry about that last error. I forgot that it should use the Value property of the group:

usernames.Add(m.Groups("Username").Value)
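
To illustrate why the compiler complained: Match.Groups("Username") returns a Group object, and the captured text lives in its Value property. A minimal console sketch follows; the input string, the pattern and the module name are made up purely for illustration and are not taken from the site in question.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module GroupValueDemo
    Sub Main()
        ' Hypothetical input and pattern, only to show Group versus Group.Value.
        Dim m As Match = Regex.Match("user/chronic", "user/(?<Username>\w+)")

        If m.Success Then
            Dim g As Group = m.Groups("Username")   ' a Group object, not a String
            Dim name As String = g.Value            ' the text the group captured
            Console.WriteLine(name)                 ' prints chronic
        End If
    End Sub
End Module
```

Note that asking for a named group on a match that did not capture it does not throw; the returned Group simply has Success = False and an empty Value.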


Author Comment

by:j0eh4x
How would I add it to the listbox?
LVL 15

Expert Comment

by:oobayly
Assuming your listbox is called listBox1:
' Inside the loop
listBox1.Items.Add(m.Groups("Username").Value)


Author Comment

by:j0eh4x
oobayly, I greatly appreciate your help. I have one last dumb question.
Attached is the final program. In previous programs I've put my code in buttons. How do I trigger this Private Function PopulateUsernames?
Imports System.Net
Imports System.Text.RegularExpressions

Public Class Form1
    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        Dim myReq As HttpWebRequest = _
         WebRequest.Create("http://www.website.com/browse/")
    End Sub

    Private Function PopulateUsernames(ByVal htmlText As String) As List(Of String)
        Dim re As New Regex("<a href""http://www.website.com/(<?Username>/+?)""")
        Dim usernames As New List(Of String)()
        For Each m As Match In re.Matches(htmlText)
            usernames.Add(m.Groups("Username").Value)
            ' Inside the loop
            ListBox.Items.Add(m.Groups("Username").Value)
        Next
        Return usernames
    End Function
End Class

LVL 15

Expert Comment

by:oobayly
Instead of HttpWebRequest, use WebClient: download the HTML as a string and pass it to PopulateUsernames. The three lines below should go in the Load event handler:
Dim client As New WebClient()
Dim htmlText As String = client.DownloadString("http://www.website.com/")
PopulateUsernames(htmlText)
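
A slightly fuller sketch of the same idea, in case it helps. It assumes a ListBox named ListBox1 and builds the form in code only so that the example is self-contained; in a designer-based project the control would already exist. It fills the list from the value PopulateUsernames returns and catches WebException, which DownloadString throws when the request fails. DemoForm, Program and the stub body of PopulateUsernames are placeholders, not part of the code above.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Net
Imports System.Windows.Forms

' Sketch only: a form built in code so the example stands on its own.
Public Class DemoForm
    Inherits Form

    ' Assumed control name; in the designer this would be declared for you.
    Private ReadOnly ListBox1 As New ListBox()

    Public Sub New()
        ListBox1.Dock = DockStyle.Fill
        Controls.Add(ListBox1)
    End Sub

    Protected Overrides Sub OnLoad(ByVal e As EventArgs)
        MyBase.OnLoad(e)
        Try
            ' DownloadString blocks until the whole page has been received.
            Using client As New WebClient()
                Dim htmlText As String = client.DownloadString("http://www.website.com/")
                ' Use the list the function returns instead of discarding it.
                ListBox1.Items.AddRange(PopulateUsernames(htmlText).ToArray())
            End Using
        Catch ex As WebException
            MessageBox.Show("Download failed: " & ex.Message)
        End Try
    End Sub

    ' Placeholder: substitute the PopulateUsernames function from this thread.
    Private Function PopulateUsernames(ByVal htmlText As String) As List(Of String)
        Return New List(Of String)()
    End Function
End Class

Module Program
    <STAThread()> _
    Sub Main()
        Application.Run(New DemoForm())
    End Sub
End Module
```

Keeping the ListBox call out of the parsing function means the function can be tested on its own against a saved copy of the page.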


Author Comment

by:j0eh4x
I tried the following code but got nothing in the list box.

Imports System.Net
Imports System.Text.RegularExpressions

Public Class Form1
    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        Dim client As New WebClient()
        Dim htmlText As String = client.DownloadString("http://www.stlpunk.com/browse")
        PopulateUsernames(htmlText)
    End Sub

    Private Function PopulateUsernames(ByVal htmlText As String) As List(Of String)
        Dim re As New Regex("<a href""http://www.stlpunk.com/(<?Username>/+?)""")
        Dim usernames As New List(Of String)()
        For Each m As Match In re.Matches(htmlText)
            usernames.Add(m.Groups("Username").Value)
            ' Inside the loop
            ListBox.Items.Add(m.Groups("Username").Value)
        Next
        Return usernames
    End Function
End Class

LVL 15

Expert Comment

by:oobayly
There's nothing obviously wrong with the code, so all I can suggest is adding some breakpoints and verifying that the HTML returned is valid and that some matches are being returned.

Author Comment

by:j0eh4x
I used the following code to pause for a while after receiving the HTML, then set a textbox to the htmlText variable to make sure it's receiving the HTML OK. It's still not populating the list box, though.

Imports System.Net
Imports System.Text.RegularExpressions

Public Class Form1
    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        Dim client As New WebClient()
        Dim htmlText As String = client.DownloadString("http://www.stlpunk.com/browse/mode_recent/")
        Dim timeOut As DateTime = Now.AddMilliseconds(5000) ' pause for 5 seconds
        Do
            Application.DoEvents()
        Loop Until Now > timeOut
        TextBox1.Text = htmlText ' display html
        PopulateUsernames(htmlText)
    End Sub

    Private Function PopulateUsernames(ByVal htmlText As String) As List(Of String)
        Dim re As New Regex("<a href""http://www.stlpunk.com/(<?Username>/+?)""")
        Dim usernames As New List(Of String)()
        For Each m As Match In re.Matches(htmlText)
            usernames.Add(m.Groups("Username").Value)
            ' Inside the loop
            ListBox.Items.Add(m.Groups("Username").Value)
        Next
        Return usernames
    End Function
End Class

LVL 15

Expert Comment

by:oobayly
By breakpoint I mean a debugging breakpoint, so that you can inspect the HTML returned by the WebClient.
You don't need to block the Load event with the loop, as DownloadString blocks until it returns a string (or throws an exception).

http://msdn.microsoft.com/en-us/library/ktf38f66(VS.71).aspx
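
One thing worth checking at that breakpoint is the pattern itself. As posted in the Form1 code, it reads href"" with no = sign, writes the named group as (<?Username> rather than (?<Username>, and uses /+? (one or more slashes) where "any characters" was probably intended, so it may simply never match. Below is a minimal console sketch with those three points changed, run against anchors modelled on the sample HTML in the question. The module name, the choice of [^"/]+ for the capture, the escaped dots and the de-duplication step are my own assumptions, not something confirmed against the real page.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module PatternCheck
    ' Same shape as PopulateUsernames in the thread, minus the ListBox call.
    Function ExtractUsernames(ByVal htmlText As String) As List(Of String)
        ' Changes from the posted pattern: "=" after href, (?<Name>...) group
        ' syntax, and [^"/]+ (up to the next quote or slash) instead of /+?.
        Dim re As New Regex("<a href=""http://www\.website\.com/(?<Username>[^""/]+)""")
        Dim usernames As New List(Of String)()
        For Each m As Match In re.Matches(htmlText)
            Dim name As String = m.Groups("Username").Value
            ' Each profile is linked twice in the sample (text link and image link).
            If Not usernames.Contains(name) Then usernames.Add(name)
        Next
        Return usernames
    End Function

    Sub Main()
        ' Three anchors modelled on the sample HTML from the question.
        Dim sample As String = _
            "<div><b><a href=""http://www.website.com/username1"">chronic</a></b></div>" & _
            "<a href=""http://www.website.com/username1""><img alt="""" /></a>" & _
            "<div><b><a href=""http://www.website.com/username2"">dantheman2108</a></b></div>"

        ' Prints username1 and username2, one per line.
        For Each name As String In ExtractUsernames(sample)
            Console.WriteLine(name)
        Next
    End Sub
End Module
```

If the real page uses different quoting, extra attributes before href, or relative links, the pattern would need adjusting to match, which is exactly what inspecting htmlText in the debugger should reveal.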