Use VB to Search for HTML String and Return Values to text file

I need a VB app that searches an entered URL for a specific string in the HTML and writes the matching values to a text file.
JoseDavila Asked:
 
David Lee Commented:
Try this.  Without a working example of the page to test against, the best I can do is grab the entire text of any element containing a # in its text portion.  Unless you can assure me that every line with a # in it is going to be formatted exactly like the example you gave (<td bgcolor="red"> 4.793E+01 <B>#</B></td>), I can't reliably parse out the number portion.  When you change the URL below to your URL, do not include the http:// portion.

Private Sub cmdGo_Click()
    Dim objDocument As Object, _
        objElements As Object, _
        objItem As Object, _
        lngLineCount As Long
    WebBrowser1.Navigate "www.usatoday.com"
    While WebBrowser1.ReadyState <> READYSTATE_COMPLETE
        DoEvents
    Wend
    Open "C:\Output.Txt" For Output As #1
    Set objDocument = WebBrowser1.Document
    Set objElements = objDocument.All
    lngLineCount = 1
    For Each objItem In objElements
        If InStr(1, objItem.innertext, "#", vbTextCompare) > 0 Then
            Print #1, "Line" & lngLineCount & "=" & objItem.innertext
        End If
        lngLineCount = lngLineCount + 1
    Next
    Close #1
End Sub
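
If the cells really are always formatted like the example, the number could probably be pulled out with plain string functions. What follows is only a minimal sketch under that assumption: ExtractMarkedNumbers is a hypothetical helper name, not part of the code above, and it has not been run against the real page. It looks for each <B>#</B> marker, takes whatever sits between the preceding ">" and the marker, and keeps it only if VB considers it numeric.

```vb
' Hypothetical helper: returns the numbers that precede each "<B>#</B>"
' marker in strHtml, one per line. Assumes every marked cell looks like
' <td bgcolor="red"> 4.793E+01 <B>#</B></td>.
Public Function ExtractMarkedNumbers(ByVal strHtml As String) As String
    Const MARKER As String = "<B>#</B>"
    Dim lngMarker As Long   ' position of the current marker
    Dim lngTagEnd As Long   ' position of the ">" that closes the <td ...> tag
    Dim strValue As String
    Dim strResult As String

    ' vbTextCompare so that <b>#</b> and <B>#</B> both match.
    lngMarker = InStr(1, strHtml, MARKER, vbTextCompare)
    Do While lngMarker > 0
        ' The number sits between the previous ">" and the marker.
        lngTagEnd = InStrRev(strHtml, ">", lngMarker)
        strValue = Trim$(Mid$(strHtml, lngTagEnd + 1, lngMarker - lngTagEnd - 1))
        ' Skip anything that is not a number (Trim$ only strips spaces).
        If IsNumeric(strValue) Then
            strResult = strResult & strValue & vbCrLf
        End If
        lngMarker = InStr(lngMarker + Len(MARKER), strHtml, MARKER, vbTextCompare)
    Loop
    ExtractMarkedNumbers = strResult
End Function
```

One way to use it with the form above would be Print #1, ExtractMarkedNumbers(WebBrowser1.Document.body.innerHTML); in place of the For Each loop. Note that IsNumeric may depend on the regional decimal separator, so a value such as 4.793E+01 could be rejected on a machine configured to use a comma.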
0
 
Bob Lamberson (Software Engineer) Commented:
Hi JoseDavila,
Will the url be entered on a form? in a text box?

Bob
0
 
JoseDavila (Author) Commented:
The URL will be submitted in the form with a text box.
0
 
Bob Lamberson (Software Engineer) Commented:
JoseDavila,

Open a form in vb and add two textboxes and a command button.

Add this code to a form and run it.
Enter the search string in the second text box and click on search.
You will find a textfile.txt in c:\ that contains the found string.
Option Explicit


Private Sub cmdSearch_Click()
   If InStr(1, Text1.Text, Text2.Text) > 0 Then
      Open "C:\textfile.txt" For Output As #1
      Write #1, Text2.Text
      Close #1
   End If
End Sub

Private Sub Form_Load()
Text1.Text = "http://www.bluesquirrel.com/products/PopUpStopper/"
End Sub

Bob
0
 
JoseDavila (Author) Commented:
Thank you for the quick response, but I actually need Text2 to be compared against the HTML source code, not the URL.

0
 
anv Commented:
check this

Dim doc As HTMLDocument
Dim a_link As HTMLAnchorElement
Dim txt As String

    ' List the links.
    On Error Resume Next
    Set doc = WebBrowser1.Document   'Assuming the document is being displayed in the WebBrowser control placed on the form.


    For Each a_link In doc.links
        txt = txt & a_link.href & vbCrLf
    Next a_link

    txtLinks.Text = txt

0
 
Bob Lamberson (Software Engineer) Commented:
JoseDavila,
This object will bring back the html source of a page.

http://www.serverobjects.com/comp/asphttp3.htm

Bob
0
 
David Lee Commented:
The code below loads a page and then searches through it for text matching a given search string.  You don't mention what you want to do once a match is found, so I opted to pop up a dialog box indicating the item was found.  What I've provided here is a complete VB form (.frm) file.  To use it, copy and paste the code into Notepad and save the file with a .frm extension.  Then open the form in VB.  The web site I used is USA Today, so edit the URL and change it to the site you are interested in.  Run the program, type the text you want to search for into the textbox, and click Go.  The text search is not case sensitive, because the InStr call uses vbTextCompare.  On clicking Go the code will grab a copy of the HTML document and loop through the various elements on the page looking for the search text.

Hope this helps.


VERSION 5.00
Object = "{EAB22AC0-30C1-11CF-A7EB-0000C05BAE0B}#1.1#0"; "shdocvw.dll"
Begin VB.Form Form1
   Caption         =   "Form1"
   ClientHeight    =   3885
   ClientLeft      =   60
   ClientTop       =   450
   ClientWidth     =   7560
   LinkTopic       =   "Form1"
   ScaleHeight     =   3885
   ScaleWidth      =   7560
   StartUpPosition =   3  'Windows Default
   Begin VB.TextBox txtSearch
      Height          =   375
      Left            =   120
      TabIndex        =   2
      Top             =   2640
      Width           =   7335
   End
   Begin SHDocVwCtl.WebBrowser WebBrowser1
      Height          =   2295
      Left            =   120
      TabIndex        =   1
      Top             =   120
      Width           =   7335
      ExtentX         =   12938
      ExtentY         =   4048
      ViewMode        =   0
      Offline         =   0
      Silent          =   0
      RegisterAsBrowser=   0
      RegisterAsDropTarget=   1
      AutoArrange     =   0   'False
      NoClientEdge    =   0   'False
      AlignLeft       =   0   'False
      NoWebView       =   0   'False
      HideFileNames   =   0   'False
      SingleClick     =   0   'False
      SingleSelection =   0   'False
      NoFolders       =   0   'False
      Transparent     =   0   'False
      ViewID          =   "{0057D0E0-3573-11CF-AE69-08002B2E1262}"
      Location        =   ""
   End
   Begin VB.CommandButton cmdGo
      Caption         =   "Go"
      Height          =   495
      Left            =   120
      TabIndex        =   0
      Top             =   3240
      Width           =   1455
   End
End
Attribute VB_Name = "Form1"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = False
Private Sub cmdGo_Click()
    Dim objDocument As Object, _
        objElements As Object, _
        objItem As Object
    If txtSearch.Text <> "" Then
        WebBrowser1.Navigate "www.usatoday.com"
        While WebBrowser1.ReadyState <> READYSTATE_COMPLETE
            DoEvents
        Wend
        Set objDocument = WebBrowser1.Document
        Set objElements = objDocument.All
        For Each objItem In objElements
            If InStr(1, objItem.innerHTML, txtSearch.Text, vbTextCompare) > 0 Then
                MsgBox "Item found.", vbInformation, "Found It"
            End If
        Next
    End If
End Sub
0
 
Burbble Commented:
Instead of using a Web Browser control, you might prefer the Microsoft Internet Transfer control (Inet).

Private Sub Form_Load()
    Dim strHTML As String
    strHTML = Inet1.OpenURL("http://www.experts-exchange.com", icString)
    MsgBox strHTML
End Sub

-Burbble
0
 
JoseDavila (Author) Commented:
BlueDevilFan,

I would like the output to be in a textfile not a message box.
0
 
JoseDavila (Author) Commented:
I would like the search to look through the HTML source for # and write the matching entry to a text file. (Example:
<td bgcolor="red"> 4.793E+01 <B>#</B></td>)
0
 
David Lee Commented:
JoseDavila,

Can you be more specific about what you want returned?  Also, do you want to search for just the first # sign, or all pound signs on the page?  
0
 
JoseDavila (Author) Commented:
Search for all pound signs on the page and return the number in the following example (<td bgcolor="red"> 4.793E+01 <B>#</B></td>)
0
 
David Lee Commented:
Can you give me a URL for a page I can test against?  If not, can you copy and paste the HTML for a sample page here?
0
 
JoseDavila (Author) Commented:
That's all I am able to provide.
0
 
Burbble Commented:
>> I would like the output to be in a textfile not a message box.

I was just giving an alternative solution to using a Web Browser Control for retrieving a file from a URL -- the Microsoft Internet Transfer Control's .OpenURL method.

I don't entirely understand how you want to parse it, though...
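
For what it's worth, here is a rough sketch of how the OpenURL approach could be combined with the text-file output. It rests on several assumptions: an Internet Transfer control named Inet1 on the form, a Text1 textbox holding the full URL (including http://), a hypothetical button named cmdSearch, and the simplest possible rule of writing out every source line that contains a #. It has not been tested against the real page.

```vb
Private Sub cmdSearch_Click()
    Dim strHTML As String
    Dim varLines As Variant
    Dim strLine As String
    Dim i As Long
    Dim intFile As Integer

    ' Assumption: Text1 holds the full URL, including http://
    strHTML = Inet1.OpenURL(Text1.Text, icString)

    ' Open the file once, before the loop, so every match is kept.
    intFile = FreeFile
    Open "C:\output.txt" For Output As #intFile

    ' Walk the source line by line and keep the lines containing "#".
    varLines = Split(strHTML, vbLf)
    For i = LBound(varLines) To UBound(varLines)
        strLine = Replace(varLines(i), vbCr, "")
        If InStr(1, strLine, "#") > 0 Then
            Print #intFile, strLine
        End If
    Next i

    Close #intFile
End Sub
```

Bear in mind that a # also shows up in things like color codes (#FFFFFF) and anchor links, so a plain search for # will likely catch more than the table cells of interest.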

-Burbble
0
 
JoseDavila (Author) Commented:
So I need to load the web page that needs parsing and have my code find every place a # exists in the page.  (Example of HTML source from the table: <td bgcolor="red"> 4.793E+01 <B>#</B></td>)  Once the code has found a # character, I would like it to record the number (which in this example is 4.793E+01) and write the output to a text file.
0
 
JoseDavila (Author) Commented:
Does anybody have input on how to make this code work?  I need to know where the item was found in the HTML source and write the 10 characters to its left to a text file.

Private Sub cmdGo_Click()
    Dim objDocument As Object, _
        objElements As Object, _
        objItem As Object
    Dim myResults As String

    If txtSearch.Text <> "" Then
        WebBrowser1.Navigate "http://www.yahoo.com/"
            DoEvents
        Wend
        Set objDocument = WebBrowser1.Document
        Set objElements = objDocument.All
        For Each objItem In objElements
            If InStr(1, objItem.innerHTML, txtSearch.Text, vbTextCompare) > 0 Then
                myResults = 'Need Input as to where the item was found and record the text 10 character to left & vbClrf
                Open "C:\output.txt" For Output As #1
                Write #1, myResults
                Close #1
            End If
        Next
    End If
End Sub
0
 
David Lee Commented:
You've left out part of the code, JoseDavila.  The way it is now it probably won't work, because you're not letting the web page load before moving on to search for the text you want.  Also, the way you've set the code up to write the output to the text file, you're only going to get the last iteration, not every iteration, since opening the file For Output inside the loop overwrites it each time.  Hang on a minute and I'll modify my post to search for what you've asked for.  Note though that since I don't have an example to test with, I can't say for certain it'll work as you want.
0
 
anv Commented:
hi

for following

>> For Each objItem In objElements
>>            If InStr(1, objItem.innerHTML, txtSearch.Text, vbTextCompare) > 0 Then
>>                myResults = 'Need Input as to where the item was found and record the >>text 10 character to left & vbClrf
>>                Open "C:\output.txt" For Output As #1
>>                Write #1, myResults
>>                Close #1
>>            End If
>>        Next

try this

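Dim ind As Long, lngStart As Long
For Each objItem In objElements
    ind = InStr(1, objItem.innerHTML, txtSearch.Text, vbTextCompare)
    If ind > 0 Then
        ' Take up to 10 characters to the left of the match (fewer near the start).
        lngStart = IIf(ind > 10, ind - 10, 1)
        myResults = ind & " " & Mid(objItem.innerHTML, lngStart, ind - lngStart)
        ' Append so earlier matches are not overwritten (delete the file for a fresh run).
        Open "C:\output.txt" For Append As #1
        Print #1, myResults
        Close #1
    End If
Next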

0
Question has a verified solution.
