Solved

Use VB to Search for HTML String and Return Values to text file

Posted on 2004-10-10
Last Modified: 2010-05-02
I need a VB app that loads a URL the user enters, searches the page's HTML for a specific string, and writes the matching values to a text file.
Question by:JoseDavila
20 Comments
 
LVL 12

Expert Comment

by:BobLamberson
ID: 12273273
Hi JoseDavila,
Will the URL be entered on a form, in a text box?

Bob
 

Author Comment

by:JoseDavila
ID: 12273335
The URL will be entered in a text box on the form.
 
LVL 12

Expert Comment

by:BobLamberson
ID: 12273456
JoseDavila,

Open a form in VB and add two text boxes (Text1 and Text2) and a command button named cmdSearch.

Add this code to the form and run it.
Enter the search string in the second text box and click the button.
If the string is found, C:\textfile.txt will contain it.
Option Explicit


Private Sub cmdSearch_Click()
   If InStr(1, Text1.Text, Text2.Text) > 0 Then
      Open "C:\textfile.txt" For Output As #1
      Write #1, Text2.Text
      Close #1
   End If
End Sub

Private Sub Form_Load()
Text1.Text = "http://www.bluesquirrel.com/products/PopUpStopper/"
End Sub

Bob
 

Author Comment

by:JoseDavila
ID: 12274065
Thank you for the quick response, but I actually need Text2 compared against the page's HTML source, not the URL.

 
LVL 10

Expert Comment

by:anv
ID: 12275423
Check this. It needs a project reference to the Microsoft HTML Object Library.

Dim doc As HTMLDocument
Dim a_link As HTMLAnchorElement
Dim txt As String

    ' List the links.
    On Error Resume Next
    Set doc = WebBrowser1.Document   ' Assumes the page is displayed in a WebBrowser control placed on the form.

    For Each a_link In doc.links
        txt = txt & a_link.href & vbCrLf
    Next a_link

    txtLinks.Text = txt

 
LVL 12

Expert Comment

by:BobLamberson
ID: 12276029
JoseDavila,
This object will bring back the html source of a page.

http://www.serverobjects.com/comp/asphttp3.htm

Bob
 
LVL 76

Expert Comment

by:David Lee
ID: 12276047
The code below loads a page and then searches it for text matching a given search string.  You don't mention what you want to do once a match is found, so I opted to pop up a dialog box indicating the item was found.  What I've provided here is a complete VB form (.frm) file.  To use it, copy and paste the code into Notepad and save the file with a .frm extension.  Then open the form in VB.  The web site I used is USA Today, so edit the URL and change it to the site you are interested in.  Run the program, type the text you want to search for into the text box, and click Go.  The text search is not case sensitive, because the code passes vbTextCompare to InStr.  On clicking Go the code will grab a copy of the HTML document and loop through the various elements on the page looking for the search text.

Hope this helps.


VERSION 5.00
Object = "{EAB22AC0-30C1-11CF-A7EB-0000C05BAE0B}#1.1#0"; "shdocvw.dll"
Begin VB.Form Form1
   Caption         =   "Form1"
   ClientHeight    =   3885
   ClientLeft      =   60
   ClientTop       =   450
   ClientWidth     =   7560
   LinkTopic       =   "Form1"
   ScaleHeight     =   3885
   ScaleWidth      =   7560
   StartUpPosition =   3  'Windows Default
   Begin VB.TextBox txtSearch
      Height          =   375
      Left            =   120
      TabIndex        =   2
      Top             =   2640
      Width           =   7335
   End
   Begin SHDocVwCtl.WebBrowser WebBrowser1
      Height          =   2295
      Left            =   120
      TabIndex        =   1
      Top             =   120
      Width           =   7335
      ExtentX         =   12938
      ExtentY         =   4048
      ViewMode        =   0
      Offline         =   0
      Silent          =   0
      RegisterAsBrowser=   0
      RegisterAsDropTarget=   1
      AutoArrange     =   0   'False
      NoClientEdge    =   0   'False
      AlignLeft       =   0   'False
      NoWebView       =   0   'False
      HideFileNames   =   0   'False
      SingleClick     =   0   'False
      SingleSelection =   0   'False
      NoFolders       =   0   'False
      Transparent     =   0   'False
      ViewID          =   "{0057D0E0-3573-11CF-AE69-08002B2E1262}"
      Location        =   ""
   End
   Begin VB.CommandButton cmdGo
      Caption         =   "Go"
      Height          =   495
      Left            =   120
      TabIndex        =   0
      Top             =   3240
      Width           =   1455
   End
End
Attribute VB_Name = "Form1"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = False
Private Sub cmdGo_Click()
    Dim objDocument As Object, _
        objElements As Object, _
        objItem As Object
    If txtSearch.Text <> "" Then
        WebBrowser1.Navigate "www.usatoday.com"
        While WebBrowser1.ReadyState <> READYSTATE_COMPLETE
            DoEvents
        Wend
        Set objDocument = WebBrowser1.Document
        Set objElements = objDocument.All
        For Each objItem In objElements
            If InStr(1, objItem.innerHTML, txtSearch.Text, vbTextCompare) > 0 Then
                MsgBox "Item found.", vbInformation, "Found It"
            End If
        Next
    End If
End Sub
 
LVL 7

Expert Comment

by:Burbble
ID: 12277621
Instead of using a Web Browser control, you might prefer the Microsoft Internet Transfer control (Inet).

Private Sub Form_Load()
    Dim strHTML As String
    strHTML = Inet1.OpenURL("http://www.experts-exchange.com", icString)
    MsgBox strHTML
End Sub
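
A minimal sketch of how the string returned by OpenURL could be searched and the result written to a text file instead of a message box. The helper name SaveIfFound and its parameters are hypothetical, and the control names in the example call (Inet1, Text1, Text2) are assumptions about the form. Only standard VB6 calls (InStr, FreeFile, Open, Print #, Close) are used.

```vb
' Hypothetical helper: writes strSearch to strPath when it occurs in strHTML.
' Returns True if the text was found.
Public Function SaveIfFound(ByVal strHTML As String, ByVal strSearch As String, _
                            ByVal strPath As String) As Boolean
    Dim intFile As Integer

    ' InStr returns the start position for an empty search string, so reject it
    If Len(strSearch) = 0 Then Exit Function

    ' vbTextCompare makes the search case-insensitive
    If InStr(1, strHTML, strSearch, vbTextCompare) > 0 Then
        intFile = FreeFile                      ' next unused file number
        Open strPath For Output As #intFile     ' creates or overwrites the file
        Print #intFile, strSearch
        Close #intFile
        SaveIfFound = True
    End If
End Function

' Example call from a form holding Inet1, Text1 (URL) and Text2 (search text):
'     SaveIfFound Inet1.OpenURL(Text1.Text, icString), Text2.Text, "C:\textfile.txt"
```

Using FreeFile rather than a hard-coded #1 avoids a clash if another file is already open under that number.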

-Burbble
 

Author Comment

by:JoseDavila
ID: 12278215
BlueDevilFan,

I would like the output to be in a text file, not a message box.
 

Author Comment

by:JoseDavila
ID: 12278370
I would like the search to look through the HTML source for # and write the matching entry to a text file. (Example:
<td bgcolor="red"> 4.793E+01 <B>#</B></td>)
 
LVL 76

Expert Comment

by:David Lee
ID: 12278471
JoseDavila,

Can you be more specific about what you want returned?  Also, do you want to search for just the first # sign, or all pound signs on the page?  
 

Author Comment

by:JoseDavila
ID: 12278511
Search for all pound signs on the page and return the number in the following example (<td bgcolor="red"> 4.793E+01 <B>#</B></td>)
 
LVL 76

Expert Comment

by:David Lee
ID: 12278653
Can you give me a URL for a page I can test against?  If not, can you copy and paste the HTML for a sample page here?
 

Author Comment

by:JoseDavila
ID: 12278743
That's all I am able to provide.
 
LVL 7

Expert Comment

by:Burbble
ID: 12281634
>> I would like the output to be in a textfile not a message box.

I was just giving an alternative solution to using a Web Browser Control for retrieving a file from a URL -- the Microsoft Internet Transfer Control's .OpenURL method.

I don't entirely understand how you want to parse it, though...

-Burbble
 

Author Comment

by:JoseDavila
ID: 12282014
So I need to load the web page that needs parsing and have my code find every place a # appears in it.  (Example of HTML source from a table: <td bgcolor="red"> 4.793E+01 <B>#</B></td>)  Once the code has found a # character, I would like it to record the number (which in this example is 4.793E+01) and write the output to a text file.
 

Author Comment

by:JoseDavila
ID: 12282613
Does anybody have input on how to make this code work?  I need to find where the item occurs in the HTML source and write the 10 characters to its left to a text file.

Private Sub cmdGo_Click()
    Dim objDocument As Object, _
        objElements As Object, _
        objItem As Object
    Dim myResults As String

    If txtSearch.Text <> "" Then
        WebBrowser1.Navigate "http://www.yahoo.com/"
            DoEvents
        Wend
        Set objDocument = WebBrowser1.Document
        Set objElements = objDocument.All
        For Each objItem In objElements
            If InStr(1, objItem.innerHTML, txtSearch.Text, vbTextCompare) > 0 Then
                myResults = 'Need Input as to where the item was found and record the text 10 character to left & vbClrf
                Open "C:\output.txt" For Output As #1
                Write #1, myResults
                Close #1
            End If
        Next
    End If
End Sub
 
LVL 76

Expert Comment

by:David Lee
ID: 12282725
You've left out part of the code, JoseDavila.  The way it is now it probably won't work, because you're not letting the web page load before moving on to search for the text you want.  Also, the way you've set the code up to write the output to the text file, you're only going to get the last iteration, not every iteration.  Hang on a minute and I'll modify my post to search for what you've asked for.  Note though that since I don't have an example to test with, I can't say for certain it'll work as you want.
 
LVL 76

Accepted Solution

by:
David Lee earned 500 total points
ID: 12282829
Try this.  Without a working example of the page to test with, the best I can do is to grab the entire line of any line containing a # in the text portion.  Unless you can assure me that every line with a # in it is going to be formatted exactly like the example you gave (<td bgcolor="red"> 4.793E+01 <B>#</B></td>), I can't reliably parse out the number portion.  When you change the URL below to your URL do not include the http:// portion.

Private Sub cmdGo_Click()
    Dim objDocument As Object, _
        objElements As Object, _
        objItem As Object, _
        lngLineCount As Long
    WebBrowser1.Navigate "www.usatoday.com"
    While WebBrowser1.ReadyState <> READYSTATE_COMPLETE
        DoEvents
    Wend
    Open "C:\Output.Txt" For Output As #1
    Set objDocument = WebBrowser1.Document
    Set objElements = objDocument.All
    lngLineCount = 1
    For Each objItem In objElements
        If InStr(1, objItem.innertext, "#", vbTextCompare) > 0 Then
            Print #1, "Line" & lngLineCount & "=" & objItem.innertext
        End If
        lngLineCount = lngLineCount + 1
    Next
    Close #1
End Sub
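
If every flagged cell really is formatted like the example given, the number could be pulled out with plain string functions. What follows is only a sketch under that assumption: the function name ExtractFlaggedValues is hypothetical, it treats <B>#</B> as the marker, and it takes whatever text sits between the preceding ">" and the marker. It has not been tested against the real page.

```vb
' Hypothetical helper: returns the text that precedes each <B>#</B> marker,
' one value per line. Assumes cells look like
'     <td bgcolor="red"> 4.793E+01 <B>#</B></td>
Public Function ExtractFlaggedValues(ByVal strHTML As String) As String
    Const MARKER As String = "<B>#</B>"
    Dim lngMarker As Long
    Dim lngTagEnd As Long
    Dim strResult As String

    lngMarker = InStr(1, strHTML, MARKER, vbTextCompare)
    Do While lngMarker > 0
        ' Find the ">" that closes the tag just before the marker
        If lngMarker > 1 Then
            lngTagEnd = InStrRev(strHTML, ">", lngMarker - 1)
        Else
            lngTagEnd = 0
        End If
        ' Keep the text between that ">" and the marker, without padding
        strResult = strResult & _
            Trim$(Mid$(strHTML, lngTagEnd + 1, lngMarker - lngTagEnd - 1)) & vbCrLf
        ' Continue searching after this marker
        lngMarker = InStr(lngMarker + Len(MARKER), strHTML, MARKER, vbTextCompare)
    Loop

    ExtractFlaggedValues = strResult
End Function
```

One way to use it would be to pass in the whole page source, for example objDocument.documentElement.outerHTML, and Print the returned string to the output file once, rather than looping over every element.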
 
LVL 10

Expert Comment

by:anv
ID: 12283786
hi

for following

>> For Each objItem In objElements
>>            If InStr(1, objItem.innerHTML, txtSearch.Text, vbTextCompare) > 0 Then
>>                myResults = 'Need Input as to where the item was found and record the >>text 10 character to left & vbClrf
>>                Open "C:\output.txt" For Output As #1
>>                Write #1, myResults
>>                Close #1
>>            End If
>>        Next

try this

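Dim ind As Long, startPos As Long

For Each objItem In objElements
   ind = InStr(1, objItem.innerHTML, txtSearch.Text, vbTextCompare)
   If ind > 0 Then
      ' Mid raises an error if the start position is below 1
      startPos = ind - 10
      If startPos < 1 Then startPos = 1
      myResults = ind & " " & Mid(objItem.innerHTML, startPos, ind - startPos)
      ' Append keeps every match; Output would overwrite the file each time
      Open "C:\output.txt" For Append As #1
      Print #1, myResults
      Close #1
   End If
Next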
