WEB RIPPER

I am trying to make a web ripper in VB 6.0...what i need it to do is the following...

http://www.codeshq.com/showthread.php?threadid=6116

first rip that site..l.take all the information off of it...no need for pics...just the text...i dont care if it looks ugly or not...preferably take just the codes...but really it doesnt matter...if you cann make it take just the codes i give extra points...anyways...after that i need it to rip

http://www.codeshq.com/showthread.php?threadid=6117
then
http://www.codeshq.com/showthread.php?threadid=6118
then
http://www.codeshq.com/showthread.php?threadid=6119

and i need it to go all the way to
http://www.codeshq.com/showthread.php?threadid=6843

so basically just keep making the last number go up by one...do u understand...if not post a question...lots of points for this...

thanks in advanced...

Chris
DIDD0Asked:
Who is Participating?
 
aelatikCommented:
Paste the following code in a new project, hit run and see your folder grow with all the info you need. I something needs to be modified just ask...

Const FolderToSave As String = "c:\temp\"

Private Sub Form_Load()
    Dim I As Long
    For I = 6117 To 6843
        FetchPage I
    Next
End Sub

Private Function FetchPage(ID As Long)
    Dim IE As Object: Dim Content As String
    Set IE = CreateObject("internetexplorer.application")
        IE.Visible = False ' True if you wish to see it all happen
        IE.navigate "http://www.codeshq.com/showthread.php?threadid=" & ID
  While IE.busy: DoEvents: Wend
        ' There are two methods to retrieve document info,
        ' innerHTML which only show the displayed text and
        ' outerHTML which show the entire HTML code behind the page.
        Content = IE.document.documentelement.innertext
        ' Content = IE.document.documentelement.outerhtml
        IE.quit
       
        Dim POS1 As Long: Dim POS2 As Long: Dim TITLE As String
        POS1 = InStr(1, Content, "Game Genie codes >", vbTextCompare) + Len("Game Genie codes >") + 1
        POS2 = InStr(POS1, Content, vbCrLf)
        TITLE = Mid(Content, POS1, POS2 - POS1)
        POS1 = InStr(1, Content, "Posts:", vbTextCompare) + 5
        POS1 = InStr(POS1, Content, vbCrLf) + 2
        POS2 = InStr(POS1, Content, "__________________")
        Content = Mid(Content, POS1, POS2 - POS1)
   
        Open FolderToSave & TITLE & ".txt" For Output As #1
            Print #1, Content
        Close #1
        DoEvents
End Function
0
 
DIDD0Author Commented:
if possible make a HTML page with the name of the game so people can click the game and see the codes!
0
 
DIDD0Author Commented:
I get an error...

Runtime error

sometimes i get this one:
Content = Mid(Content, POS1, POS2 - POS1)

and most of the time i get an error on this

  While IE.busy: DoEvents: Wend

any help??? thanks
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

 
aelatikCommented:
How many did it fetch before the error ? Take a look at the last file the view the results
0
 
aelatikCommented:
This part of the code is ment to extract the codes from the page, try without it :

       Dim POS1 As Long: Dim POS2 As Long: Dim TITLE As String
        POS1 = InStr(1, Content, "Game Genie codes >", vbTextCompare) + Len("Game Genie codes >") + 1
        POS2 = InStr(POS1, Content, vbCrLf)
        TITLE = Mid(Content, POS1, POS2 - POS1)
        POS1 = InStr(1, Content, "Posts:", vbTextCompare) + 5
        POS1 = InStr(POS1, Content, vbCrLf) + 2
        POS2 = InStr(POS1, Content, "__________________")
        Content = Mid(Content, POS1, POS2 - POS1)

If the page does not contain the words "Game Genie codes >" or "Posts:" it could give errors
0
 
DIDD0Author Commented:
thew main error i get is with the while ie.busy: do events: wend

the error is "automation error"
0
 
DIDD0Author Commented:
uym like 2 at most...usually just 1
0
 
aelatikCommented:
When i run this code it goes endlessly end stores every pages code to the harddrive. I don't get it why i doesn't work out for you. Try setting "IE.Visible = False"  to "IE.Visible = True" then you can see what happens. Amd try testing it afer compilation, maybe the errors are raised within the VB IDE.
0
 
DIDD0Author Commented:
i think it is my internet doesnt load fast enough...o well i will try it at school...although my internet is 3000 megabits/sec so im not sure...anyways my friend could probably do it...il try it in some places and report back to you...thanks man...u get the points...dont worry!
0
 
DIDD0Author Commented:
ok i made it work
here is my code

Dim I As Long

Private Sub Form_Load()
I = 6116
End Sub

Private Sub Timer1_Timer()
    If I = 6844 Then
        MsgBox "Done", , "Message"
    Else
        I = I + 1
        FetchPage I
    End If
End Sub

Private Function FetchPage(ID As Long)

On Error GoTo ErrHandler

    Dim IE As Object: Dim Content As String
    Set IE = CreateObject("internetexplorer.application")
        IE.Visible = True ' True if you wish to see it all happen
        IE.navigate "http://www.codeshq.com/showthread.php?threadid=" & ID
  While IE.busy: DoEvents: Wend
        ' There are two methods to retrieve document info,
        ' innerHTML which only show the displayed text and
        ' outerHTML which show the entire HTML code behind the page.
        Content = IE.document.documentelement.innertext
        ' Content = IE.document.documentelement.outerhtml
        IE.quit
       
        Dim POS1 As Long: Dim POS2 As Long: Dim TITLE As String
        POS1 = InStr(1, Content, "Game Genie codes >", vbTextCompare) + Len("Game Genie codes >") + 1
        POS2 = InStr(POS1, Content, vbCrLf)
        TITLE = Mid(Content, POS1, POS2 - POS1)
        POS1 = InStr(1, Content, "Posts:", vbTextCompare) + 5
        POS1 = InStr(POS1, Content, vbCrLf) + 2
        POS2 = InStr(POS1, Content, "__________________")
        Content = Mid(Content, POS1, POS2 - POS1)
   
        Open "c:\temp\" & TITLE & ".txt" For Output As #1
            Print #1, Content
        Close #1
        DoEvents
Exit Function

ErrHandler:
End Function
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.