Solved

WEB RIPPER

Posted on 2003-12-08
10
307 Views
Last Modified: 2010-05-03
I am trying to make a web ripper in VB 6.0...what i need it to do is the following...

http://www.codeshq.com/showthread.php?threadid=6116

first rip that site..l.take all the information off of it...no need for pics...just the text...i dont care if it looks ugly or not...preferably take just the codes...but really it doesnt matter...if you cann make it take just the codes i give extra points...anyways...after that i need it to rip

http://www.codeshq.com/showthread.php?threadid=6117
then
http://www.codeshq.com/showthread.php?threadid=6118
then
http://www.codeshq.com/showthread.php?threadid=6119

and i need it to go all the way to
http://www.codeshq.com/showthread.php?threadid=6843

so basically just keep making the last number go up by one...do u understand...if not post a question...lots of points for this...

thanks in advanced...

Chris
0
Comment
Question by:DIDD0
  • 6
  • 4
10 Comments
 

Author Comment

by:DIDD0
ID: 9901598
if possible make a HTML page with the name of the game so people can click the game and see the codes!
0
 
LVL 14

Accepted Solution

by:
aelatik earned 365 total points
ID: 9902537
Paste the following code in a new project, hit run and see your folder grow with all the info you need. I something needs to be modified just ask...

Const FolderToSave As String = "c:\temp\"

Private Sub Form_Load()
    Dim I As Long
    For I = 6117 To 6843
        FetchPage I
    Next
End Sub

Private Function FetchPage(ID As Long)
    Dim IE As Object: Dim Content As String
    Set IE = CreateObject("internetexplorer.application")
        IE.Visible = False ' True if you wish to see it all happen
        IE.navigate "http://www.codeshq.com/showthread.php?threadid=" & ID
  While IE.busy: DoEvents: Wend
        ' There are two methods to retrieve document info,
        ' innerHTML which only show the displayed text and
        ' outerHTML which show the entire HTML code behind the page.
        Content = IE.document.documentelement.innertext
        ' Content = IE.document.documentelement.outerhtml
        IE.quit
       
        Dim POS1 As Long: Dim POS2 As Long: Dim TITLE As String
        POS1 = InStr(1, Content, "Game Genie codes >", vbTextCompare) + Len("Game Genie codes >") + 1
        POS2 = InStr(POS1, Content, vbCrLf)
        TITLE = Mid(Content, POS1, POS2 - POS1)
        POS1 = InStr(1, Content, "Posts:", vbTextCompare) + 5
        POS1 = InStr(POS1, Content, vbCrLf) + 2
        POS2 = InStr(POS1, Content, "__________________")
        Content = Mid(Content, POS1, POS2 - POS1)
   
        Open FolderToSave & TITLE & ".txt" For Output As #1
            Print #1, Content
        Close #1
        DoEvents
End Function
0
 

Author Comment

by:DIDD0
ID: 9903707
I get an error...

Runtime error

sometimes i get this one:
Content = Mid(Content, POS1, POS2 - POS1)

and most of the time i get an error on this

  While IE.busy: DoEvents: Wend

any help??? thanks
0
 
LVL 14

Expert Comment

by:aelatik
ID: 9903954
How many did it fetch before the error ? Take a look at the last file the view the results
0
 
LVL 14

Expert Comment

by:aelatik
ID: 9904019
This part of the code is ment to extract the codes from the page, try without it :

       Dim POS1 As Long: Dim POS2 As Long: Dim TITLE As String
        POS1 = InStr(1, Content, "Game Genie codes >", vbTextCompare) + Len("Game Genie codes >") + 1
        POS2 = InStr(POS1, Content, vbCrLf)
        TITLE = Mid(Content, POS1, POS2 - POS1)
        POS1 = InStr(1, Content, "Posts:", vbTextCompare) + 5
        POS1 = InStr(POS1, Content, vbCrLf) + 2
        POS2 = InStr(POS1, Content, "__________________")
        Content = Mid(Content, POS1, POS2 - POS1)

If the page does not contain the words "Game Genie codes >" or "Posts:" it could give errors
0
What Is Threat Intelligence?

Threat intelligence is often discussed, but rarely understood. Starting with a precise definition, along with clear business goals, is essential.

 

Author Comment

by:DIDD0
ID: 9907031
thew main error i get is with the while ie.busy: do events: wend

the error is "automation error"
0
 

Author Comment

by:DIDD0
ID: 9907157
uym like 2 at most...usually just 1
0
 
LVL 14

Expert Comment

by:aelatik
ID: 9907532
When i run this code it goes endlessly end stores every pages code to the harddrive. I don't get it why i doesn't work out for you. Try setting "IE.Visible = False"  to "IE.Visible = True" then you can see what happens. Amd try testing it afer compilation, maybe the errors are raised within the VB IDE.
0
 

Author Comment

by:DIDD0
ID: 9908236
i think it is my internet doesnt load fast enough...o well i will try it at school...although my internet is 3000 megabits/sec so im not sure...anyways my friend could probably do it...il try it in some places and report back to you...thanks man...u get the points...dont worry!
0
 

Author Comment

by:DIDD0
ID: 9908573
ok i made it work
here is my code

Dim I As Long

Private Sub Form_Load()
I = 6116
End Sub

Private Sub Timer1_Timer()
    If I = 6844 Then
        MsgBox "Done", , "Message"
    Else
        I = I + 1
        FetchPage I
    End If
End Sub

Private Function FetchPage(ID As Long)

On Error GoTo ErrHandler

    Dim IE As Object: Dim Content As String
    Set IE = CreateObject("internetexplorer.application")
        IE.Visible = True ' True if you wish to see it all happen
        IE.navigate "http://www.codeshq.com/showthread.php?threadid=" & ID
  While IE.busy: DoEvents: Wend
        ' There are two methods to retrieve document info,
        ' innerHTML which only show the displayed text and
        ' outerHTML which show the entire HTML code behind the page.
        Content = IE.document.documentelement.innertext
        ' Content = IE.document.documentelement.outerhtml
        IE.quit
       
        Dim POS1 As Long: Dim POS2 As Long: Dim TITLE As String
        POS1 = InStr(1, Content, "Game Genie codes >", vbTextCompare) + Len("Game Genie codes >") + 1
        POS2 = InStr(POS1, Content, vbCrLf)
        TITLE = Mid(Content, POS1, POS2 - POS1)
        POS1 = InStr(1, Content, "Posts:", vbTextCompare) + 5
        POS1 = InStr(POS1, Content, vbCrLf) + 2
        POS2 = InStr(POS1, Content, "__________________")
        Content = Mid(Content, POS1, POS2 - POS1)
   
        Open "c:\temp\" & TITLE & ".txt" For Output As #1
            Print #1, Content
        Close #1
        DoEvents
Exit Function

ErrHandler:
End Function
0

Featured Post

What Security Threats Are You Missing?

Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

Join & Write a Comment

Introduction In a recent article (http://www.experts-exchange.com/A_7811-A-Better-Concatenate-Function.html) for the Excel community, I showed an improved version of the Excel Concatenate() function.  While writing that article I realized that no o…
I was working on a PowerPoint add-in the other day and a client asked me "can you implement a feature which processes a chart when it's pasted into a slide from another deck?". It got me wondering how to hook into built-in ribbon events in Office.
As developers, we are not limited to the functions provided by the VBA language. In addition, we can call the functions that are part of the Windows operating system. These functions are part of the Windows API (Application Programming Interface). U…
This lesson covers basic error handling code in Microsoft Excel using VBA. This is the first lesson in a 3-part series that uses code to loop through an Excel spreadsheet in VBA and then fix errors, taking advantage of error handling code. This l…

743 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

15 Experts available now in Live!

Get 1:1 Help Now