Solved

Who has multi-threaded Spider source code?

Posted on 2001-08-25
Last Modified: 2010-04-06
Who has Spider source code? (I need Delphi source.)
Thanks!
Question by:yuwang
6 Comments
 

Author Comment

by:yuwang
ID: 6427980
Does no one know???
 
LVL 3

Expert Comment

by:rondi
ID: 6428053
What is a Spider?
 

Expert Comment

by:djmcrae
ID: 6428663
Rondi - A spider (also called a web robot or bot) is simply a program that visits a number of web sites. Some search sites use them to index pages; others use them to target shopping sites, etc.

yuwang - I once did a little one for fun, but all the code was lifted from this excellent article: http://www.inprise.com/delphi/news/delphi_developer/bolton/ It has everything you need, including downloadable code.


Author Comment

by:yuwang
ID: 6430581
It is very good!
But I want to fetch a web page, find the URLs in that page, then fetch the pages those URLs point to, find the URLs in them, and so on...
How do I do that?
Thanks again!
 

Accepted Solution

by:
djmcrae earned 100 total points
ID: 6434711
Damn - I do not have a server at the moment, but I can post you an application. It is not a spider but a site-saver: it was developed to start at the home page and follow all links within a dynamic (.asp or .php) site, saving the raw HTML and the renamed graphical links to the hard drive, so that a dynamic site could be burnt to CD for demo purposes. Send your email address to dmcrae@hotmail.com and I'll email the source.

But for others who may be following, and for yuwang in the meantime, I hope this helps (it is basically the guts of it).

Put a TWebBrowser and a button on your form:

procedure TForm1.btnGoClick(Sender: TObject);
var Flags: OLEVariant;
  URL: string;
begin
  //start processing the web site
  HTMLLinkCount:= 0;
  Flags:= 4; //navNoReadFromCache=4
  URL:= edWebAddress.Text;
  bFirstFire:= false; //global var
  WB1.Navigate(URL,Flags);
end;

This is the OnDocumentComplete event handler of the TWebBrowser:

procedure TForm1.WB1DocumentComplete(Sender: TObject;
  const pDisp: IDispatch; var URL: OleVariant);
begin
  if not(bFirstFire) then
  begin
    //set this boolean to prevent multiple firings in frames (we'll get the
    // frame contents separately)
    if bInFrame then
      bInFrame:= false
    else
    begin
      bFirstFire:= true;
      WB1.Stop;
      ProcessPage;
    end;
  end;
end;

procedure TForm1.ProcessPage;
var Doc: IHTMLDocument2;
  PageAll: IHTMLElementCollection;
  pageItem: OLEVariant;
  k: integer;
begin
  Doc:= wb1.document as IHTMLDocument2;
  PageAll:= Doc.all;
  //showmessage(PageAll.toString);
  //showmessage(IntToStr(iCurrentParentLink)+' '+Doc.url);
  //this delay is a bodge to stop some framesets refiring
  //it may not be necessary
  Delay(300); //delay is a very handy utility from RxLib
  if CompareText(Copy(Doc.URL,1,6),'res://')=0 then
  begin
    //page is busted - not found
  end
  else begin
    bInFrame:= false;
    for k:= 0 to PageAll.Length-1 do
    begin
      pageItem:= pageAll.item(k, varEmpty);
      if pageItem.tagname='FRAME' then
      begin
        ProcessPageSaveLink(iCurrentParentLink, pageItem.src, 'FRAME');
        bInFrame:= true;
      end;
      if pageItem.tagname='IMG' then
      begin
        ProcessPageSaveLink(iCurrentParentLink, pageItem.src, 'IMG');
      end;
....
    end;
  end;
  //HTMLLinkArray[iCurrentParentLink].URLtoFollow := Doc.URL;
  HTMLLinkArray[iCurrentParentLink].isProcessed:= true;
  ProcessNextLink;
end;
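
For readers without RxLib, a stand-in for the Delay call used in ProcessPage might look like the sketch below. This is an assumption, not RxLib's actual implementation: the unit name DelayUtil is made up, and the argument is assumed to be in milliseconds, as the Delay(300) call suggests. It only relies on GetTickCount and Sleep from the Windows unit and Application.ProcessMessages from Forms.

```pascal
unit DelayUtil; // hypothetical unit name, not part of RxLib

interface

// Waits roughly MSecs milliseconds while keeping the message loop alive.
procedure Delay(MSecs: Cardinal);

implementation

uses
  Windows, Forms;

{$Q-} // the unsigned subtraction below may wrap when the tick counter rolls over

procedure Delay(MSecs: Cardinal);
var
  Start: Cardinal;
begin
  Start := GetTickCount;
  while (GetTickCount - Start) < MSecs do
  begin
    Application.ProcessMessages; // lets the TWebBrowser keep handling events
    Sleep(5);                    // avoids a 100% CPU busy loop
  end;
end;

end.
```

Because the loop pumps messages, other event handlers (including DocumentComplete) can fire while it waits, so the bFirstFire and bInFrame guards from the post are still needed.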

ProcessPageSaveLink just adds the link and its link type to a dynamic array.
ProcessNextLink reads the next unfollowed link in the array, sets its followed flag, and points the TWebBrowser at that link to fetch it. If the link is an image or document, I instead use a TNMHTTP1 to save it to the hard drive. (This component may not be in D5 Pro - I have D5 Ent - in which case the Indy one may even be a better choice.)
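
The two routines are not shown in the post, so here is a minimal console sketch of the link queue they describe. Only HTMLLinkArray, HTMLLinkCount, iCurrentParentLink, URLtoFollow, isProcessed and the routine names come from the code above; the record type THTMLLink, the fields LinkType, ParentLink and isFollowed, the duplicate check, and the Boolean return value of ProcessNextLink are assumptions made for illustration. WriteLn stands in for the browser navigation and the HTTP download.

```pascal
program LinkQueueSketch;

{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

type
  // One entry per discovered link. URLtoFollow and isProcessed appear in the
  // post; the other fields are assumed.
  THTMLLink = record
    URLtoFollow: string;
    LinkType: string;     // 'FRAME', 'IMG', ...
    ParentLink: Integer;  // index of the page the link was found on
    isFollowed: Boolean;  // set once a fetch has been started
    isProcessed: Boolean; // set once the fetched page has been scanned
  end;

var
  HTMLLinkArray: array of THTMLLink;
  HTMLLinkCount: Integer = 0;
  iCurrentParentLink: Integer = 0;

// Appends a link to the dynamic array, skipping URLs that are already queued
// so that pages linking to each other cannot make the crawl loop forever.
procedure ProcessPageSaveLink(ParentLink: Integer; const URL, LinkType: string);
var
  i: Integer;
begin
  for i := 0 to HTMLLinkCount - 1 do
    if HTMLLinkArray[i].URLtoFollow = URL then
      Exit;
  SetLength(HTMLLinkArray, HTMLLinkCount + 1);
  HTMLLinkArray[HTMLLinkCount].URLtoFollow := URL;
  HTMLLinkArray[HTMLLinkCount].LinkType := LinkType;
  HTMLLinkArray[HTMLLinkCount].ParentLink := ParentLink;
  HTMLLinkArray[HTMLLinkCount].isFollowed := False;
  HTMLLinkArray[HTMLLinkCount].isProcessed := False;
  Inc(HTMLLinkCount);
end;

// Picks the first unfollowed link, flags it and starts the fetch.
// Returns False when nothing is left to follow.
function ProcessNextLink: Boolean;
var
  i: Integer;
begin
  Result := False;
  for i := 0 to HTMLLinkCount - 1 do
    if not HTMLLinkArray[i].isFollowed then
    begin
      HTMLLinkArray[i].isFollowed := True;
      iCurrentParentLink := i;
      // In the form this is presumably where bFirstFire is reset and
      // WB1.Navigate is called for pages, or the HTTP component is used
      // to save images and documents to disk.
      WriteLn('fetch ', HTMLLinkArray[i].LinkType, ' ',
        HTMLLinkArray[i].URLtoFollow);
      Result := True;
      Exit;
    end;
end;

begin
  ProcessPageSaveLink(0, 'http://example.com/', 'FRAME');
  ProcessPageSaveLink(0, 'http://example.com/logo.gif', 'IMG');
  ProcessPageSaveLink(0, 'http://example.com/', 'FRAME'); // duplicate, skipped
  // Stand-in for the event-driven loop: in the form, ProcessPage marks the
  // entry as processed and then calls ProcessNextLink again.
  while ProcessNextLink do
    HTMLLinkArray[iCurrentParentLink].isProcessed := True;
end.
```

In the real form the while loop does not exist; each DocumentComplete event leads to ProcessPage, which calls ProcessNextLink, so the crawl advances one link per page load. A linear scan is fine for a single site; a separate "next index" counter would avoid rescanning on large crawls.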

If anything needs clearing up, just yell.
Apologies for the delays; being in Australia, I'm probably asleep when you're up and vice versa.


 
LVL 17

Expert Comment

by:geobul
ID: 9288324
No comment has been added lately, so it's time to clean up this TA.
I will leave a recommendation in the Cleanup topic area that this question is:

accept djmcrae's comment as answer

Please leave any comments here within the next seven days.

PLEASE DO NOT ACCEPT THIS COMMENT AS AN ANSWER!

Thanks,

geobul
EE Cleanup Volunteer