?
Solved

Who have multi-thread Spider source code?

Posted on 2001-08-25
6
Medium Priority
?
571 Views
Last Modified: 2010-04-06
Who have Spider source code?(Need Delphi source)
Thanks!
0
Comment
Question by:yuwang
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
6 Comments
 

Author Comment

by:yuwang
ID: 6427980
No one know???
0
 
LVL 3

Expert Comment

by:rondi
ID: 6428053
what is Spider ?
0
 

Expert Comment

by:djmcrae
ID: 6428663
Rondi - A spider, web-robot, bot etc simply a program that visits a number of Web sites - some search sites use them to index pages, others use them to target shopping sites etc.

yuwang - I once did a little one for fun, but all the code was lifted from this excellent article: http://www.inprise.com/delphi/news/delphi_developer/bolton/ everything you need, including downloadable code.
0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 

Author Comment

by:yuwang
ID: 6430581
It is very good!
But I want get a webpage,and search the webpage'URL,then
get the webpage'url page,and get url...
How to do?
Thanks again!
0
 

Accepted Solution

by:
djmcrae earned 400 total points
ID: 6434711
Damn - I do not have a server at the moment, but I can post you an application. It is not a spider, but a site-saver (developed to start at the home page and follow all links within a dynamic (.asp or .php) site, saving the raw html and renamed graphical links to hard drive so that a dynamic site could be burnt to CD for demo purposes). Give me your email address to dmcrae@hotmail.com and I'll email the source.

But for others that may be following - and for yuwang in the meantime, hope this helps (basically the guts of it).

a TWebBrowser and a button on your form

procedure TForm1.btnGoClick(Sender: TObject);
var Flags: OLEVariant;
  URL: string;
begin
  //start processing the web site
  HTMLLinkCount:= 0;
  Flags:= 4; //navNoReadFromCache=4
  URL:= edWebAddress.Text;
  bFirstFire:= false; //global var
  WB1.Navigate(URL,Flags);
end;

this is the onDocumentComplete event of the TWebBrowser

procedure TForm1.WB1DocumentComplete(Sender: TObject;
  const pDisp: IDispatch; var URL: OleVariant);
begin
  if not(bFirstFire) then
  begin
    //set this boolean to prevent multiple firings in frames (we'll get the
    // frame contents seperately)
    if bInFrame then
      bInFrame:= false
    else
    begin
      bFirstFire:= true;
      WB1.Stop;
      ProcessPage;
    end;
  end;
end;

procedure TForm1.ProcessPage;
var Doc: IHTMLDocument2;
  PageAll: IHTMLElementCollection;
  pageItem: OLEVariant;
  k: integer;
begin
  Doc:= wb1.document as IHTMLDocument2;
  PageAll:= Doc.all;
  //showmessage(PageAll.toString);
  //showmessage(IntToStr(iCurrentParentLink)+' '+Doc.url);
  //this delay is a bodge to stop some framesets refiring
  //it may not be necessay
  Delay(300); //delay is a very handy utility from RxLib
  if CompareText(Copy(Doc.URL,1,6),'res://')=0 then
  begin
    //page is busted - not found
  end
  else begin
    bInFrame:= false;
    for k:= 0 to PageAll.Length-1 do
    begin
      pageItem:= pageAll.item(k, varEmpty);
      if pageItem.tagname='FRAME' then
      begin
        ProcessPageSaveLink(iCurrentParentLink, pageItem.src, 'FRAME');
        bInFrame:= true;
      end;
      if pageItem.tagname='IMG' then
      begin
        ProcessPageSaveLink(iCurrentParentLink, pageItem.src, 'IMG');
      end;
....
    end;
  end;
  //HTMLLinkArray[iCurrentParentLink].URLtoFollow := Doc.URL;
  HTMLLinkArray[iCurrentParentLink].isProcessed:= true;
  ProcessNextLink;
end

ProcessPageSaveLink just adds the link and link type to a dynamic array
the processNextLink reads any unfollowed links in the array, sets its followed flag, and points the TWebBrowser at this link to fetch (unless it is an image or document, then I use a TNMHTTP1 (this may not be in D5 pro - I have D5 ent - in that case, the Indy one may even be a better choice) to save it on to the hard drive).

Anything that need clearing up, just yell.
Apololgies for the delays, being in Australia, I'm probably asleep when you're up and vice versa.


0
 
LVL 17

Expert Comment

by:geobul
ID: 9288324
No comment has been added lately, so it's time to clean up this TA.
I will leave a recommendation in the Cleanup topic area that this question is:

accept djmcrae's comment as answer

Please leave any comments here within the next seven days.

PLEASE DO NOT ACCEPT THIS COMMENT AS AN ANSWER!

Thanks,

geobul
EE Cleanup Volunteer
0

Featured Post

VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

The uses clause is one of those things that just tends to grow and grow. Most of the time this is in the main form, as it's from this form that all others are called. If you have a big application (including many forms), the uses clause in the in…
Creating an auto free TStringList The TStringList is a basic and frequently used object in Delphi. On many occasions, you may want to create a temporary list, process some items in the list and be done with the list. In such cases, you have to…
In this video we outline the Physical Segments view of NetCrunch network monitor. By following this brief how-to video, you will be able to learn how NetCrunch visualizes your network, how granular is the information collected, as well as where to f…
If you’ve ever visited a web page and noticed a cool font that you really liked the look of, but couldn’t figure out which font it was so that you could use it for your own work, then this video is for you! In this Micro Tutorial, you'll learn yo…
Suggested Courses
Course of the Month8 days, 20 hours left to enroll

764 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question