yarek

delphi spider website

I need code or a component (free or commercial) that:
- copies an entire website (dynamic or not) to the local hard disk
- does the copying with multiple threads

The structure of the website should be preserved.
TheRealLoki

You can do this by using a TWebBrowser to save it as a .mht file (this includes images, so you can email the page as one file, etc.)
(example halfway down the page)
http://delphi.about.com/od/internetintranet/l/aa062904a.htm

There is an "Internet Explorer" call you can make that does the same as "Save As" does in IE (it creates a directory and puts all the images etc. in it along with the main HTML).
I can't find the code until I get home, but maybe someone else here has it.
yarek (ASKER)

Not good: it must be multithreaded and grab a WHOLE website, not only a single web page.
Well, you can use the Chilkat Spider ActiveX, a free ActiveX component and their sample:
http://www.example-code.com/vb/spiderSite.asp (VB Sample source)

Or Felix Colibri's fine Delphi code sample and article:
http://www.felix-colibri.com/papers/web/web_spider/web_spider.html
yarek (ASKER)

Yes, I know this link, but it is not MULTITHREADED: it spiders page after page.
That is the ONLY Delphi spider code I have found in 5 years of looking.
If you want it multithreaded, do it yourself.
yarek (ASKER)

I have tried the Chilkat Spider ActiveX: it saves some .DAT files with corrupted headers and works pretty badly: it freezes the PC for a while...
ASKER CERTIFIED SOLUTION
ginsonic

This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
yarek (ASKER)

ALWebSpider is excellent... except that it does not run with Delphi 6!
Have you tested it? I know it is specified as being for D7, but that doesn't mean it can't work on D6.
yarek (ASKER)

I have tested it, and there are some functions that cannot be compiled,
like ValueFromIndex...
Maybe you can try... +2000 pts more
Can you run the demo\ALWebSpider\ALWebSpider.exe? Does it do everything you need?
yarek (ASKER)

Yes: the .EXE DEMO is good.
But the ALWebSpider component does not install in D6: there are a few D7-specific functions inside. I tried to adapt it (very quickly) to D6 and it did install, but when running the demo only the first page was spidered. I believe I did it too quickly.

There are mainly 2 errors:

an error about TDateTime with a regional parameter (I do not remember the exact error)
and an error about the TStringList.ValueFromIndex property, which does not exist in D6 and which I tried to translate into nested .Values properties that are recognized in D6...

-> BUT I FAILED

I think it must be a piece of cake for someone who has already translated some D7 code to D6.
Thanks
The ValueFromIndex property of Delphi 7 is implemented like this:

function TStrings.GetValueFromIndex(Index: Integer): string;
begin
  if Index >= 0 then
    Result := Copy(Get(Index), Length(Names[Index]) + 2, MaxInt) else
    Result := '';
end;

procedure TStrings.SetValueFromIndex(Index: Integer; const Value: string);
begin
  if Value <> '' then
  begin
    if Index < 0 then Index := Add('');
    Put(Index, Names[Index] + NameValueSeparator + Value);
  end
  else
    if Index >= 0 then Delete(Index);
end;


For Delphi 5, 6, etc. you could write functions like the following to do the same thing
(put these just after the "implementation" line in any units that need them, or use a shared unit)

function GetValueFromIndex(S: TStrings; Index: Integer): string;
begin
  if Index >= 0 then
    Result := Copy(S[Index], Length(S.Names[Index]) + 2, MaxInt) else
    Result := '';
end;

procedure SetValueFromIndex(S: TStrings; Index: Integer; const Value: string);
begin
  if Value <> '' then
  begin
    if Index < 0 then Index := S.Add('');
    S[Index] := S.Names[Index] + '='{NameValueSeparator} + Value;
  end
  else
    if Index >= 0 then S.Delete(Index);
end;

So instead of saying
label1.caption := S.ValueFromIndex[2];
you use
label1.caption := GetValueFromIndex(S, 2);

Instead of saying
S.ValueFromIndex[2] := 'hello';
you use
SetValueFromIndex(S, 2, 'hello');
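
One possible reason a "nested .Values" translation misbehaves: S.Values[S.Names[I]] looks the name up again and returns the value of the FIRST line with that name, while ValueFromIndex reads line I directly. The small console program below is only a sketch to show that difference; the program name and the sample URLs are made up, and whether ALWebSpider actually stores duplicate names is an assumption I have not checked.

```pascal
program ValueFromIndexCompat; // hypothetical demo name
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Same helper as posted above, repeated so this program compiles on its own.
function GetValueFromIndex(S: TStrings; Index: Integer): string;
begin
  if Index >= 0 then
    Result := Copy(S[Index], Length(S.Names[Index]) + 2, MaxInt)
  else
    Result := '';
end;

var
  S: TStringList;
  I: Integer;
begin
  S := TStringList.Create;
  try
    // Two lines sharing the same name (made-up sample data).
    S.Add('link=http://example.com/a.html');
    S.Add('link=http://example.com/b.html');
    for I := 0 to S.Count - 1 do
      // Left: value of line I. Right: value of the first line named 'link'.
      WriteLn(GetValueFromIndex(S, I), ' | ', S.Values[S.Names[I]]);
  finally
    S.Free;
  end;
end.
```

For the second line the two columns differ (b.html on the left, a.html on the right), so the helper is the safer replacement wherever the original code used ValueFromIndex.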

Hope this helps.
Let me know in more detail what the other error was.
yarek (ASKER)

OK, I will try to compile it again and will send ALL ERRORS.
Maybe the simplest would be for you to compile it in D6.
I can't even run the .exe demo on my PC; it complains about a missing winhttp.dll, which is why I asked if you could :-) I'd prefer to just help you port it to D6, if that's OK.
I started to modify it. I still get problems with GetLocaleFormatSettings.
What problems are you having with GetLocaleFormatSettings?
yarek (ASKER)

GetLocaleFormatSettings is not a Delphi 6 function.

I simply deleted this line, and that is maybe why the component does not work properly anymore.
It was added in D7 in SysUtils. It fills in the format settings, e.g. ShortDateFormat,
based on the Windows locale settings.

Not difficult to write this function yourself.
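
As a rough sketch of what such a replacement could look like in D6 (not the D7 RTL code): the unit name, the record type and the procedure name below are all made up, and only a handful of fields are covered. It relies on GetLocaleStr and GetLocaleChar from SysUtils and the LOCALE_* constants from Windows. Note that the strings come back in raw Windows picture format, which the D7 RTL normally translates, so treat them as a starting point.

```pascal
unit FormatCompat; // hypothetical unit name

interface

uses
  Windows, SysUtils;

type
  // Hypothetical stand-in for a small subset of D7's TFormatSettings.
  TCompatFormatSettings = record
    DateSeparator: Char;
    TimeSeparator: Char;
    DecimalSeparator: Char;
    ThousandSeparator: Char;
    ShortDateFormat: string;
    LongDateFormat: string;
  end;

// Hypothetical replacement for GetLocaleFormatSettings(LCID, Settings).
procedure GetCompatLocaleFormatSettings(LCID: Integer;
  var Settings: TCompatFormatSettings);

implementation

procedure GetCompatLocaleFormatSettings(LCID: Integer;
  var Settings: TCompatFormatSettings);
begin
  // The last argument of each call is the fallback used if the lookup fails.
  Settings.DateSeparator := GetLocaleChar(LCID, LOCALE_SDATE, '/');
  Settings.TimeSeparator := GetLocaleChar(LCID, LOCALE_STIME, ':');
  Settings.DecimalSeparator := GetLocaleChar(LCID, LOCALE_SDECIMAL, '.');
  Settings.ThousandSeparator := GetLocaleChar(LCID, LOCALE_STHOUSAND, ',');
  // Raw Windows picture strings; no translation to Delphi format is done here.
  Settings.ShortDateFormat := GetLocaleStr(LCID, LOCALE_SSHORTDATE, 'm/d/yy');
  Settings.LongDateFormat := GetLocaleStr(LCID, LOCALE_SLONGDATE, 'mmmm d, yyyy');
end;

end.
```

Something like GetCompatLocaleFormatSettings(GetThreadLocale, Settings) would fill the record for the current thread. Keep in mind that D6 also lacks the conversion overloads that accept a format-settings argument, so the calling code still has to be adapted.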
I have already solved this function. But after that, a lot of other functions are still waiting, all from D7. From what I can see, D7 got a lot of functions from the C++ version.