Problem with getting HTML with Indy

I have a redirect problem here. google.com redirects me to google.ro because I am in Romania.
How can I get the content of google.com with idHTTP? If I set HandleRedirects to True it works, but I get the content of google.ro. I have another component that gets me the right content from google.com.

Tell me if it's possible.
I need it ASAP.

THANKS,
crystyan
 
2266180Commented:
well .. if you look at the protocol, it's https, so it requires SSL. you will need to add SSL support to your application if you want to access that page.
I've done a login example with SSL for ebay here: http://www.ciuly.com/delphi/indy/delphiIndySSL_ebay/index.html
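
For reference, attaching an SSL IOHandler in code (instead of in the Object Inspector) might look like the sketch below. It assumes Indy 9, where the class is TIdSSLIOHandlerSocket in unit IdSSLOpenSSL (Indy 10 names it TIdSSLIOHandlerSocketOpenSSL), and that the OpenSSL DLLs can be found next to the executable. GetHttps is a made-up helper name, not something from the ebay demo.

```pascal
unit HttpsSketch;

interface

// hypothetical helper: fetch a page over https and return its HTML
function GetHttps(const url: string): string;

implementation

uses
  IdHTTP, IdSSLOpenSSL;

function GetHttps(const url: string): string;
var
  http: TIdHTTP;
  ssl: TIdSSLIOHandlerSocket; // Indy 9 class name (assumption, see above)
begin
  http := TIdHTTP.Create(nil);
  try
    // owned by http, so it is freed together with it
    ssl := TIdSSLIOHandlerSocket.Create(http);
    // without an SSL-capable IOHandler, https URLs cannot be fetched
    http.IOHandler := ssl;
    Result := http.Get(url);
  finally
    http.Free;
  end;
end;

end.
```

If the handler is dropped on the form instead, the same association is made by setting the IOHandler property of the TIdHTTP component.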
 
2266180Commented:
you should have a 'Google.com in English' link on your google.ro page (localized pages have this).
so instead of getting google.com, get http://www.google.com/ncr
AND keep the cookies.
 
crystyanAuthor Commented:
but how do I decide whether to set HandleRedirects to False or True?
I mean, if I set it to True and go to microsoft.com I get the wrong result. If I set it to False I get good results, but then I don't get good results from google.com anymore.

thanks ciuly!
 
2266180Commented:
hm.. can you post a small test-code?
 
crystyanAuthor Commented:
 idHTTP := TidHTTP.Create(nil);
  idCookieManager := TIdCookieManager.Create(idHTTP);
  idAntiFreeze := TIdAntiFreeze.Create(idHTTP);

  idHTTP.CookieManager := idCookieManager;
  idHTTP.AllowCookies := True;
  idHTTP.HandleRedirects := False;
  Cookies := TStringList.Create;
  HTML := idHTTP.Get(url);
  showmessage(html);
  GetCookies;
  ShowMessage(cookies.Text);

this is a function (well, it's a class, but I cut the code and put it together)

  site.GetHTML('http://www.microsoft.com/');


basically I want to make a class to get or post HTML, handle the redirects and maybe the cookies.
 
2266180Commented:
looks ok except for the redirect part. you should enable it while you get the cookies (I didn't test whether that works ok)

here is a small test-code I just wrote:

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls, IdCookieManager, IdBaseComponent, IdComponent,
  IdTCPConnection, IdTCPClient, IdHTTP;

type
  TForm1 = class(TForm)
    IdHTTP1: TIdHTTP;
    IdCookieManager1: TIdCookieManager;
    Memo1: TMemo;
    procedure FormCreate(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

procedure TForm1.FormCreate(Sender: TObject);
var s:string;
    i:integer;
    cookies:tstringlist;

 // adds the stored cookies to the request headers by hand (not called in this test)
 procedure setcookies;
 var j:integer;
 begin
   for j:=1 to IdCookieManager1.CookieCollection.count do
     IdHTTP1.Request.RawHeaders.Add('Cookie'+IdHTTP1.Request.RawHeaders.NameValueSeparator+IdCookieManager1.CookieCollection.Items[j-1].CookieText);
 end;

begin
  cookies:=tstringlist.Create;
  try
    try
      s:=IdHTTP1.Get('http://www.google.com/ncr');// first get (for cookies)

      for i:=1 to IdCookieManager1.CookieCollection.count do// save cookies
        cookies.Add(IdCookieManager1.CookieCollection.Items[i-1].CookieText);

      s:=IdHTTP1.Get('http://www.google.com');// normally work with google.com from now on
      showmessage(s);
    except on e: EIdHTTPProtocolException do
      showmessage(idHTTP1.response.ResponseText);
    end;
  finally
    cookies.free;// freed even if one of the requests fails
  end;
end;

end.

if you want to make a generic class, you will need to write a mini web browser and follow the http protocol (and parse some html, since there can be software redirects, from scripts).

I usually prefer to keep my site-specific coding site-specific :) I am not saying that a generic class cannot be done, just that it is too hard and, for me, it isn't worth it.

usually, HandleRedirects = True should work ok for most sites, but google is a specific case, since it is you who wants to work with google.com and so you are overriding the redirect ;) (no browser does that :) )
 
crystyanAuthor Commented:
do you know why I get "IO HANDLER VALUE IS INVALID" when I'm trying to get the html from 'https://login.yahoo.com/config/login/' ?

thanks
 
crystyanAuthor Commented:
hi ciuly,

I'm still having problems with the login at del.icio.us ! :(( I've spent all day looking at the ebay project (you did that for me too). I was hoping you would have time to see what's happening there.
I'm trying to do this:
  HTML := idHTTP.Get('http://del.icio.us/');
  GetCookies;
  SetCookies;
  HTML := idHTTP.Get('https://secure.del.icio.us/login');
and here I get the "IOHandler value is Invalid" error.

thanks!
 
2266180Commented:
I'll check it in about 10-12 hours. btw, I don't see you getting any cookies from https://secure.del.icio.us/login . I would first make sure that it doesn't set any. have you checked that?
if it's still not working, I'll give it a try.
 
crystyanAuthor Commented:
nope. I just can't get the content of https://secure.del.icio.us/login . I've tried all the possibilities...except the good one lol.
 
2266180Commented:
well, in this case I'll get back to you in about 10-12 hours. probably with the good solution :)
 
crystyanAuthor Commented:
thanks a lot!
 
2266180Commented:
hm.. this one was short. you probably didn't notice the software redirect?

here is the code:

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, IdBaseComponent, IdComponent, IdTCPConnection, IdTCPClient,
  IdHTTP, IdCookieManager, StdCtrls, IdServerIOHandler, IdSSLOpenSSL,
  IdIOHandler, IdIOHandlerSocket;

type
  TForm1 = class(TForm)
    IdHTTP1: TIdHTTP;
    IdCookieManager1: TIdCookieManager;
    Memo1: TMemo;
    IdSSLIOHandlerSocket1: TIdSSLIOHandlerSocket;
    procedure FormCreate(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

procedure TForm1.FormCreate(Sender: TObject);
var
  Params: TStringList;
  HTML, loginurl, signinurl, userid: String;
  count,i:integer;
  cookies:tstringlist;

   procedure setcookies;
   var j:integer;
   begin
       count:=IdCookieManager1.CookieCollection.count;
       for j:=1 to count do
           IdHTTP1.Request.RawHeaders.Add('Cookie'+IdHTTP1.Request.RawHeaders.NameValueSeparator+IdCookieManager1.CookieCollection.Items[j-1].CookieText);
   end;

begin
  signinurl:='http://del.icio.us/';
  // the above is used to get the login page (this is the url behind the "sign in" link).
  // you have to emulate a browser, so you need to do all steps. this is a good idea to do
  // since all redirects might set cookies that you will probably need

  loginurl:='https://secure.del.icio.us/login';
  // the above is the login url. this is the url from the action property of the form; this is where
  // the login request will be sent

  // note: IdHTTP1.IOHandler must point to IdSSLIOHandlerSocket1 (set in the Object Inspector),
  // otherwise the https request below fails
  Params := TStringList.Create;
  cookies:=tstringlist.Create;
  try
    try
      html:=idhttp1.Get(signinurl);// first get; get first cookie(s)
      // this sets 1 cookie

      count:=IdCookieManager1.CookieCollection.count;// get them
      for i:=1 to count do
        cookies.Add(IdCookieManager1.CookieCollection.Items[i-1].CookieText);

      // you might want to parse the hidden inputs name and value
      // because hard-coding them might not work in the future or in case there are
      // values that are generated

      // no hidden inputs at this time

      userid:='<your user id here>';
      Params.Values['user_name'] := userid;
      Params.Values['password'] := '<your password here>';

      setCookies;
      HTML := IdHTTP1.Post(loginurl, Params);// now do the log in

      // the login response carries a software redirect:
      // <meta http-equiv="refresh" content="0; URL=http://del.icio.us/<userid>"
      // so we request that page ourselves
      setCookies;
      html:=idhttp1.Get('http://del.icio.us/'+userid);// software redirect

      if pos('<title>del.icio.us/'+userid+'</title>',html)>0 then
        showmessage('logged in') // we are logged in
      else
        showmessage('login failed');

      memo1.Lines.Text:=html;
    except
      on e: EIdHTTPProtocolException do
      begin
        // keep the error text visible instead of overwriting it with html
        memo1.lines.add(idHTTP1.response.ResponseText);
        memo1.lines.add(e.ErrorMessage);
      end;
    end;
  finally
    cookies.Free;
    Params.Free;
  end;
end;

end.

works like a charm (I modified the ebay demo)
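
The follow-up url is hard-coded above. If you would rather read it from the page, a small helper could pull the target out of the meta refresh tag. This is only a rough sketch: ExtractMetaRefreshUrl is a made-up name, and it assumes the tag uses double quotes and that the content attribute comes after http-equiv, as in the commented-out check in the code above.

```pascal
program MetaRefreshDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

// hypothetical helper: returns the target of a
// <meta http-equiv="refresh" content="0; URL=..."> tag, or '' if none is found
function ExtractMetaRefreshUrl(const Html: string): string;
var
  Lower, Rest: string;
  P, Q: Integer;
begin
  Result := '';
  Lower := LowerCase(Html); // case-insensitive search, same length as Html
  P := Pos('http-equiv="refresh"', Lower);
  if P = 0 then
    Exit;
  // look for "url=" after the http-equiv attribute
  Q := Pos('url=', Copy(Lower, P, MaxInt));
  if Q = 0 then
    Exit;
  // P + Q - 1 is the index of "url=" in Html; skip its 4 characters
  Rest := Copy(Html, P + Q - 1 + 4, MaxInt);
  // the url ends at the closing quote of the content attribute
  Q := Pos('"', Rest);
  if Q > 0 then
    Result := Copy(Rest, 1, Q - 1);
end;

begin
  // prints http://del.icio.us/bob
  Writeln(ExtractMetaRefreshUrl(
    '<meta http-equiv="refresh" content="0; URL=http://del.icio.us/bob">'));
end.
```

The result can then be passed to IdHTTP1.Get in place of the hard-coded 'http://del.icio.us/'+userid. Redirects done from scripts would still need site-specific handling.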

just in case you didn't know this, you should read this: http://www.indyproject.org/Sockets/SSL.en.aspx (I also updated my ebay demo page to point this out)

cheers
 
crystyanAuthor Commented:
lol .... I didn't associate the SSL handler with IdHTTP. me dumb again!
 
crystyanAuthor Commented:
something is still weird here :(((((((((((

I`m doing this:
    HTML := idHTTP.Get('http://del.icio.us/');
    for i:=1 to IdCookieManager.CookieCollection.count do
     cookies.Add(IdCookieManager.CookieCollection.Items[i-1].CookieText);
     ShowMessage(cookies.Text);

and I can't get all the cookies! though it says I'm connected, I don't have all the cookies, and when I try to do something it redirects me to the login page :(((
I've looked with a sniffer and saw there are more cookies than I get.

do you have any idea ?
 
2266180Commented:
yes. some sites hide the cookies in resources to make sure bots don't get them. since robots/crawlers will mostly never load resources (images, sounds, etc), those cookies will not be set. check with the sniffer exactly which resource is setting the cookies and load it yourself
 
crystyanAuthor Commented:
how do I know who sets a cookie ?
 
2266180Commented:
I just told you: "check with the sniffer exactly which resource is setting the cookies and load it yourself"
each resource is loaded with a separate http get command, so it should be easy to spot
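
Loading such a resource with the same TIdHTTP instance lets the attached cookie manager pick up the cookie. A rough sketch, assuming the sniffer showed the cookie coming from some image; LoadCookieResource and the url in the usage line are made up for illustration:

```pascal
unit CookieResourceSketch;

interface

uses
  Classes, IdHTTP;

// hypothetical helper: requests a resource only for the cookies it sets
procedure LoadCookieResource(http: TIdHTTP; const resourceUrl: string);

implementation

procedure LoadCookieResource(http: TIdHTTP; const resourceUrl: string);
var
  sink: TMemoryStream;
begin
  sink := TMemoryStream.Create;
  try
    // the body is thrown away; what matters is that the response's
    // cookies reach the cookie manager assigned to http
    http.Get(resourceUrl, sink);
  finally
    sink.Free;
  end;
end;

end.
```

Usage would be something like LoadCookieResource(IdHTTP1, 'http://del.icio.us/some-image.gif'); right after the first Get, with the real url taken from the sniffer log.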
 
crystyanAuthor Commented:
could you try looking at my other question ? plssssssssssss
I know that I'm being a pain here :|

Thanks